
This function creates a training sample and a validation sample using stratified random sampling. The relative frequency of each category in both samples equals its relative frequency in the initial data (proportional stratified sampling).

Usage

get_train_test_split(embedding = NULL, target, val_size)

Arguments

embedding

Object of class EmbeddedText.

target

Named factor containing the labels of every case.

val_size

double Value between 0 and 1 giving the proportion of cases to be used as the validation sample.

Value

Returns a list with the following components.

  • target_train: Named factor containing the labels of the training sample.

  • embeddings_train: Object of class EmbeddedText containing the text embeddings for the training sample.

  • target_test: Named factor containing the labels of the validation sample.

  • embeddings_test: Object of class EmbeddedText containing the text embeddings for the validation sample.
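
To illustrate what proportional stratified sampling means for the target_train and target_test components, the following is a minimal base R sketch. It is not the package's implementation: the helper stratified_split_sketch() is hypothetical, it works on the named factor only (no EmbeddedText object is involved), and rounding the per-category validation count with round() is an assumption.

```r
# Hypothetical helper for illustration only; not part of the package.
# Splits a named factor into training and validation parts so that each
# category contributes roughly val_size of its cases to the validation part.
stratified_split_sketch <- function(target, val_size) {
  stopifnot(
    is.factor(target),
    !is.null(names(target)),
    val_size > 0,
    val_size < 1
  )

  # Group the case names by category.
  cases_by_category <- split(names(target), target, drop = TRUE)

  # Within each category, draw round(val_size * n) cases at random
  # (the rounding rule is an assumption of this sketch).
  val_names <- unlist(
    lapply(cases_by_category, function(case_names) {
      n_val <- round(val_size * length(case_names))
      case_names[sample.int(length(case_names), size = n_val)]
    }),
    use.names = FALSE
  )

  # All remaining cases form the training sample.
  train_names <- setdiff(names(target), val_names)

  list(
    target_train = target[train_names],
    target_test = target[val_names]
  )
}

set.seed(42)

# Toy data: 100 cases in three categories with 60, 28, and 12 cases.
target <- factor(rep(c("a", "b", "c"), times = c(60, 28, 12)))
names(target) <- paste0("case_", seq_along(target))

split_result <- stratified_split_sketch(target, val_size = 0.25)

# Compare category proportions in the full data and in both samples.
prop.table(table(target))
prop.table(table(split_result$target_train))
prop.table(table(split_result$target_test))
```

With these toy counts, val_size = 0.25 selects 15, 7, and 3 cases per category, so all three tables show the same proportions (0.60, 0.28, 0.12). When the category sizes do not divide evenly, the proportions match only approximately.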