Function for splitting data into a train and validation sample

This function creates a training and a validation sample using stratified random sampling. The relative frequency of each category in both samples matches its relative frequency in the initial data (proportional stratified sampling).
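
The idea of proportional stratified sampling can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name stratified_split_sketch is hypothetical, it handles only the target factor (no EmbeddedText object), it assumes unique case names, and rounding the per-category validation count with round() is an assumption.

```r
# Hypothetical helper, not part of the package: splits a named factor so that
# each category contributes roughly val_size of its cases to validation.
stratified_split_sketch <- function(target, val_size) {
  stopifnot(is.factor(target), !is.null(names(target)),
            !anyDuplicated(names(target)),
            val_size > 0, val_size < 1)

  # Group the case names by category (unused levels are dropped).
  names_by_category <- split(names(target), target, drop = TRUE)

  # Within each category, draw round(n * val_size) cases at random.
  # The rounding rule is an assumption made for this sketch.
  val_names <- unlist(lapply(names_by_category, function(case_names) {
    n_val <- round(length(case_names) * val_size)
    case_names[sample.int(length(case_names), size = n_val)]
  }), use.names = FALSE)

  # All remaining cases form the training sample.
  train_names <- setdiff(names(target), val_names)

  list(
    target_train = target[train_names],
    target_test = target[val_names]
  )
}

# Toy data: 80 cases of category "a" and 20 cases of category "b".
set.seed(42)
target <- factor(rep(c("a", "b"), times = c(80, 20)))
names(target) <- paste0("case_", seq_along(target))

split_sketch <- stratified_split_sketch(target, val_size = 0.25)

# Both samples keep the 80:20 ratio of the initial data:
# validation holds 20 "a" and 5 "b", training holds 60 "a" and 15 "b".
table(split_sketch$target_test)
table(split_sketch$target_train)
```

Because sampling happens within each category, a rare category is represented in both samples as long as its rounded validation count is at least one.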

Usage

get_train_test_split(embedding = NULL, target, val_size)

Arguments

embedding: Object of class EmbeddedText.
target: Named factor containing the labels of every case.
val_size: double. Ratio between 0 and 1 giving the proportion of cases to be used as the validation sample.

Value

Returns a list with the following components.

target_train: Named factor containing the labels of the training sample.
embeddings_train: Object of class EmbeddedText containing the text embeddings for the training sample.
target_test: Named factor containing the labels of the validation sample.
embeddings_test: Object of class EmbeddedText containing the text embeddings for the validation sample.
