
This function creates a stratified random sample. Unlike get_train_test_split, it does not require text embeddings and does not split them into a train and a validation sample.

Usage

get_stratified_train_test_split(targets, val_size = 0.25)

Arguments

targets

Named vector containing the labels/categories for each case.

val_size

double Value between 0 and 1 indicating the proportion of cases of each label/category that should be part of the validation sample.

Value

list containing the names of the cases that belong to the train sample and the names of the cases that belong to the validation sample.
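
The idea behind a stratified split can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name stratified_split_sketch, the element names train_sample and val_sample, and the rounding rule (round the per-category count, keep at least one case) are all assumptions made for illustration.

```r
# Hypothetical helper that illustrates a stratified split on a named vector.
# It is a sketch under stated assumptions, not the package's own code.
stratified_split_sketch <- function(targets, val_size = 0.25) {
  stopifnot(
    !is.null(names(targets)),
    !anyNA(targets),
    val_size > 0,
    val_size < 1
  )
  labels <- as.character(targets)
  val_sample <- character(0)
  for (category in unique(labels)) {
    case_names <- names(targets)[labels == category]
    # Assumption: round the per-category count and keep at least one case.
    n_val <- max(1, round(length(case_names) * val_size))
    # Sample positions rather than values so that a category with a
    # single case is handled correctly.
    picked <- sample.int(length(case_names), n_val)
    val_sample <- c(val_sample, case_names[picked])
  }
  list(
    train_sample = setdiff(names(targets), val_sample),
    val_sample = val_sample
  )
}

# Toy data: 8 cases labelled "pos" and 4 cases labelled "neg".
set.seed(42)
targets <- c(rep("pos", 8), rep("neg", 4))
names(targets) <- paste0("case_", seq_along(targets))

split <- stratified_split_sketch(targets, val_size = 0.25)
table(targets[split$val_sample])
```

With val_size = 0.25 the sketch draws two of the eight "pos" cases and one of the four "neg" cases for the validation sample, so both samples keep the 2:1 ratio of the labels. Which cases are drawn depends on the random seed.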