Function for splitting data into a train and validation sample
Source: R/aux_fct.R
get_train_test_split.Rd
This function creates a train and validation sample based on stratified random sampling. The relative frequencies of each category in the train and validation sample equal the relative frequencies of the initial data (proportional stratified sampling).
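The proportional stratified sampling described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper `stratified_split_sketch()` is a hypothetical name, it works only on the named label factor (no `EmbeddedText` object is involved), and rounding the per-category validation count with `round()` is an assumption.

```r
# Hypothetical helper, not part of the package: splits a named factor into
# train and validation labels using proportional stratified sampling.
stratified_split_sketch <- function(target, val_size) {
  stopifnot(
    is.factor(target), !is.null(names(target)),
    val_size > 0, val_size < 1
  )
  # Group the case names by category, then sample within each category.
  cases_by_category <- split(names(target), target, drop = TRUE)
  val_names <- unlist(lapply(cases_by_category, function(case_names) {
    # Assumption: the validation count per category is rounded to an integer.
    n_val <- round(length(case_names) * val_size)
    case_names[sample.int(length(case_names), size = n_val)]
  }), use.names = FALSE)
  list(
    target_train = target[!(names(target) %in% val_names)],
    target_test = target[val_names]
  )
}

set.seed(42)
labels <- factor(rep(c("a", "b", "c"), times = c(60, 30, 10)))
names(labels) <- paste0("case_", seq_along(labels))

parts <- stratified_split_sketch(labels, val_size = 0.2)
table(parts$target_train) # a: 48, b: 24, c: 8
table(parts$target_test)  # a: 12, b: 6,  c: 2
```

Both samples keep the 60/30/10 ratio of the initial labels, which is the property the description refers to. Which cases land in each sample depends on the random seed, but the counts per category do not.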
Arguments
- embedding
Object of class EmbeddedText.
- target
Named factor containing the labels of every case.
- val_size
double Ratio between 0 and 1 indicating the proportion of cases to use as the validation sample.
Value
Returns a list with the following components.
- target_train:
Named factor containing the labels of the training sample.
- embeddings_train:
Object of class EmbeddedText containing the text embeddings for the training sample.
- target_test:
Named factor containing the labels of the validation sample.
- embeddings_test:
Object of class EmbeddedText containing the text embeddings for the validation sample.
See also
Other Auxiliary Functions: array_to_matrix(), calc_standard_classification_measures(), check_embedding_models(), clean_pytorch_log_transformers(), create_iota2_mean_object(), create_synthetic_units(), generate_id(), get_coder_metrics(), get_folds(), get_n_chunks(), get_stratified_train_test_split(), get_synthetic_cases(), is.null_or_na(), matrix_to_array_c(), split_labeled_unlabeled(), summarize_tracked_sustainability(), to_categorical_c()