Create synthetic cases for balancing training data
Source: R/utils_DataManager.R
get_synthetic_cases_from_matrix.Rd
This function creates synthetic cases for balancing the training data of classifier models.
Usage
get_synthetic_cases_from_matrix(
matrix_form,
times,
features,
target,
sequence_length,
method = "knnor",
min_k = 1L,
max_k = 6L
)
Arguments
- matrix_form
Named matrix containing the text embeddings in matrix form.
- times
int Number of sequences/times.
- features
int Number of features within each sequence.
- target
Named factor containing the labels of the corresponding embeddings.
- sequence_length
int Length of the text embedding sequences.
- method
vector containing strings of the requested methods for generating new cases. Currently, only "knnor" from this package is available.
- min_k
int Minimum number of nearest neighbors used during the sampling process.
- max_k
int Maximum number of nearest neighbors used during the sampling process.
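As a rough illustration of the expected input shapes, the following sketch builds a toy matrix_form and target in base R. It assumes that each row of matrix_form holds one case flattened to times * features columns and that the row names match the names of target. The toy values, the case names, and the choice of sequence_length = times in the commented call are assumptions made for illustration and are not taken from the package.

```r
# Toy dimensions (hypothetical values for illustration only)
times <- 3L      # number of sequence positions per case
features <- 4L   # number of features per position
n_cases <- 6L

set.seed(42)

# Assumption: one row per case, flattened to times * features columns
matrix_form <- matrix(
  rnorm(n_cases * times * features),
  nrow = n_cases,
  ncol = times * features,
  dimnames = list(paste0("case_", seq_len(n_cases)), NULL)
)

# Named factor with imbalanced labels; names match the row names above
target <- factor(c("a", "a", "a", "a", "b", "b"))
names(target) <- rownames(matrix_form)

# Basic consistency checks before requesting synthetic cases
stopifnot(
  ncol(matrix_form) == times * features,
  identical(names(target), rownames(matrix_form))
)

# The call itself might then look like this (not run here):
# synthetic <- get_synthetic_cases_from_matrix(
#   matrix_form = matrix_form,
#   times = times,
#   features = features,
#   target = target,
#   sequence_length = times,  # assumption: same as times in this toy case
#   method = "knnor",
#   min_k = 1L,
#   max_k = 6L
# )
```

Checking the dimensions and names up front is a cheap way to catch mismatched inputs before any sampling takes place.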
Value
list with the following components:
- syntetic_embeddings: Named data.frame containing the text embeddings of the synthetic cases.
- syntetic_targets: Named factor containing the labels of the corresponding synthetic cases.
- n_syntetic_units: table showing the number of synthetic cases for every label/category.
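To make the idea of a per-label count concrete, the sketch below computes how many synthetic cases each label would need to reach the size of the largest class. This is only one plausible balancing target and is an assumption; the documentation does not state how the function decides the number of cases per label. The labels are toy data.

```r
# Toy named factor of labels (hypothetical data)
target <- factor(c("a", "a", "a", "a", "b", "b", "c"))
names(target) <- paste0("case_", seq_along(target))

# Observed frequency of every label
observed <- table(target)

# Assumption: balance every label up to the size of the largest class
needed <- max(observed) - observed

print(needed)
```

The result is again a table indexed by label, which mirrors the shape described for the per-label count component of the return value.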
See also
Other Utils Developers:
auto_n_cores(),
create_object(),
create_synthetic_units_from_matrix(),
generate_id(),
get_n_chunks(),
get_time_stamp(),
matrix_to_array_c(),
tensor_to_matrix_c(),
to_categorical_c()