Create synthetic cases for balancing training data
Source: R/utils_DataManager.R
get_synthetic_cases_from_matrix.Rd
This function creates synthetic cases for balancing the training data of classifier models.
Usage
get_synthetic_cases_from_matrix(
matrix_form,
times,
features,
target,
sequence_length,
method = "knnor",
min_k = 1L,
max_k = 6L
)
Arguments
- matrix_form
Named matrix containing the text embeddings in matrix form.
- times
int for the number of sequences/times.
- features
int for the number of features within each sequence.
- target
Named factor containing the labels of the corresponding embeddings.
- sequence_length
int Length of the text embedding sequences.
- method
vector containing strings of the requested methods for generating new cases. Currently "knnor" from this package is available.
- min_k
int The minimum number of nearest neighbors during the sampling process.
- max_k
int The maximum number of nearest neighbors during the sampling process.
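To make the relationship between the arguments more concrete, the sketch below builds toy inputs in base R. It assumes that matrix_form holds one row per case with times * features columns (each sequence flattened into a row) and that target carries the same names as the matrix rows; neither layout is confirmed on this page. The toy dimensions and the values passed for sequence_length and max_k are illustrative assumptions, and the call only runs if the function is already loaded in the session.

```r
set.seed(42)

# Toy dimensions (hypothetical values chosen for illustration)
n_cases  <- 12L
times    <- 3L   # number of sequences/times per case
features <- 4L   # number of features within each sequence

# Assumption: one row per case, flattened to times * features columns
matrix_form <- matrix(
  rnorm(n_cases * times * features),
  nrow = n_cases,
  ncol = times * features,
  dimnames = list(paste0("case_", seq_len(n_cases)), NULL)
)

# Imbalanced labels: 9 cases of class "a", 3 cases of class "b"
target <- factor(c(rep("a", 9), rep("b", 3)))
names(target) <- rownames(matrix_form)

# Only call the function if it is available in the current session
if (exists("get_synthetic_cases_from_matrix")) {
  synthetic <- get_synthetic_cases_from_matrix(
    matrix_form = matrix_form,
    times = times,
    features = features,
    target = target,
    sequence_length = times,  # assumption: every sequence has full length
    method = "knnor",
    min_k = 1L,
    max_k = 2L                # assumption: kept below the minority class size
  )
}
```

Keeping max_k below the size of the smallest class is a cautious choice for a toy data set this small, not a documented requirement.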
Value
list with the following components:
- syntetic_embeddings: Named data.frame containing the text embeddings of the synthetic cases.
- syntetic_targets: Named factor containing the labels of the corresponding synthetic cases.
- n_syntetic_units: table showing the number of synthetic cases for every label/category.
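As a rough illustration of how the returned list might be consumed, the following sketch uses a hand-built stand-in with the three component names documented above and appends the synthetic cases to a toy training set. All values are invented rather than real output, and the names original_embeddings, original_targets, syn_targets, and result are hypothetical.

```r
# Toy original data: 3 cases of class "a", 1 case of class "b"
original_embeddings <- data.frame(
  V1 = c(0.1, 0.2, 0.3, 0.9),
  V2 = c(1.0, 1.1, 1.2, 2.0),
  row.names = c("case_1", "case_2", "case_3", "case_4")
)
original_targets <- factor(c("a", "a", "a", "b"), levels = c("a", "b"))
names(original_targets) <- rownames(original_embeddings)

# Hand-built stand-in for the return value (values are made up)
syn_targets <- factor(c("b", "b"), levels = c("a", "b"))
names(syn_targets) <- c("syn_1", "syn_2")
result <- list(
  syntetic_embeddings = data.frame(
    V1 = c(0.85, 0.95),
    V2 = c(1.9, 2.1),
    row.names = c("syn_1", "syn_2")
  ),
  syntetic_targets = syn_targets,
  n_syntetic_units = table(syn_targets)
)

# Append the synthetic cases to the original training data
balanced_embeddings <- rbind(original_embeddings, result$syntetic_embeddings)

# Combine the labels via character vectors so the factor levels stay intact
balanced_targets <- factor(
  c(as.character(original_targets), as.character(result$syntetic_targets)),
  levels = levels(original_targets)
)
names(balanced_targets) <- c(
  names(original_targets),
  names(result$syntetic_targets)
)

# Class counts before and after adding the synthetic cases
table(original_targets)
table(balanced_targets)
```

Combining the labels through as.character() avoids relying on how c() treats factors, which differs between R versions.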