Create synthetic cases for balancing training data

This function creates synthetic cases for balancing the training data of an object of class TextEmbeddingClassifierNeuralNet.

Usage

get_synthetic_cases(
  embedding,
  times,
  features,
  target,
  method = c("smote"),
  max_k = 6
)

Arguments

embedding: Named data.frame containing the text embeddings. In most cases, this object is taken from EmbeddedText$embeddings.
times: int for the number of sequences/times.
features: int for the number of features within each sequence.
target: Named factor containing the labels of the corresponding embeddings.
method: Vector of strings naming the requested methods for generating new cases. Currently "smote", "dbsmote", and "adas" from the package smotefamily are available.
max_k: int for the maximum number of nearest neighbors used during the sampling process.
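
To make the role of the nearest-neighbor count concrete, the following is a minimal SMOTE-style sketch in base R. It is illustrative only: smote_sketch() and its arguments n_new and k are hypothetical names, and the code is not the implementation used by get_synthetic_cases() or smotefamily. Each synthetic case is placed at a random point on the line between a minority case and one of its k nearest neighbors, which is the kind of neighbor count that max_k bounds.

```r
# Minimal SMOTE-style sketch (hypothetical helper, not the package's code).
# x:     numeric matrix with one minority-class case per row
# n_new: number of synthetic cases to create
# k:     number of nearest neighbors to choose from
smote_sketch <- function(x, n_new, k = 5) {
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(n >= 2, n_new >= 1)
  k <- min(k, n - 1)            # cannot use more neighbors than other cases
  d <- as.matrix(dist(x))       # pairwise Euclidean distances
  synthetic <- matrix(NA_real_, nrow = n_new, ncol = ncol(x))
  for (i in seq_len(n_new)) {
    base <- sample.int(n, 1)                      # random minority case
    others <- setdiff(seq_len(n), base)           # exclude the case itself
    neighbors <- others[order(d[base, others])][seq_len(k)]
    partner <- neighbors[sample.int(length(neighbors), 1)]
    gap <- runif(1)                               # position on the segment
    synthetic[i, ] <- x[base, ] + gap * (x[partner, ] - x[base, ])
  }
  colnames(synthetic) <- colnames(x)
  synthetic
}

set.seed(1)
minority <- matrix(rnorm(5 * 4), nrow = 5)   # 5 minority cases, 4 features
new_cases <- smote_sketch(minority, n_new = 3, k = 6)  # k is capped at n - 1
dim(new_cases)  # 3 4
```

Because every synthetic case is an interpolation between two existing cases, it always lies within the range spanned by the original minority cases.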

Value

Returns a list with the following components:

syntetic_embeddings: Named data.frame containing the text embeddings of the synthetic cases.
syntetic_targets: Named factor containing the labels of the corresponding synthetic cases.
n_syntetic_units: Table showing the number of synthetic cases for every label/category.
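
As a sketch of how these components might be combined with the original data, the example below builds a mock result list by hand instead of calling get_synthetic_cases(). The objects embedding, target, and res, and all values in them, are made up for illustration. Only the component names come from the description above.

```r
# Hypothetical original training data: 4 cases, 3 embedding columns
embedding <- data.frame(
  matrix(1:12 / 10, nrow = 4),
  row.names = c("id1", "id2", "id3", "id4")
)
target <- factor(c("a", "a", "a", "b"))
names(target) <- rownames(embedding)

# Mock of the returned list (values are invented; names as documented)
syn_ids <- c("syn1", "syn2")
res <- list(
  syntetic_embeddings = data.frame(
    matrix(c(0.2, 0.6, 1.0, 0.3, 0.7, 1.1), nrow = 2, byrow = TRUE),
    row.names = syn_ids
  ),
  syntetic_targets = setNames(factor(c("b", "b")), syn_ids),
  n_syntetic_units = table(factor(c("b", "b")))
)

# Append the synthetic cases to the original embeddings and labels
balanced_embedding <- rbind(embedding, res$syntetic_embeddings)
balanced_target <- factor(
  c(as.character(target), as.character(res$syntetic_targets))
)
names(balanced_target) <- c(names(target), names(res$syntetic_targets))

table(balanced_target)  # both labels now have 3 cases
```

Converting the factors to character before combining them keeps the labels intact regardless of the R version, and the names are reattached explicitly so that labels and embedding rows stay aligned.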