Text embedding classifier with a neural net
Source: R/obj_TEClassifierRegular.R
TEClassifierRegular.Rd
Abstract class for neural nets with 'pytorch'.
This class is deprecated. Please use an object of class TEClassifierSequential instead.
Value
Objects of this class are used for assigning texts to classes/categories. Creating and training a classifier requires an object of class EmbeddedText or LargeDataSetForTextEmbeddings on the one hand and a factor on the other.
The object of class EmbeddedText or LargeDataSetForTextEmbeddings contains the numerical text representations (text embeddings) of the raw texts, generated by an object of class TextEmbeddingModel. To support large data sets, it is recommended to use LargeDataSetForTextEmbeddings instead of EmbeddedText.
The factor contains the classes/categories for every text. Missing values (unlabeled cases) are supported and can
be used for pseudo labeling.
For predictions, an object of class EmbeddedText or LargeDataSetForTextEmbeddings has to be used which was created with the same TextEmbeddingModel as for training.
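As a sketch, the workflow described above might look as follows in R. All object names (embedding_model, raw_texts, classifier, and so on) are placeholders assumed to exist already; the method names and argument names are taken from the package's documented API but should be checked against the version in use.

```r
# Sketch only: 'embedding_model' is an existing TextEmbeddingModel,
# 'raw_texts' the raw text data, 'classifier' a configured TEClassifierRegular.
library(aifeducation)

# 1. Numerical text representations created by the TextEmbeddingModel
#    (LargeDataSetForTextEmbeddings is recommended for large data sets).
embeddings <- embedding_model$embed_large(raw_texts)

# 2. Factor with the classes/categories; NA marks unlabeled cases,
#    which can be used for pseudo labeling.
categories <- factor(c("low", "high", NA, "low"),
                     levels = c("low", "high"))

# 3. Training and prediction must use embeddings created with the
#    same TextEmbeddingModel.
classifier$train(data_embeddings = embeddings, data_targets = categories)
predictions <- classifier$predict(newdata = embeddings)
```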
Note
This model requires pad_value=0. If this condition is not met, the
padding value is switched automatically.
Super classes
aifeducation::AIFEMaster -> aifeducation::AIFEBaseModel -> aifeducation::ModelsBasedOnTextEmbeddings -> aifeducation::ClassifiersBasedOnTextEmbeddings -> aifeducation::TEClassifiersBasedOnRegular -> TEClassifierRegular
Methods
Inherited methods
aifeducation::AIFEMaster$get_all_fields()
aifeducation::AIFEMaster$get_documentation_license()
aifeducation::AIFEMaster$get_ml_framework()
aifeducation::AIFEMaster$get_model_config()
aifeducation::AIFEMaster$get_model_description()
aifeducation::AIFEMaster$get_model_info()
aifeducation::AIFEMaster$get_model_license()
aifeducation::AIFEMaster$get_package_versions()
aifeducation::AIFEMaster$get_private()
aifeducation::AIFEMaster$get_publication_info()
aifeducation::AIFEMaster$get_sustainability_data()
aifeducation::AIFEMaster$is_configured()
aifeducation::AIFEMaster$is_trained()
aifeducation::AIFEMaster$set_documentation_license()
aifeducation::AIFEMaster$set_model_description()
aifeducation::AIFEMaster$set_model_license()
aifeducation::AIFEMaster$set_publication_info()
aifeducation::AIFEBaseModel$count_parameter()
aifeducation::ModelsBasedOnTextEmbeddings$get_text_embedding_model()
aifeducation::ModelsBasedOnTextEmbeddings$get_text_embedding_model_name()
aifeducation::ClassifiersBasedOnTextEmbeddings$adjust_target_levels()
aifeducation::ClassifiersBasedOnTextEmbeddings$check_embedding_model()
aifeducation::ClassifiersBasedOnTextEmbeddings$check_feature_extractor_object_type()
aifeducation::ClassifiersBasedOnTextEmbeddings$load_from_disk()
aifeducation::ClassifiersBasedOnTextEmbeddings$plot_coding_stream()
aifeducation::ClassifiersBasedOnTextEmbeddings$plot_training_history()
aifeducation::ClassifiersBasedOnTextEmbeddings$predict()
aifeducation::ClassifiersBasedOnTextEmbeddings$requires_compression()
aifeducation::ClassifiersBasedOnTextEmbeddings$save()
aifeducation::TEClassifiersBasedOnRegular$train()
Method configure()
Creating a new instance of this class.
Usage
TEClassifierRegular$configure(
name = NULL,
label = NULL,
text_embeddings = NULL,
feature_extractor = NULL,
target_levels = NULL,
bias = TRUE,
dense_size = 4L,
dense_layers = 0L,
rec_size = 4L,
rec_layers = 2L,
rec_type = "GRU",
rec_bidirectional = FALSE,
self_attention_heads = 0L,
intermediate_size = NULL,
attention_type = "Fourier",
add_pos_embedding = TRUE,
act_fct = "ELU",
parametrizations = "None",
rec_dropout = 0.1,
repeat_encoder = 1L,
dense_dropout = 0.4,
encoder_dropout = 0.1
)
Arguments
name
string. Name of the new model. Please refer to common naming conventions. Free text can be used with the parameter label. If set to NULL, a unique ID is generated automatically. Allowed values: any
label
string. Label for the new model. Here you can use free text. Allowed values: any
text_embeddings
EmbeddedText, LargeDataSetForTextEmbeddings. Object of class EmbeddedText or LargeDataSetForTextEmbeddings.
feature_extractor
TEFeatureExtractor. Object of class TEFeatureExtractor which should be used to reduce the number of dimensions of the text embeddings. If no feature extractor should be applied, set NULL.
target_levels
vector. Vector containing the levels (categories or classes) within the target data. Please note that order matters. For ordinal data, ensure that the levels are sorted correctly, with later levels indicating a higher category/class. For nominal data the order does not matter.
bias
bool. If TRUE, a bias term is added to all layers. If FALSE, no bias term is added.
dense_size
int. Number of neurons for each dense layer. Allowed values: 1 <= x
dense_layers
int. Number of dense layers. Allowed values: 0 <= x
rec_size
int. Number of neurons for each recurrent layer. Allowed values: 1 <= x
rec_layers
int. Number of recurrent layers. Allowed values: 0 <= x
rec_type
string. Type of the recurrent layers. rec_type = 'GRU' for Gated Recurrent Unit and rec_type = 'LSTM' for Long Short-Term Memory. Allowed values: 'GRU', 'LSTM'
rec_bidirectional
bool. If TRUE, a bidirectional version of the recurrent layers is used.
self_attention_heads
int. Number of attention heads for a self-attention layer. Only relevant if attention_type = 'MultiHead'. Allowed values: 0 <= x
intermediate_size
int. Size of the projection layer within each transformer encoder. Allowed values: 1 <= x
attention_type
string. Choose the attention type. Allowed values: 'Fourier', 'MultiHead'
add_pos_embedding
bool. TRUE if positional embedding should be used.
act_fct
string. Activation function for all layers. Allowed values: 'ELU', 'LeakyReLU', 'ReLU', 'GELU', 'Sigmoid', 'Tanh', 'PReLU'
parametrizations
string. Re-parametrization of the weights of layers. Allowed values: 'None', 'OrthogonalWeights', 'WeightNorm', 'SpectralNorm'
rec_dropout
double. Dropout between recurrent layers. Allowed values: 0 <= x <= 0.6
repeat_encoder
int. Number of times the encoder should be added to the network. Allowed values: 0 <= x
dense_dropout
double. Dropout between dense layers. Allowed values: 0 <= x <= 0.6
encoder_dropout
double. Dropout for the dense projection within the transformer encoder layers. Allowed values: 0 <= x <= 0.6
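A minimal configuration sketch, assuming an embeddings object and a target factor already exist. The argument values shown are illustrative choices within the allowed ranges, not recommendations; whether $new() takes further arguments should be checked against the installed package version.

```r
# Sketch only: 'embeddings' (EmbeddedText or LargeDataSetForTextEmbeddings)
# and the factor 'categories' are assumed to exist.
library(aifeducation)

classifier <- TEClassifierRegular$new()
classifier$configure(
  name = NULL,                        # NULL -> unique ID generated automatically
  label = "Example classifier",       # free text
  text_embeddings = embeddings,
  feature_extractor = NULL,           # no dimensionality reduction
  target_levels = levels(categories), # order matters for ordinal data
  rec_type = "GRU",
  rec_layers = 2L,
  rec_size = 32L,
  rec_bidirectional = FALSE,
  attention_type = "Fourier",         # 'MultiHead' would also need
                                      # self_attention_heads > 0
  repeat_encoder = 1L,
  dense_layers = 0L,
  rec_dropout = 0.1,
  dense_dropout = 0.4,
  encoder_dropout = 0.1
)
```

After configuration, the model can be trained via the inherited train() method and used for predictions via predict(), as listed under Inherited methods.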