1) Installation and Technical Requirements
Introduction
Several packages allow users to use machine learning directly in R such as nnet for single layer neural nets, rpart for decision trees, and ranger for random forests. Furthermore, with mlr3verse a series of packages exists for managing different algorithms with a unified interface.
These packages can be used with a ‘normal’ computer and provide an easy installation. In terms of natural language processing, these approaches are currently limited. State-of-the-art approaches rely on neural nets with multiple layers and consist of a huge number of parameters making them computationally demanding. With specialized libraries such as keras, PyTorch and tensorflow, graphical processing units (gpu) can help to speed up computations significantly. However, many of these specialized libraries for machine learning are written in python. Fortunately, an interface to python is provided via the R package reticulate.
The R package Artificial Intelligence for Education (aifeducation) aims to provide educators, educational researchers, and social researchers with a convincing interface to these state-of-the-art models for natural language processing and tries to address the special needs and challenges of the educational and social sciences. The package currently supports the application of Artificial Intelligence (AI) for tasks such as text embedding and classification.
Since state-of-the-art approaches in natural language processing rely on large models compared to classical statistical methods (e.g., latent class analysis, structural equation modeling) and are based largely on python, some additional installation steps are necessary.
If you would like to train and to develop your own models and AIs, a compatible graphic device is necessary. Even a low performing graphic device can speed up computations significantly. If you prefer to use pre-trained models however, this is not necessary. In this case, a ‘normal’ office computer without a graphic device should be sufficient in most cases. To get ready for using the package, two steps are necessary.
Step 1 - Install the R package
First you need to install the package. This can be done by:
install.packages("aifeducation")
With this command, aifeducation
is installed on your
machine.
Step 2 - Install Python and optional R packages
Since natural language processing with neural nets is based on models which are computationally intensive, PyTorch is used within this package together with some other specialized python libraries.
The most straightforward method for getting started is to call the
function install_aifeducation
as follows:
install_aifeducation(
install_aifeducation_studio = TRUE
)
This function will install python, miniconda, and all relevant python libraries into a conda environment called “aifeducation”.
In addition, we recommend to set
install_aifeducation_studio=TRUE
since this will install
the optional R packages necessary to use AI for Education -
Studio. AI for Education - Studio is a graphical user
interface for applying these packages. We recommend to use it for
everyone who is unfamiliar with R or machine learning. If you
do not want use the studio, you can set the argument to
FALSE
. In this case you have to use the package with
R syntax.
If you have a suitable machine and would like to use a graphic card for computations you need to install some further software. You can find further information here: https://pytorch.org/get-started/locally/
You can check if python is working by using the function
reticulate::py_available()
. This should return
TRUE
.
reticulate::py_available(initialize = TRUE)
You can check if all necessary python packages are successfully
installed by calling the function check_aif_py_modules
aifeducation::check_aif_py_modules()
Now everything is ready to use the package.
2) Starting a new session
The most convenient way to work with the package is to use AI for
Education - Studio which you can start by calling
aifeducation::start_aifeducation_studio()
.
In case you do not want to use the graphical user interface, you have
to prepare your R sessions. First, it is necessary that you set
up python via ‘reticulate’ and choose the conda environment where all
necessary python libraries are available. Then you can load
aifeducation
. In case you installed python as suggested in
this vignette, you may start a new session like this:
reticulate::use_condaenv(condaenv = "aifeducation")
library(aifeducation)
Note: Please remember: Every time you start a new session in R, you have to to set the correct conda environment and load the library
aifeducation
. This is not necessary if you use AI for Education - Studio.
3) Tutorials and Guides
A guide on how to use the graphical user interface can be found in vignette 02 Using the graphical user interface Aifeducation - Studio.
A short introduction into the package with examples for classification tasks can be found in vignette 03 Using R syntax.
Documenting and sharing your work is described in vignette 04 Sharing and using trained AI/models.
4) Update aifeducation
In case you already use aifeducation and you want to update
to a newer version of this package, it is recommended to update the used
python libraries as well. The easiest way is to remove the conda
environment “aifeducation” and to install the libraries into a fresh
environment. This can be done by setting remove_first=TRUE
in install_py_modules
.
aifeducation::install_py_modules(
remove_first = TRUE
)