Function written in C++ that estimates the parameters of the model via the Expectation-Maximization (EM) algorithm.

Usage

EM_algo_c(
  obs_pattern_shape,
  obs_pattern_frq,
  obs_internal_count,
  categorical_levels,
  random_starts,
  max_iterations,
  rel_convergence,
  con_step_size,
  con_random_starts,
  con_max_iterations,
  con_rel_convergence,
  fast,
  trace,
  con_trace
)

Arguments

obs_pattern_shape

Matrix containing the unique patterns found in the data. Ideally this matrix is generated by the function get_patterns().

obs_pattern_frq

Vector containing the frequencies of the patterns. Ideally it is generated by the function get_patterns().

obs_internal_count

Matrix containing the relative frequencies of each category within each pattern. Ideally this matrix is generated by the function get_patterns().

categorical_levels

Vector containing all possible categories of the content analysis.

random_starts

Integer specifying how many times the algorithm restarts with randomly chosen values for the Assignment Error Matrix and the categorical sizes.

max_iterations

Integer for determining the maximum number of iterations for each random start.

rel_convergence

Double for determining the convergence criterion. The algorithm stops if the relative change is smaller than this criterion.
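
A relative-change stopping rule of this kind can be sketched as follows. This page does not state which quantity the relative change refers to; the sketch assumes it is the log-likelihood of two successive iterations. The helper rel_change() and all numeric values are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of the package): relative change between two
# successive values of the monitored quantity, assumed to be the log-likelihood.
rel_change <- function(old, new) {
  abs(new - old) / abs(old)
}

rel_convergence <- 1e-6   # illustrative threshold, not a package default
ll_old <- -1234.5678      # made-up log-likelihood of the previous iteration
ll_new <- -1234.5677      # made-up log-likelihood of the current iteration

# The iteration would stop once the relative change falls below the criterion.
converged <- rel_change(ll_old, ll_new) < rel_convergence
```

Under this reading, a smaller rel_convergence demands more iterations before a start counts as converged, so it interacts with max_iterations.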

con_step_size

Double specifying the step size by which the probabilities are increased or decreased during the condition stage of estimation. This value should not be less than 1e-3.

con_random_starts

Integer for the number of random starts within the condition stage.

con_max_iterations

Integer for the maximum number of iterations during the condition stage.

con_rel_convergence

Double for determining the convergence criterion during the condition stage. The algorithm stops if the relative change is smaller than this criterion.

fast

Bool. If TRUE, a fast estimation is applied during the condition stage; this option ignores all parameters beginning with "con_". If FALSE, the estimation described in Berding and Pargmann (2022) is used. Default is TRUE.

trace

TRUE for printing progress information on the console. FALSE if this information should not be printed.

con_trace

TRUE for printing progress information on the console during estimations in the condition stage. FALSE if this information should not be printed.

Value

Function returns a list with the estimated parameter sets for every random start. Every parameter set contains the following components:

log_likelihood

Log likelihood of the estimated solution.

aem

Estimated Assignment Error Matrix (aem). The rows represent the true categories while the columns stand for the assigned categories. The cells describe the probability that a coding unit of category i is assigned to category j.
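
The layout of such a matrix can be illustrated with a small hand-made example. The category labels and probabilities below are invented for illustration and are not output of EM_algo_c(). The sketch assumes that every coding unit is assigned to exactly one category, so each row sums to 1.

```r
# Invented example of an Assignment Error Matrix for three categories.
categorical_levels <- c("A", "B", "C")
aem <- matrix(
  c(0.90, 0.05, 0.05,
    0.10, 0.80, 0.10,
    0.02, 0.08, 0.90),
  nrow = 3, byrow = TRUE,
  dimnames = list(true = categorical_levels, assigned = categorical_levels)
)

# Probability that a coding unit of true category "A" is assigned to "B".
p_a_as_b <- aem["A", "B"]

# Diagonal cells: probability that a unit is assigned to its true category.
p_correct <- diag(aem)

# Each row is a probability distribution over the assigned categories.
row_totals <- rowSums(aem)
```

Read this way, off-diagonal cells describe assignment errors, and a matrix close to the identity indicates few errors.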

categorial_sizes

Vector of estimated sizes for each category.

convergence

TRUE if the algorithm converged within the iteration limit, FALSE otherwise.

iteration

Number of iterations completed when the algorithm terminated.
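
Because the function returns one parameter set per random start, a caller usually has to pick one of them. A minimal sketch of one way to do this is shown below. The list `results` is a mock with invented values that only mimics the documented components; it is not real output. The sketch assumes that a larger log_likelihood indicates a better fit. If the component stores a negative log-likelihood instead, which.max() would need to become which.min().

```r
# Mock return value with invented numbers; only some documented components
# are included.
results <- list(
  list(log_likelihood = -510.2, convergence = TRUE,  iteration = 42L),
  list(log_likelihood = -498.7, convergence = TRUE,  iteration = 57L),
  list(log_likelihood = -495.1, convergence = FALSE, iteration = 100L)
)

# Keep only the random starts that converged within the iteration limit.
converged <- Filter(function(x) isTRUE(x$convergence), results)

# Among those, select the start with the largest log-likelihood
# (assumption: larger is better).
ll <- vapply(converged, function(x) x$log_likelihood, numeric(1))
best <- converged[[which.max(ll)]]
```

In this mock, the third start has the largest log-likelihood but is excluded because it did not converge.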

References

Berding, Florian, and Pargmann, Julia (2022). Iota Reliability Concept of the Second Generation. Measures for Content Analysis Done by Humans or Artificial Intelligences. Berlin: Logos. https://doi.org/10.30819/5581