Simple types: ex_basic

The script ex_basic.m contains code for low-dimensional examples of the three basic HMM types handled by H2M.

Ergodic model with full covariance matrices

Let X denote a matrix containing T observed vectors (one per row); the EM estimation of the parameters then takes the following form:
for i = 1:n_iter
  [alpha, beta, logscale, dens] = hmm_fb(X, A, pi0, mu, Sigma);
  logl(i) = log(sum(alpha(T,:))) + logscale;
  [A, pi0] = hmm_tran(alpha, beta, dens, A, pi0);
  [mu, Sigma] = hmm_dens(X, alpha, beta, COV_TYPE);
end;
COV_TYPE is a flag that should be set to 0 when using full covariance matrices and to 1 for diagonal covariance matrices.

Notice that at each step the log-likelihood is computed from the forward variables using a correction term logscale returned by hmm_fb (for forward-backward), which contains the sum of the logarithms of the scaling factors used during the computation of alpha and beta. In the present version, scaling of the forward variable is performed at each time index t = 2:T, which means that each row of the alpha matrix sums to one, except the first one. This systematic scaling appears to be much safer when the input data have a widely varying range. The backward variable is scaled using the same normalization factors, as indicated in [4] (using exactly the same normalization factors is important for the re-estimation of the coefficients of the transition matrix). Note that simply suppressing the scaling procedure would lead to numerical problems in almost every case of interest (for instance, whenever the length of the observation sequence is greater than 50), despite the double-precision representation used in MATLAB/OCTAVE.
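The scaling mechanism described above can be sketched as follows. This is an illustrative NumPy transcription, not the H2M code; the function name forward_scaled and all variable names are assumptions, and only the forward pass is shown:

```python
import numpy as np

def forward_scaled(dens, A, pi0):
    """Sketch of a scaled forward recursion (illustrative, not H2M's hmm_fb).

    dens : (T, N) array of state-conditional densities p(x_t | state j)
    A    : (N, N) transition matrix
    pi0  : (N,) initial state distribution
    Returns the scaled forward variables and the sum of the logarithms of
    the scaling factors (the role played by logscale in the loop above).
    """
    T, N = dens.shape
    alpha = np.empty((T, N))
    alpha[0] = pi0 * dens[0]          # first row left unscaled, as in H2M
    logscale = 0.0
    for t in range(1, T):
        a = (alpha[t - 1] @ A) * dens[t]
        c = a.sum()                   # scaling factor for time index t
        alpha[t] = a / c              # each scaled row sums to one
        logscale += np.log(c)
    return alpha, logscale

# The log-likelihood is then recovered exactly as in the loop above:
#   logl = log(sum(alpha[T-1])) + logscale
```

Because every scaled row of alpha sums to one, the log-likelihood is carried almost entirely by logscale, which never under- or overflows the way the raw product of densities would.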

If you don't want to see what's going on, you can simply use

[A, pi0, mu, Sigma, logl] = hmm(X, A, pi0, mu, Sigma, n_iter);
which calls the very same piece of code, except that the messages concerning the execution time are suppressed: all the computational functions print these messages by default, but this can be suppressed by supplying an optional argument (named QUIET) different from zero.

Left-right HMM

This case is not really different from the previous one, except that multiple observation sequences are now handled:
for i = 1:n_iter
  [A, logl(i), gamma] = hmm_mest(X, st, A, mu, Sigma);
  [mu, Sigma] = mix_par(X, gamma, COV_TYPE);
end;
In this case, the matrix X contains all the observation sequences stacked one after another, and the vector st gives the index of the first row of each sequence: X(1:st(2)-1,:) contains the vectors of the first observation sequence, and so on, up to X(st(length(st)):size(X,1),:), which corresponds to the last observation sequence.
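As an illustration of this indexing convention, here is a small NumPy sketch (0-based, unlike MATLAB's 1-based st; the function name split_sequences is hypothetical) of how a start-index vector delimits the sequences stacked in a single matrix:

```python
import numpy as np

def split_sequences(X, st):
    """Return the observation sequences stored row-wise in X (a sketch).

    st[k] is the row index (0-based) where sequence k begins; the last
    sequence runs from st[-1] to the end of X.
    """
    bounds = list(st) + [len(X)]
    return [X[bounds[k]:bounds[k + 1]] for k in range(len(st))]
```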

The transition parameters are re-estimated inside hmm_mest, and the a posteriori distributions of the states are returned in gamma. Once again, if you don't want to see what's happening, you can more simply use

[A, mu, Sigma, logl] = hmm(X, st, A, mu, Sigma, n_iter);

In fact, if you need to estimate the parameters of an ergodic model from multiple (independent) observation sequences, you may use hmm_mest as well, but note that hmm_mest assumes that pi0 = [1 0 ... 0] (i.e., that the Markov chain starts from the first state).

Mixture model

The EM estimation in the case of a mixture model is achieved through
for i = 1:n_iter
  [gamma, logl(i)] = mix_post(X, w, mu, Sigma);
  [mu, Sigma, w] = mix_par(X, gamma, COV_TYPE);
end;
or, more simply
[w, mu, Sigma, logl] = mix(X, w, mu, Sigma, n_iter);
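For readers who want to see what one pass of the mix_post/mix_par loop computes, here is an illustrative NumPy sketch of a single EM iteration for a diagonal-covariance Gaussian mixture. The function name em_step and the array layout are assumptions for the sketch; this is not the H2M code:

```python
import numpy as np

def em_step(X, w, mu, sigma2):
    """One EM iteration for a diagonal-covariance Gaussian mixture (sketch).

    X: (T, p) data, w: (N,) mixture weights, mu: (N, p) means,
    sigma2: (N, p) diagonal covariances.
    """
    T, p = X.shape
    # E step (role of mix_post): posterior probability of each component,
    # computed in the log domain for numerical safety
    diff = X[:, None, :] - mu[None, :, :]                     # (T, N, p)
    logdens = -0.5 * (np.sum(diff**2 / sigma2, axis=2)
                      + np.sum(np.log(2 * np.pi * sigma2), axis=1))
    logw = np.log(w) + logdens                                # (T, N)
    m = logw.max(axis=1, keepdims=True)
    gamma = np.exp(logw - m)
    norm = gamma.sum(axis=1, keepdims=True)
    logl = float(np.sum(m + np.log(norm)))                    # log-likelihood
    gamma = gamma / norm                                      # rows sum to one
    # M step (role of mix_par): weighted re-estimation of the parameters
    Nk = gamma.sum(axis=0)                                    # (N,)
    w_new = Nk / T
    mu_new = (gamma.T @ X) / Nk[:, None]
    diff = X[:, None, :] - mu_new[None, :, :]
    sigma2_new = np.einsum('tn,tnp->np', gamma, diff**2) / Nk[:, None]
    return w_new, mu_new, sigma2_new, gamma, logl
```

As in the HMM case, the log-likelihood returned at each iteration is evaluated with the parameters produced by the previous iteration, so it is non-decreasing over the loop.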



Olivier Cappé, Aug 24 2001