ex_basic.m contains some code corresponding to low-dimensional examples of the three basic HMM types that are handled by H2M. Letting X denote a matrix containing T observed vectors, the EM estimation of the parameters takes the following form:
for i = 1:n_iter
  [alpha, beta, logscale, dens] = hmm_fb(X, A, pi0, mu, Sigma);
  logl(i) = log(sum(alpha(T,:))) + logscale;
  [A, pi0] = hmm_tran(alpha, beta, dens, A, pi0);
  [mu, Sigma] = hmm_dens(X, alpha, beta, COV_TYPE);
end;
COV_TYPE
is just a flag that should be set to 0 when using full
covariance matrices and to 1 for diagonal covariance matrices.
Notice that at each step, the log-likelihood is computed from the forward variables using a correction term logscale returned by hmm_fb (fb stands for forward-backward), which contains the sum of the logarithmic scaling factors used during the computation of alpha and beta. In the present version, scaling of the forward variable is performed at each time index t = 2:T (which means that each row of the alpha matrix sums to one, except the first one). This systematic scaling appears to be much safer when using input data with widely varying range. The backward variable is scaled using the same normalization factors, as indicated in [4] (using exactly the same normalization factors is important for the re-estimation of the coefficients of the transition matrix). Note that the mere suppression of the scaling procedure would lead to numerical problems in almost every case of interest (for instance, when the length of the observation sequences is greater than 50), despite the double-precision representation used in MATLAB/OCTAVE.
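To make the role of logscale concrete, here is a minimal pure-Python sketch of the scaled forward recursion (illustration only: the function name and list-based data layout are mine, and H2M's hmm_fb additionally computes the backward variables and densities):

```python
import math

def forward_scaled(A, pi0, dens):
    """Scaled forward recursion for an HMM (illustrative sketch).

    A    : N x N transition matrix (list of lists)
    pi0  : initial state distribution (length N)
    dens : T x N observation densities b_j(x_t)
    Returns the scaled forward variables and the log-likelihood,
    recovered through the accumulated logarithmic scaling factors.
    """
    N = len(pi0)
    T = len(dens)
    # first row is left unscaled, as in the scheme described above
    alpha = [[pi0[j] * dens[0][j] for j in range(N)]]
    logscale = 0.0
    for t in range(1, T):
        a_t = [sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * dens[t][j]
               for j in range(N)]
        scale = sum(a_t)               # normalization factor for time t
        logscale += math.log(scale)    # accumulate its logarithm
        alpha.append([v / scale for v in a_t])
    # log-likelihood = log of the last (scaled) row sum plus the correction
    logl = math.log(sum(alpha[T - 1])) + logscale
    return alpha, logl
```

Since each scaled row sums to one, the log-likelihood is carried almost entirely by the logscale correction term, which is why suppressing the scaling (and the correction) quickly underflows for long sequences.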
If you don't want to see what's going on, you can simply use

[A, pi0, mu, Sigma, logl] = hmm(X, A, pi0, mu, Sigma, n_iter);

which calls the very same piece of code, except that the messages concerning the execution time are suppressed: all the computational functions print those messages by default, but this can be disabled by supplying an optional argument (named QUIET) different from zero.
for i = 1:n_iter
  [A, logl(i), gamma] = hmm_mest(X, st, A, mu, Sigma);
  [mu, Sigma] = mix_par(X, gamma, COV_TYPE);
end;

In this case, the matrix X contains all the observation sequences, and the vector st yields the index corresponding to the beginning of each sequence, so that X(1:st(2)-1,:) contains the vectors that correspond to the first observation sequence, and so on until X(st(length(st)):size(X,1),:), which corresponds to the last observation sequence.
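The indexing convention for st can be sketched as follows (a hypothetical helper, not part of H2M, written in Python but keeping the 1-based start indices of the MATLAB code):

```python
def split_sequences(X, st):
    """Split a concatenated observation matrix X (list of row vectors)
    into the individual sequences whose 1-based start rows are listed
    in st, mirroring the H2M convention described above."""
    # append a sentinel one past the last row so the slice below
    # also captures the final sequence
    bounds = list(st) + [len(X) + 1]
    return [X[bounds[k] - 1 : bounds[k + 1] - 1] for k in range(len(st))]
```

For example, with five observation rows and st = [1, 3], the first sequence contains rows 1-2 and the second rows 3-5.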
The transition parameters are re-estimated inside hmm_mest, and the a posteriori distributions of the states are returned in gamma. Once again, if you don't want to see what's happening, you can more simply use
[A, mu, Sigma, logl] = hmm(X, st, A, mu, Sigma, n_iter);
In fact, if you need to estimate the parameters of an ergodic model using multiple (independent) observation sequences, you may use hmm_mest as well, but hmm_mest assumes that pi0 = [1 0 ... 0] (i.e. that the Markov chain starts from the first state).
for i = 1:n_iter
  [gamma, logl(i)] = mix_post(X, w, mu, Sigma);
  [mu, Sigma, w] = mix_par(X, gamma, COV_TYPE);
end;

or, more simply
[w, mu, Sigma, logl] = mix(X, w, mu, Sigma, n_iter);
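The mix_post / mix_par pair implements a standard EM iteration for a Gaussian mixture: mix_post computes the component posteriors and the log-likelihood, and mix_par re-estimates the weights, means and (co)variances. Here is a self-contained one-dimensional Python sketch of the same computations (illustrative only; the function name and scalar-variance parameterization are mine):

```python
import math

def mix_em_1d(x, w, mu, sigma2, n_iter):
    """EM for a 1-D Gaussian mixture: w = weights, mu = means,
    sigma2 = variances. Returns the updated parameters and the
    per-iteration log-likelihoods."""
    logl = []
    for _ in range(n_iter):
        # E step (analogue of mix_post): posterior of each component
        post = []
        ll = 0.0
        for xt in x:
            p = [w[j] / math.sqrt(2 * math.pi * sigma2[j])
                 * math.exp(-(xt - mu[j]) ** 2 / (2 * sigma2[j]))
                 for j in range(len(w))]
            s = sum(p)
            ll += math.log(s)
            post.append([pj / s for pj in p])
        logl.append(ll)
        # M step (analogue of mix_par): weighted means, variances, weights
        for j in range(len(w)):
            nj = sum(post[t][j] for t in range(len(x)))
            mu[j] = sum(post[t][j] * x[t] for t in range(len(x))) / nj
            sigma2[j] = sum(post[t][j] * (x[t] - mu[j]) ** 2
                            for t in range(len(x))) / nj
            w[j] = nj / len(x)
    return w, mu, sigma2, logl
```

As with any EM iteration, the log-likelihood sequence returned here is non-decreasing, which is a convenient sanity check when running the MATLAB loop above.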
Olivier Cappé, Aug 24 2001