Next: Simple types: ex_basic Up: Models with multivariate Gaussian Previous: Models with multivariate Gaussian

Data structures

No specific data structures are used: an HMM with multivariate Gaussian state-conditional distributions simply consists of:
pi0
Row vector containing the probability distribution for the first (unobserved) state: $ \pi_0(i) = P(s_1 = i).$
A
Transition matrix: $ a_{ij} = P(s_{t+1} = j \vert s_t = i).$
mu
Mean vectors (of the state-conditional distributions) stacked as row vectors, such that mu(i,:) is the mean (row) vector corresponding to the i-th state of the HMM.
Sigma
Covariance matrices. These are stored one above the other, in two different ways depending on whether full or diagonal covariance matrices are used: for full covariance matrices,
Sigma((1+(i-1)*p):(i*p),:)
(where p is the dimension of the observation vectors) is the covariance matrix corresponding to the i-th state; for diagonal covariance matrices, Sigma(i,:) contains the diagonal of the covariance matrix for the i-th state (i.e. the diagonal coefficients stored as row vectors).
For a left-right HMM, pi0 is assumed to be deterministic (i.e. pi0 = [1 0 ... 0]) and A can be made sparse in order to save memory (A should be upper triangular for a left-right model). Using sparse matrices is, however, not possible if you want to compile your m-files using mcc (MATLAB) or if you are using OCTAVE.
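As an illustration, the parameters of a small left-right HMM with three-dimensional observations and full covariance matrices could be set up as follows (the numerical values are arbitrary and purely illustrative):

```matlab
% Two-state left-right HMM with 3-dimensional observations (arbitrary values)
N = 2;                      % number of states
p = 3;                      % dimension of the observation vectors
pi0 = [1 0];                % deterministic initial distribution (left-right)
A = [0.9 0.1; 0 1];         % upper triangular transition matrix
mu = [0 0 0; 1 2 3];        % mean vectors stacked as rows: mu(i,:) for state i
Sigma = [eye(p); 2*eye(p)]; % full covariances stacked vertically (N*p rows)
% Covariance matrix of the i-th state, using the indexing described above:
i = 2;
Sigma_i = Sigma((1+(i-1)*p):(i*p),:);
```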

A Gaussian mixture model is rather similar, except that, the underlying jump process being i.i.d., pi0 and A are replaced by a single row vector w containing the mixture weights, defined by $ w(i) = P(s_t = i).$
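For instance, a three-component mixture using the same mu and Sigma layout (here with diagonal covariance matrices; the values are again arbitrary) would look like:

```matlab
% Three-component Gaussian mixture in dimension 2 (arbitrary values)
w = [0.5 0.3 0.2];          % mixture weights: w(i) = P(s_t = i), sum(w) == 1
mu = [0 0; 3 3; -3 3];      % component means stacked as row vectors
Sigma = [1 1; 2 0.5; 1 4];  % diagonal covariances: Sigma(i,:) is the diagonal
```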

Most functions (those that have mu and Sigma among their input arguments) are able to determine the dimensions of the model (size of observation vectors and number of states) and the type of covariance matrices (full or diagonal) from the size of their input arguments. This is achieved by the two functions hmm_chk and mix_chk.
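The logic behind these size checks can be sketched as follows (a simplified reconstruction for illustration, not the actual code of hmm_chk or mix_chk; note that when p = 1 the two covariance layouts have the same size and cannot be distinguished this way):

```matlab
% Sketch of how model dimensions and covariance type follow from sizes
N = size(mu, 1);            % number of states = number of rows of mu
p = size(mu, 2);            % dimension of the observation vectors
if (size(Sigma, 1) == N*p) && (size(Sigma, 2) == p)
  cov_type = 'full';        % N covariance matrices stacked vertically
elseif (size(Sigma, 1) == N) && (size(Sigma, 2) == p)
  cov_type = 'diag';        % each row is the diagonal of a covariance matrix
else
  error('Incompatible sizes for mu and Sigma');
end
```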

For more specialized variables such as those that are used during the forward-backward recursions, I have tried to use the notations of L. R. Rabiner in [4] (or [3]) which seem pretty standard:

alpha
Forward variables: $ \alpha_{t}(i) = P(X_1, \ldots, X_t, S_t = i)$.
beta
Backward variables: $ \beta_{t}(i) = P(X_{t+1}, \ldots, X_T \vert S_t = i)$.
gamma
A posteriori distributions of the states:

$\displaystyle \gamma_{t}(i) = P(S_t = i \vert X_{1}, \ldots, X_T)
$

I have also tried to systematically follow the convention of multivariate data analysis that matrices should have ``more rows than columns'', so that the observation vectors are stacked in X as row vectors (the number of observed vectors usually being greater than their dimension). The same is true for alpha, beta and gamma, which are T*N matrices (where T is the number of observation vectors and N the number of states).
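With these conventions, the a posteriori state distributions follow from the forward and backward variables by row-wise normalization, since $ \gamma_t(i) \propto \alpha_t(i)\,\beta_t(i)$. A sketch (assuming alpha and beta are T*N matrices as above):

```matlab
% gamma_t(i) is proportional to alpha_t(i)*beta_t(i); normalize each row
gamma = alpha .* beta;
gamma = gamma ./ (sum(gamma, 2) * ones(1, size(gamma, 2)));  % rows sum to one
```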



Olivier Cappé, Aug 24 2001