| Title: | Hidden Markov Model by Matrix and Tensor Decomposition |
|---|---|
| Description: | Solves Hidden Markov Models (HMMs) via matrix and tensor decomposition. Converts observation sequences to co-occurrence matrices/tensors and applies Symmetric Non-negative Matrix Factorization (symNMF), Singular Value Decomposition (SVD), CANDECOMP/PARAFAC (CP) decomposition, or Tensor-Train (TT) decomposition to recover HMM parameters. Also provides standard HMM algorithms (Forward, Backward, Viterbi, Baum-Welch) for comparison. The spectral learning approach for HMMs is based on Hsu, Kakade, and Zhang (2012) <doi:10.1016/j.jcss.2011.12.025>. The symNMF method is described in Kuang, Yun, and Park (2015) <doi:10.1007/s10898-014-0247-2>. The Tensor-Train decomposition is described in Oseledets (2011) <doi:10.1137/090752286>. |
| Authors: | Koki Tsuyuzaki [aut, cre] |
| Maintainer: | Koki Tsuyuzaki <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-28 11:10:49 UTC |
| Source: | https://github.com/cran/hmmTensor |
Computes backward probabilities
using the same scaling factors from the Forward algorithm.
Backward(Y, T_mat, O, scale)Backward(Y, T_mat, O, scale)
Y |
Integer vector of observations (values in 1:N) |
T_mat |
Transition matrix (K x K), |
O |
Emission matrix (K x N), |
scale |
Scaling factors from |
Matrix (K x T_len) of scaled backward probabilities
Estimates HMM parameters (T, O, pi) from observation sequences using the Expectation-Maximization algorithm.
BaumWelch( Y, K, N, initT = NULL, initO = NULL, initPi = NULL, num.iter = 100L, thr = 1e-06, verbose = FALSE )BaumWelch( Y, K, N, initT = NULL, initO = NULL, initPi = NULL, num.iter = 100L, thr = 1e-06, verbose = FALSE )
Y |
Integer vector of observations (values in 1:N), or a list of integer vectors for multiple sequences. |
K |
Number of hidden states |
N |
Number of distinct observation symbols |
initT |
Initial transition matrix (K x K). If NULL, random initialization. |
initO |
Initial emission matrix (K x N). If NULL, random initialization. |
initPi |
Initial state distribution (length K). If NULL, uniform. |
num.iter |
Maximum EM iterations (default: 100) |
thr |
Convergence threshold on log-likelihood relative change (default: 1e-6) |
verbose |
Logical (default: FALSE) |
A list with components:
Estimated transition matrix (K x K)
Estimated emission matrix (K x N)
Estimated initial distribution (length K)
Vector of log-likelihoods per iteration
Logical
Number of iterations
Computes forward probabilities
with log-scaling to avoid underflow.
Forward(Y, T_mat, O, pi0)Forward(Y, T_mat, O, pi0)
Y |
Integer vector of observations (values in 1:N) |
T_mat |
Transition matrix (K x K), |
O |
Emission matrix (K x N), |
pi0 |
Initial state distribution (length K) |
A list with components:
Matrix (K x T_len) of scaled forward probabilities
Log-likelihood of the observation sequence
Vector of scaling factors (length T_len)
Estimates HMM parameters by decomposing the observation co-occurrence matrix/tensor. Supports multiple decomposition solvers.
HMM( Y, K, N = NULL, solver = c("symNMF", "SVD", "CP", "TT"), Beta = 2, order = 2L, lag = 1L, smooth = 1e-10, num.iter = 100L, thr = 1e-10, verbose = FALSE )HMM( Y, K, N = NULL, solver = c("symNMF", "SVD", "CP", "TT"), Beta = 2, order = 2L, lag = 1L, smooth = 1e-10, num.iter = 100L, thr = 1e-10, verbose = FALSE )
Y |
Integer vector of observations (values in 1:N), or a list of integer vectors for multiple sequences. |
K |
Number of hidden states |
N |
Number of distinct observation symbols. If NULL, inferred from data. |
solver |
Decomposition method:
|
Beta |
Beta-divergence parameter for symNMF (default: 2). Ignored for other solvers. |
order |
Co-occurrence order for Seq2Prob: 2 (default) or 3.
|
lag |
Lag for pairwise co-occurrence (default: 1) |
smooth |
Laplace smoothing pseudo-count (default: 1e-10) |
num.iter |
Maximum iterations for iterative solvers (default: 100) |
thr |
Convergence threshold (default: 1e-10) |
verbose |
Logical (default: FALSE) |
The pipeline:
Convert observations to co-occurrence matrix via Seq2Prob
Decompose using the chosen solver
Recover HMM parameters (T, O, pi) from the decomposition
For NMF-based methods, where
M relates to the emission matrix O and to diag(pi) %*% T.
A list with components:
Estimated transition matrix (K x K)
Estimated emission matrix (K x N)
Estimated initial distribution (length K)
Co-occurrence matrix/tensor used
Raw decomposition result
Solver used
Reconstruction error (if available)
set.seed(42) toy <- toyModel(type = "simple") result <- HMM(toy$Y, K = toy$K, N = toy$N, solver = "symNMF") result$T_matset.seed(42) toy <- toyModel(type = "simple") result <- HMM(toy$Y, K = toy$K, N = toy$N, solver = "symNMF") result$T_mat
Constructs empirical co-occurrence statistics from observation sequences. These form the basis for matrix/tensor decomposition approaches to HMM.
Seq2Prob(Y, N = NULL, order = 2L, lag = 1L, smooth = 0)Seq2Prob(Y, N = NULL, order = 2L, lag = 1L, smooth = 0)
Y |
Integer vector of observations (values in 1:N), or a list of integer vectors for multiple sequences. |
N |
Number of distinct observation symbols. If NULL, inferred from data. |
order |
Co-occurrence order: 2 (pairwise, default) or 3 (triple).
Order 2 gives an N x N matrix |
lag |
Lag for pairwise co-occurrence (default: 1).
|
smooth |
Laplace smoothing pseudo-count (default: 0).
Adds |
For order = 2: an N x N matrix (normalized).
For order = 3: an rTensor Tensor object of dimension N x N x N.
Y <- c(1, 2, 3, 1, 2, 3, 1, 2, 3) P2 <- Seq2Prob(Y, N = 3, order = 2) P3 <- Seq2Prob(Y, N = 3, order = 3)Y <- c(1, 2, 3, 1, 2, 3, 1, 2, 3) P2 <- Seq2Prob(Y, N = 3, order = 2) P3 <- Seq2Prob(Y, N = 3, order = 3)
Creates synthetic HMM data with visually clear structure for demonstrations and testing.
toyModel(type = c("simple", "weather", "leftright"), T_len = 500L, seed = NULL)toyModel(type = c("simple", "weather", "leftright"), T_len = 500L, seed = NULL)
type |
Type of toy model:
|
T_len |
Length of observation sequence (default: 500) |
seed |
Random seed (default: NULL) |
A list with components:
Integer vector of observations
Integer vector of true hidden states
True transition matrix
True emission matrix
True initial distribution
Number of hidden states
Number of observation symbols
Model type
toy <- toyModel("simple", T_len = 200, seed = 42) table(toy$X) table(toy$Y)toy <- toyModel("simple", T_len = 200, seed = 42) table(toy$X) table(toy$Y)
Finds the most likely state sequence given the observations using the Viterbi algorithm (log-space).
Viterbi(Y, T_mat, O, pi0)Viterbi(Y, T_mat, O, pi0)
Y |
Integer vector of observations (values in 1:N) |
T_mat |
Transition matrix (K x K), |
O |
Emission matrix (K x N), |
pi0 |
Initial state distribution (length K) |
A list with components:
Integer vector of most likely states (length T_len)
Log-likelihood of the best path