Prediction step for time series using continuous hidden Markov models

I am trying to predict stock market prices using a Gaussian HMM. I do not understand how the prediction step is done after the model has been trained, or how exactly predicting the most likely state sequence helps to predict a future value.

An answer to a related question suggests this method: "Use the Viterbi algorithm with the (partial) sequence to obtain the most likely hidden-state-sequence. Take the emission distribution of the last hidden state in this sequence and predict e.g. the mean of that distribution (which often is Gaussian)."

I did not understand what to do after predicting the most likely state sequence.

I have trained my model using the functions available in hmmlearn in Python. I have also applied the Viterbi algorithm over the sample to predict the possible hidden state sequence, but I have no idea what to do after that. I am not good with the maths of continuous HMMs. Please tell me how exactly the prediction is done.

Code:

import numpy as np
from hmmlearn import hmm
import pandas as pd

np.random.seed(42)
model = hmm.GaussianHMM(n_components=3, covariance_type="full", algorithm="viterbi")
# Initial parameter guesses. Start probabilities must sum to 1.
# Note: with the default init_params, fit() re-initializes these values.
model.startprob_ = np.array([0.3, 0.4, 0.3])
model.transmat_ = np.array([[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.3, 0.3, 0.4]])
model.means_ = np.array([[0.0], [3.0], [5.0]])
model.covars_ = np.tile(np.identity(1), (3, 1, 1))

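# Training data: closing prices, reversed
# (assuming the CSV lists the most recent quotes first).
df = pd.read_csv("HistoricalQuotes.csv")
Y = df['close'][2:40]
Y = Y[::-1]
X = np.array(Y)
X = np.reshape(X, (-1, 1))  # hmmlearn expects shape (n_samples, n_features)

model.fit(X)

# Second slice of the data, prepared the same way as the training data
Y = df['close'][40:55]
Y = Y[::-1]
X = np.array(Y)
X = np.reshape(X, (-1, 1))

# Most likely hidden state sequence (Viterbi)
Z = model.predict(X)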

You are not so far from your goal!

"I have also applied the Viterbi algorithm over the sample to predict the possible hidden state sequence"

With the Viterbi algorithm you actually predicted the most likely sequence of hidden states. The last state in that sequence (Z[-1] in your code) is the most probable state for the last sample of the time series you passed as input.

In order to predict the next sample, you need to estimate which state the next emission is most likely to come from.

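To do this, you can use the state transition matrix that was estimated during the training phase, i.e., the updated value of model.transmat_. The row of that matrix corresponding to the last state gives the probability of each possible next state, and the most likely next state is the one with the largest entry in that row.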
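A minimal sketch of that lookup, assuming the last decoded state and the transition matrix are already at hand. The matrix below simply reuses the initial values from the question for illustration (after training you would pass model.transmat_ and Z[-1] instead), and the helper name next_state_distribution is hypothetical, not part of hmmlearn.

```python
import numpy as np


def next_state_distribution(transmat, last_state):
    """Return P(next state | current state == last_state) as a 1-D array."""
    transmat = np.asarray(transmat, dtype=float)
    # Row i of the transition matrix holds the probabilities of moving
    # from state i to each of the states.
    return transmat[last_state]


# Illustrative values only: after fitting, use model.transmat_ and Z[-1].
transmat = np.array([[0.7, 0.2, 0.1],
                     [0.3, 0.5, 0.2],
                     [0.3, 0.3, 0.4]])
last_state = 1

next_probs = next_state_distribution(transmat, last_state)
next_state = int(np.argmax(next_probs))  # most likely state for the next sample
print(next_probs, next_state)
```

Taking the argmax keeps only the single most likely next state; the full row next_probs is still available if you want to see how confident that choice is.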

Once the most likely state for the next sample is predicted, you can use the Gaussian distribution associated with that state. Let's say you predicted state K; then the parameters of the Gaussian distribution are found in the updated values of model.means_[K] and model.covars_[K] (by updated, I mean updated during the training phase).

Then you have several choices: you can either draw a random sample from the Gaussian distribution or set the new sample to the mean of the Gaussian. That depends on your goals and on the problem you are solving.
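Both choices can be sketched as follows, assuming the predicted state K is already known. The means and covariances are the illustrative initial values from the question (after training you would pass model.means_ and model.covars_), and predict_next_value is a hypothetical helper name.

```python
import numpy as np


def predict_next_value(means, covars, state, rng=None):
    """Predict the next observation from the Gaussian of the given state.

    Returns the mean of that Gaussian if rng is None, otherwise a random
    draw from it using the supplied NumPy random generator.
    """
    mean = np.asarray(means, dtype=float)[state]
    if rng is None:
        return mean
    cov = np.asarray(covars, dtype=float)[state]
    return rng.multivariate_normal(mean, cov)


# Illustrative values only: after fitting, use model.means_ and model.covars_.
means = np.array([[0.0], [3.0], [5.0]])
covars = np.tile(np.identity(1), (3, 1, 1))
K = 1  # predicted state for the next sample

point_prediction = predict_next_value(means, covars, K)
random_draw = predict_next_value(means, covars, K, rng=np.random.default_rng(42))
print(point_prediction, random_draw)
```

The mean gives a single deterministic forecast, while repeated random draws give a sense of the spread around it.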
