This follows up on my earlier question, "How to get constant term in AR Model with statsmodels and Python?". I'm now trying to use the ARMA model to fit the data, but again I can't find a way to interpret the model's result. Here is what I have done, based on "ARMA out-of-sample prediction with statsmodels" and the ARMAResults.predict API documentation.
import matplotlib.pyplot as plt

# Parameters
INPUT_DATA_POINT = 200
P = 5
Q = 0
# Read Data
data = []
with open('stock_all.csv', 'r') as f:
    for line in f:
        # Column index 5 holds the value to model
        data.append(float(line.split(',')[5]))
# Fit the ARMA model on the first INPUT_DATA_POINT observations
# (arma_model is a helper that is not shown here)
result = arma_model(data[:INPUT_DATA_POINT], P, Q)
# Predict using the model (why is the length of fit len(data) + 1?)
fit = result.predict(0, len(data))
# Plot
plt.figure(facecolor='white')
plt.title('ARMA Model Fitted Using ' + str(INPUT_DATA_POINT) + ' Data Points, P=' + str(P) + ' Q=' + str(Q) + '\n')
plt.plot(data, 'b-', label='data')
plt.plot(range(INPUT_DATA_POINT), result.fittedvalues, 'g--', label='fit')
plt.plot(range(len(data)), fit[:len(data)], 'r-', label='predict')
plt.legend(loc=4)
plt.show()
Here is the result, which is very strange because it should be nearly identical to the result from my previous question (linked above). I also don't quite understand why there are results for the first few data points, since those shouldn't be valid (there are no previous values to compute them from).
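One possible reading of the len(data) + 1 length noted in the code comment above is that the end index passed to predict is inclusive. The sketch below only illustrates that counting; prediction_indices is a hypothetical helper and does not call statsmodels.

```python
def prediction_indices(start, end):
    """Return the observation indices covered when `end` is inclusive.

    Assumption: predict(start, end) covers start, start + 1, ..., end,
    which gives end - start + 1 values.
    """
    return list(range(start, end + 1))


if __name__ == "__main__":
    n_obs = 200  # stands in for len(data)
    indices = prediction_indices(0, n_obs)
    # Indices 0..199 line up with observed data; index 200 is one step past it.
    print(len(indices), indices[-1])
```

Under that assumption, result.predict(0, len(data) - 1) would return exactly len(data) values.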
I tried writing my own prediction code, shown below (the top part, which is identical to the code above, is omitted):
# Predict using model
start_pos = max(result.k_ar, result.k_ma)
fit = []
for t in range(start_pos, len(data)):
    value = 0
    for i in range(1, result.k_ar + 1):
        value += result.arparams[i - 1] * data[t - i]
    for i in range(1, result.k_ma + 1):
        value += result.maparams[i - 1] * data[t - i]
    fit.append(value)
# Plot
plt.figure(facecolor='white')
plt.title('ARMA Model Fitted Using ' + str(INPUT_DATA_POINT) + ' Data Points, P=' + str(P) + ' Q=' + str(Q) + '\n')
plt.plot(data, 'b-', label='data')
plt.plot(range(INPUT_DATA_POINT), result.fittedvalues, 'r+', label='fit')
plt.plot(range(start_pos, len(data)), fit, 'r-', label='predict')
plt.legend(loc=4)
plt.show()
This is the best result I got.
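For comparison, here is a minimal sketch of a one-step-ahead ARMA recursion written without statsmodels. It rests on assumptions that are not confirmed anywhere in this post: that the fitted constant is the process mean (so the AR terms act on deviations from it), that the MA terms multiply past one-step residuals rather than past data values, and that residuals before the start position are zero. The function name one_step_arma and its arguments are hypothetical, and the output is not claimed to match result.fittedvalues.

```python
def one_step_arma(data, const, arparams, maparams):
    """One-step-ahead predictions for t >= max(p, q).

    Assumed model (const treated as the process mean):
        y[t] - const = sum_i ar[i] * (y[t-1-i] - const)
                     + sum_j ma[j] * e[t-1-j] + e[t]
    Residuals before the start position are assumed to be zero.
    """
    p, q = len(arparams), len(maparams)
    start = max(p, q)
    resid = [0.0] * len(data)
    preds = []
    for t in range(start, len(data)):
        value = const
        # AR part: deviations of past observations from the mean
        for i in range(1, p + 1):
            value += arparams[i - 1] * (data[t - i] - const)
        # MA part: past one-step prediction errors, not past data
        for j in range(1, q + 1):
            value += maparams[j - 1] * resid[t - j]
        preds.append(value)
        resid[t] = data[t] - value
    return preds


if __name__ == "__main__":
    # Toy AR(1) example with made-up numbers
    print(one_step_arma([10, 12, 11], 10, [0.5], []))
```

With Q = 0 the MA loop never runs, so the only difference from the loop above is how the constant enters the prediction.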
Should value = 0 be value = result.params[0]? If my code is correct, the first 200 data points should be equal to result.fittedvalues, right? But in this case they aren't. Please correct me if I'm wrong. – Roselba