I've been trying to get a prediction for future values in a model I've created. I have tried both OLS in pandas and statsmodels. Here is what I have in statsmodels:
import statsmodels.api as sm
endog = pd.DataFrame(dframe['monthly_data_smoothed8'])
smresults = sm.OLS(dframe['monthly_data_smoothed8'], dframe['date_delta']).fit()
sm_pred = smresults.predict(endog)
sm_pred
The length of the array returned is equal to the number of records in my original dataframe but the values are not the same. When I do the following using pandas I get no values returned.
from pandas.stats.api import ols
res1 = ols(y=dframe['monthly_data_smoothed8'], x=dframe['date_delta'])
res1.predict
(Note that there is no .fit function for OLS in Pandas) Could somebody shed some light on how I might get future predictions from my OLS model in either pandas or statsmodel-I realize I must not be using .predict properly and I've read the multiple other problems people have had but they do not seem to apply to my case.
edit I believe 'endog' as defined is incorrect-I should be passing the values for which I want to predict; therefore I've created a date range of 12 periods past the last recorded value. But still I miss something as I am getting the error:
matrices are not aligned
edit here is a snippet of data, the last column (in red) of numbers is the date delta which is a difference in months from the first date:
month monthly_data monthly_data_smoothed5 monthly_data_smoothed8 monthly_data_smoothed12 monthly_data_smoothed3 date_delta
0 2011-01-31 3.711838e+11 3.711838e+11 3.711838e+11 3.711838e+11 3.711838e+11 0.000000
1 2011-02-28 3.776706e+11 3.750759e+11 3.748327e+11 3.746975e+11 3.755084e+11 0.919937
2 2011-03-31 4.547079e+11 4.127964e+11 4.083554e+11 4.059256e+11 4.207653e+11 1.938438
3 2011-04-30 4.688370e+11 4.360748e+11 4.295531e+11 4.257843e+11 4.464035e+11 2.924085
df.to_dict
and paste that – Fastasmresults.summary()
to see how well the model is fit. – Vetter