I'm experimenting to decide whether a time series (as in, a single list of floats) is correlated with itself. I've already had a play with the acf
function in statsmodels (http://statsmodels.sourceforge.net/devel/generated/statsmodels.tsa.stattools.acf.html), and now I'm looking at whether the Durbin–Watson statistic has any worth.
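For context, the acf experiment was just something like this sketch:

import numpy as np
from statsmodels.tsa.stattools import acf

data = np.arange(100).astype(float)
print(acf(data))  # autocorrelations at lags 0..nlags; values near 1 = strongly correlated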
It seems like this kind of thing should work:
from statsmodels.regression.linear_model import OLS
import numpy as np
data = np.arange(100) # this should be highly correlated
ols_res = OLS(data)
dw_res = np.sum(np.diff(ols_res.resid.values))
If you were to run this, you would get:
Traceback (most recent call last):
...
File "/usr/lib/pymodules/python2.7/statsmodels/regression/linear_model.py", line 165, in initialize
self.nobs = float(self.wexog.shape[0])
AttributeError: 'NoneType' object has no attribute 'shape'
It seems that D/W is usually used to compare two time series (e.g. http://connor-johnson.com/2014/02/18/linear-regression-with-python/) for correlation, so I think the problem is that I've not passed another time series to compare against. Perhaps that is supposed to be passed in the exog parameter to OLS?
exog : array-like
A nobs x k array where nobs is the number of observations and k is
the number of regressors.
(from http://statsmodels.sourceforge.net/devel/generated/statsmodels.regression.linear_model.OLS.html)
Side note: I'm not sure what a "nobs x k" array means. Maybe an array that is nobs by k?
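Putting my guesses together, here's the kind of thing I imagine, as a sketch only: I'm assuming a lone constant column counts as a legal exog (so shape (nobs, k) = (100, 1)), and I've swapped in what I understand the actual Durbin–Watson formula to be (sum of squared successive differences of the residuals over the sum of squared residuals, rather than my plain sum of diffs above):

import numpy as np
from statsmodels.regression.linear_model import OLS

data = np.arange(100).astype(float)
exog = np.ones((len(data), 1))  # shape (nobs, k): 100 observations, 1 regressor
ols_res = OLS(data, exog).fit()  # the .fit() matters: resid lives on the results, not the model

resid = ols_res.resid
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print(dw)  # comes out near 0 here, i.e. strong positive autocorrelation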
So what should I be doing here? Am I expected to pass the data twice, to lag it manually myself (sketched below), or something else entirely?
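For the record, the lagging option I have in mind looks roughly like this; I'm also guessing that the durbin_watson helper in statsmodels.stats.stattools is the intended way to consume the residuals:

import numpy as np
from statsmodels.regression.linear_model import OLS
from statsmodels.stats.stattools import durbin_watson

data = np.arange(100).astype(float)
ols_res = OLS(data[1:], data[:-1]).fit()  # regress the series on its own one-step lag
print(durbin_watson(ols_res.resid))  # ~2 = no autocorrelation, near 0 or 4 = strong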
Thanks!