Python pandas has no attribute ols - Error (rolling OLS)

For my evaluation, I wanted to run a rolling OLS regression with a window of 1000 observations on the dataset found at this URL: https://drive.google.com/open?id=0B2Iv8dfU4fTUa3dPYW5tejA0bzg using the following Python script.

#!/usr/bin/python -tt

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.formula.api import ols

df = pd.read_csv('estimated.csv', names=('x','y'))

model = pd.stats.ols.MovingOLS(y=df.y, x=df[['x']],
                               window_type='rolling', window=1000, intercept=True)
df['Y_hat'] = model.y_predict

However, when I run the script, I get this error: AttributeError: module 'pandas.stats' has no attribute 'ols'. Could this error be caused by the version I am using? The pandas installed on my Linux node is version 0.20.2.

Meghanmeghann answered 22/6, 2017 at 18:58 Comment(13)
What happens with from pandas.stats import ols? - Geotaxis
It says ImportError: cannot import name 'ols'. - Meghanmeghann
What do you get with print(dir(pd.stats))? I'm not at my laptop at the moment; I'll be back home soon to test it myself. Is it in the list? - Geotaxis
This is what I get: ['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'api', 'moments']. It seems ols is not listed. - Meghanmeghann
What did you call your script? - Geotaxis
I called it reg_test.py. - Meghanmeghann
Interesting. I'm on version 0.17 and it's there in dir. I guess they restructured. - Geotaxis
Very strange. Let me see if there is a way to install it with the latest version of pandas; otherwise I will roll back my current version to 0.17, which I think is not a good idea. - Meghanmeghann
It possibly comes with another module now. I'm looking into it too. Don't roll back; it's just a shame that the docs don't make it explicit where it's imported from now. - Geotaxis
What about from statsmodels.regression.linear_model import OLS, taken from statsmodels.org/dev/importpaths.html#import-examples? - Tyrocidine
@downshift you might be on the right lines here. It looks like it was deprecated in pandas: github.com/pandas-dev/pandas/… - Geotaxis
Right, I think since v0.20.0: github.com/pandas-dev/pandas/blob/… - Tyrocidine
@downshift then you have the answer :) Quite a substantial change there! - Geotaxis

pd.stats.ols.MovingOLS was removed in pandas version 0.20.0.

http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#whatsnew-0200-prior-deprecations

https://github.com/pandas-dev/pandas/pull/11898

I can't find an 'off the shelf' solution for what should be such an obvious use case as rolling regressions.

The following should do the trick without investing too much time in a more elegant solution. It uses numpy to calculate the predicted value of the regression based on the regression parameters and the X values in the rolling window.

window = 1000
a = np.array([np.nan] * len(df))
b = [np.nan] * len(df)  # If betas required.
y_ = df.y.values
x_ = df[['x']].assign(constant=1).values
for n in range(window, len(df)):
    y = y_[(n - window):n]
    X = x_[(n - window):n]
    # betas = Inverse(X'.X).X'.y
    betas = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
    y_hat = betas.dot(x_[n, :])
    a[n] = y_hat
    b[n] = betas.tolist()  # If betas required.

The code above is equivalent to the following (which only runs on pandas versions before 0.20.0) and is about 35% faster:

model = pd.stats.ols.MovingOLS(y=df.y, x=df.x, window_type='rolling', window=1000, intercept=True)
y_pandas = model.y_predict
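
As a hedged sketch, the numpy loop above can be wrapped in a reusable function. The name rolling_ols_predict and the synthetic data are assumptions made for illustration (they stand in for estimated.csv), and np.linalg.lstsq is swapped in for the explicit matrix inverse because it tends to behave better when a window is nearly collinear. The logic is otherwise the same: fit on the `window` rows before row n, then predict row n.

```python
import numpy as np
import pandas as pd


def rolling_ols_predict(df, window, y_col="y", x_col="x"):
    """One-step-ahead rolling OLS (hypothetical helper, not a pandas API)."""
    y = df[y_col].to_numpy(dtype=float)
    # Design matrix: the regressor plus a column of ones for the intercept.
    X = np.column_stack([df[x_col].to_numpy(dtype=float), np.ones(len(df))])
    y_hat = np.full(len(df), np.nan)
    betas = np.full((len(df), 2), np.nan)
    for n in range(window, len(df)):
        # Least-squares fit on rows n-window .. n-1.
        coef, *_ = np.linalg.lstsq(X[n - window:n], y[n - window:n], rcond=None)
        betas[n] = coef
        # Predict row n with the coefficients fitted on the preceding window.
        y_hat[n] = X[n] @ coef
    return (
        pd.Series(y_hat, index=df.index, name="y_hat"),
        pd.DataFrame(betas, index=df.index, columns=[x_col, "constant"]),
    )


# Synthetic, noise-free data (an assumption; not the original dataset).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
df = pd.DataFrame({"x": x, "y": 3.0 * x + 0.5})

y_hat, betas = rolling_ols_predict(df, window=50)
```

The first `window` entries of y_hat stay NaN because no full window precedes them. With the real data, the call would be rolling_ols_predict(df, window=1000).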
Dennie answered 22/6, 2017 at 20:0 Comment(6)
Yes, that's right, as I have learned from the comments above. So, do you have any idea how we can use it with the latest version of pandas? - Meghanmeghann
@DestaHaileselassieHagos What results do you want from the rolling regression (e.g. slope, intercept, predicted value)? - Dennie
@Alexander, for example the predicted value. Thanks! - Meghanmeghann
I actually rolled back my pandas version to 0.18.0 and ols is working now. Thank you so much! - Meghanmeghann
@DestaHaileselassieHagos does the old package have any statistical feature that allows you to calculate the significance of the coefficients? - Goodoh
@Lost1, no, it doesn't. - Meghanmeghann

pd.stats.ols was deprecated in favor of statsmodels.

See the statsmodels rolling regression examples for how to use it.

Sianna answered 3/12, 2020 at 2:25 Comment(0)
