Python: How to evaluate the residuals in StatsModels?

I want to compute the residuals, y - ŷ (observed values minus fitted values).

I know how to do that:

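import pandas as pd
import patsy as ps
import statsmodels.api as sm

# whitespace-delimited file without a header row
df = pd.read_csv('myFile', sep=r'\s+', header=None)
df.columns = ['column1', 'column2']
y, X = ps.dmatrices('column1 ~ column2', data=df, return_type='dataframe')
model = sm.OLS(y, X)
results = model.fit()
predictedValues = results.predict()  # in-sample fitted values, shape (n,)
# print(predictedValues)
# as_matrix() was removed in pandas 1.0 and returned an (n, 1) array here,
# which broadcasts to (n, n) in the subtraction; a 1-D array avoids that
yData = df['column1'].to_numpy()
res = yData - predictedValues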

Is there a built-in method or attribute that does this?

Varian answered 15/2, 2016 at 19:2

The residuals are stored in the resid attribute of the results object, i.e. results.resid.

Likewise, there is a results.fittedvalues attribute (not a method), so you don't need to call results.predict().

Gravitation answered 15/2, 2016 at 19:11
Comment by Acerose: This does not (currently) work with the RegularizedResults class.

If you are looking for a variety of (scaled) residuals such as externally/internally studentized residuals, PRESS residuals and others, take a look at the OLSInfluence class within statsmodels.

Using the results (a RegressionResults object) from your fit, you instantiate an OLSInfluence object that will have all of these properties computed for you. Here's a short example:

import statsmodels.api as sm
from statsmodels.stats.outliers_influence import OLSInfluence

# load_pandas() is used because newer statsmodels releases no longer
# accept load(as_pandas=False)
data = sm.datasets.spector.load_pandas()
X = data.exog
y = data.endog

# fit the model
model = sm.OLS(y, sm.add_constant(X, prepend=False))
fit = model.fit()

# compute the residuals and other metrics
influence = OLSInfluence(fit)
Subdivide answered 12/8, 2020 at 19:31

Normality of the residuals

Option 1: Jarque-Bera test

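import statsmodels.stats.api as sms
from statsmodels.compat import lzip
# results is a fitted model, e.g. results = sm.OLS(y, X).fit()
name = ['Jarque-Bera', 'Chi^2 two-tail prob.', 'Skew', 'Kurtosis']
test = sms.jarque_bera(results.resid)
lzip(name, test)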

Out:

[('Jarque-Bera', 3.3936080248431666),
 ('Chi^2 two-tail prob.', 0.1832683123166337),
 ('Skew', -0.48658034311223375),
 ('Kurtosis', 3.003417757881633)]
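
To show where these four numbers come from, here is a rough sketch that recomputes the Jarque-Bera statistic from its definition, JB = n/6 * (S^2 + (K - 3)^2 / 4), and the p-value from a chi-squared distribution with 2 degrees of freedom. The residuals are synthetic stand-ins for results.resid, and the variable names are illustrative.

```python
import numpy as np
import statsmodels.stats.api as sms

# Synthetic stand-in for results.resid
rng = np.random.default_rng(2)
resid = rng.normal(size=200)

jb, jb_pvalue, skew, kurtosis = sms.jarque_bera(resid)

# Recompute from sample moments (biased estimators, dividing by n)
n = resid.shape[0]
centered = resid - resid.mean()
m2 = np.mean(centered**2)
skew_manual = np.mean(centered**3) / m2**1.5
kurt_manual = np.mean(centered**4) / m2**2      # non-excess kurtosis, 3 for a normal
jb_manual = n / 6.0 * (skew_manual**2 + (kurt_manual - 3.0)**2 / 4.0)

# Chi-squared survival function with 2 degrees of freedom is exp(-x / 2)
pvalue_manual = np.exp(-jb_manual / 2.0)

print(np.isclose(jb, jb_manual), np.isclose(jb_pvalue, pvalue_manual))
```

The same relation holds for the output above: exp(-3.3936 / 2) is approximately 0.1833, the reported two-tail probability.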

Option 2: Omni test

name = ['Chi^2', 'Two-tail probability']
test = sms.omni_normtest(results.resid)
lzip(name, test)

Out:

[('Chi^2', 3.713437811597181), ('Two-tail probability', 0.15618424580304824)]
Sleeve answered 4/5, 2020 at 9:6
