How to set the intercept to 0 with statsmodels for multiple linear regression

There was a post on this a few years ago, "Specifying a Constant in Statsmodels Linear Regression?", but it only included a quick fix.

The quick fix was to run the regression once, subtract the intercept, and then run it again. That is tedious if you are running the model over and over.

I would think you could pass a parameter telling it to set the intercept to zero. I am also open to using stats packages other than statsmodels.

Syrupy answered 18/1, 2019 at 15:43 Comment(1)
Note that the referenced question was about a fixed, non-zero constant, which is more difficult to do than just leaving out the intercept completely. - Hatred

It depends on which API you use. If you are using statsmodels.api, you need to add the constant to the model explicitly by adding a column of 1s to exog. If you don't, the model has no intercept.

import pandas as pd
import statsmodels.formula.api as smf
import statsmodels.api as sm

df = pd.DataFrame({'x': range(0, 10)}).assign(y=lambda d: d.x + 8)

# Fit y = B*x, no intercept
res1 = sm.OLS(endog=df.y, exog=df.x).fit()
print(res1.summary().tables[1])
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x              2.2632      0.269      8.421      0.000       1.655       2.871
==============================================================================


# Fit y = B*x + C, by adding a column of ones
res2 = sm.OLS(endog=df.y, exog=df[['x']].assign(intercept=1)).fit()
print(res2.summary().tables[1])
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x              1.0000   8.64e-16   1.16e+15      0.000       1.000       1.000
intercept      8.0000   4.61e-15   1.73e+15      0.000       8.000       8.000
==============================================================================

If instead you are using the formula API (statsmodels.formula.api, imported here as smf), add -1 to the Patsy formula to remove the constant. Otherwise the Intercept is included by default.

# Fit y = B*x, no intercept
res3 = smf.ols('y ~ x - 1', data=df).fit()
print(res3.summary().tables[1])
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x              2.2632      0.269      8.421      0.000       1.655       2.871
==============================================================================

# Default includes the constant
res4 = smf.ols('y ~ x', data=df).fit()
print(res4.summary().tables[1])
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      8.0000   2.72e-15   2.94e+15      0.000       8.000       8.000
x              1.0000   5.09e-16   1.96e+15      0.000       1.000       1.000
==============================================================================
Gaff answered 18/1, 2019 at 16:2 Comment(0)
