statsmodels add_constant for OLS intercept, what is this actually doing?
Reviewing linear regressions via statsmodels OLS fit, I see you have to use add_constant to add a constant '1' to all your points in the independent variable(s) before fitting. However, my only understanding of intercepts in this context is the value of y for our line when x equals 0, so I'm not clear what purpose always injecting a '1' here serves. What is this constant actually telling the OLS fit?

Parrott answered 31/12, 2016 at 2:8 Comment(0)
It doesn't add a constant to your values, it adds a constant term to the linear equation it is fitting. In the single-predictor case, it's the difference between fitting a line y = mx to your data vs fitting y = mx + b.
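To see this concretely: add_constant just prepends a column of ones to your design matrix, and the coefficient fitted for that column is b. Here is a minimal NumPy sketch (np.linalg.lstsq stands in for the OLS solver, so statsmodels isn't required; the data values are made up for illustration):

```python
import numpy as np

# Noise-free data satisfying y = 2x + 5 exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 5.0

# Without a constant: the design matrix is just x, so we fit y = m*x
m_only = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)[0][0]

# With a constant: a column of ones is added (what add_constant does),
# so the design matrix is [1, x] and the model becomes y = b + m*x
X = np.column_stack([np.ones_like(x), x])
b, m = np.linalg.lstsq(X, y, rcond=None)[0]

print(m_only)  # slope of the best line through the origin, not 2
print(b, m)    # recovers b = 5, m = 2
```

Without the ones column the fit is forced through the origin, so even on perfect data it cannot recover the true slope; with it, OLS recovers both b and m exactly.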

Paraclete answered 31/12, 2016 at 2:10 Comment(2)
so all the constant is doing is indicating there is a "b" in the equation? – Parrott
@TimLindsey: In essence, yes. It tells the model to fit a value for b as well as coefficients for your predictors. I've never really understood why statsmodels requires you to add this explicitly, since as described here you pretty much always want to do it unless you have a specific justification for not doing so. – Paraclete

statsmodels' sm.add_constant plays the same role as the fit_intercept parameter (which defaults to True) in scikit-learn's LinearRegression().

If you don't do sm.add_constant, or if you do LinearRegression(fit_intercept=False), both algorithms assume that b = 0 in y = mx + b. Therefore, they will fit the model using b = 0 instead of estimating what b should be from your data.
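One observable consequence of forcing b = 0: with an intercept, OLS residuals always sum to (numerically) zero, but without one they generally don't when the data has a nonzero intercept. A small NumPy sketch with made-up data (lstsq stands in for both libraries' solvers):

```python
import numpy as np

# Made-up data roughly following y = 2x + 8
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.2, 12.1, 13.9, 16.1, 18.0])

# b forced to 0 (like skipping add_constant / fit_intercept=False)
m0 = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)[0][0]
resid0 = y - m0 * x

# b estimated from the data (like add_constant / fit_intercept=True)
b, m = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
resid = y - (b + m * x)

print(resid.sum())   # ~0: the intercept absorbs the mean offset
print(resid0.sum())  # clearly nonzero: the line is pinned to the origin
```

This is why omitting the constant on data with a real offset biases the slope: the model has no way to absorb the offset except by tilting the line.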

Ingravescent answered 13/4, 2017 at 16:24 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.