Do I need to add a constant when using sm.OLS?

I am running an OLS regression on two data sets, Y and X, using statsmodels.api.OLS. However, I get very different results depending on whether I add a constant to X first. Here is the code:

import statsmodels.api as sm
import numpy as np

mess = "SELECT .... FROM... WHERE ...."
data = np.array(db.extractData(mess))
Y = data[:, 0]  # first column: dependent variable
X = data[:, 1]  # second column: regressor
# Option 1: no constant
res = sm.OLS(Y, X).fit().rsquared  # returns 0.76
# Option 2: with a constant
X = sm.add_constant(X)
res = sm.OLS(Y, X).fit().rsquared  # returns 0.06

Considering the massive difference whether or not I add the constant, I assume that I am doing something wrong. Thanks very much for your time.

Carob answered 17/5, 2015 at 10:54

You need to add the constant. From the documentation (http://www.statsmodels.org/devel/generated/statsmodels.regression.linear_model.OLS.html):

An intercept is not included by default and should be added by the user. See statsmodels.tools.add_constant.

Lector answered 24/10, 2017 at 4:10
