It seems all three functions can do simple linear regression, e.g.
scipy.stats.linregress(x, y)
numpy.polynomial.polynomial.polyfit(x, y, 1)
x = statsmodels.api.add_constant(x)
statsmodels.api.OLS(y, x).fit()
Is there any real difference between the three methods? I know that statsmodels is built on top of scipy, and scipy depends on numpy for many things, so I expect that they should not differ vastly, but the devil is always in the details.
More specifically, if we use the numpy method above, how do we get the p-value of the slope, which is given by default by the other two methods?
I am using them in Python 3, if that makes any difference.
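For reference, here is a minimal sketch of one way the slope's p-value could be recovered from the numpy fit, assuming the usual ordinary-least-squares setup (a two-sided t-test on the slope with n - 2 degrees of freedom). The helper name slope_p_value and the synthetic data are made up for illustration, and scipy.stats.t is still needed for the t-distribution, since numpy does not provide one.

```python
import numpy as np
from numpy.polynomial import polynomial as P
from scipy import stats


def slope_p_value(x, y):
    """Fit y = intercept + slope * x with numpy and return
    (slope, intercept, two-sided p-value of the slope).

    Hypothetical helper for illustration; assumes standard OLS conditions.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # numpy.polynomial.polynomial.polyfit returns coefficients
    # ordered from lowest degree to highest: [intercept, slope].
    intercept, slope = P.polyfit(x, y, 1)

    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    residuals = y - (intercept + slope * x)
    dof = n - 2
    resid_var = np.sum(residuals ** 2) / dof

    # Standard error of the slope and the corresponding t statistic.
    se_slope = np.sqrt(resid_var / np.sum((x - x.mean()) ** 2))
    t_stat = slope / se_slope

    # Two-sided p-value from the t-distribution survival function.
    p_value = 2.0 * stats.t.sf(abs(t_stat), dof)
    return slope, intercept, p_value


# Synthetic data (an assumption, only for the demonstration).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=5.0, size=x.size)

slope, intercept, p = slope_p_value(x, y)
ref = stats.linregress(x, y)  # reference values to compare against
```

On this data the slope, intercept and p-value should agree with what scipy.stats.linregress reports, up to floating-point error.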
np.polyfit is - it is not really designed for linear regression, but can instead fit a polynomial of arbitrary order to the relationship between x & y (whereas linregress can only fit a line). – Nazi
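A small sketch of the point made in this comment, using made-up noiseless quadratic data: polyfit with degree 2 recovers the curve, while linregress can only return the best straight line. It uses numpy.polynomial.polynomial.polyfit from the question (coefficients ordered lowest degree first) rather than the older np.polyfit, which orders them highest degree first.

```python
import numpy as np
from numpy.polynomial import polynomial as P
from scipy import stats

# Noiseless quadratic data (an assumption, chosen only for illustration).
x = np.linspace(-3.0, 3.0, 30)
y = 0.5 * x ** 2 - x + 2.0

# linregress fits a straight line only.
line = stats.linregress(x, y)
line_resid = y - (line.intercept + line.slope * x)

# polyfit accepts any degree; coefficients come back as [c0, c1, c2].
coefs = P.polyfit(x, y, 2)
quad_resid = y - P.polyval(x, coefs)

print("line fit, max |residual|:", np.max(np.abs(line_resid)))
print("quadratic fit, max |residual|:", np.max(np.abs(quad_resid)))
print("quadratic coefficients [c0, c1, c2]:", coefs)
```

The straight-line residuals stay large because the curvature cannot be captured, whereas the degree-2 residuals are at floating-point noise level.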