Sklearn Linear Regression - "IndexError: tuple index out of range"

I have a ".dat" file that stores values of X and Y, so loading it gives an array of shape (n,2), where n is the number of rows.

import numpy as np
import matplotlib.pyplot as plt
import scipy.interpolate as interp
from sklearn import linear_model

in_file = open(path,"r")
text = np.loadtxt(in_file)
in_file.close()
x = np.array(text[:,0])
y = np.array(text[:,1])

I created an instance of linear_model.LinearRegression(), but when I invoke the .fit(x,y) method I get

IndexError: tuple index out of range

regr = linear_model.LinearRegression()
regr.fit(x,y)

What did I do wrong?

Lac answered 24/11, 2014 at 14:25 Comment(6)
Sorry, I completely misread your question :( I've deleted the answer; if I can find a fix, I'll undelete the edited answer. But can you provide more information, such as your full code? – Selffertilization
This is the code you need, there is nothing else important. – Lac
Really? What's linear_model? How did you get it? – Selffertilization
That's really all now, thanks for the help. – Lac
Are x and y of the same length? – Meaty
text.shape is (n,2), so x and y both have shape (n,). – Lac

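LinearRegression expects X to be a two-dimensional array of shape (n_samples, n_features), and internally it reads X.shape[1] to initialize an np.ones array. A one-dimensional x has shape (n,), so X.shape[1] does not exist, which raises the IndexError. Converting x to an n×1 array does the trick. So, replace: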

regr.fit(x,y)

by:

regr.fit(x[:,np.newaxis],y)

This will fix the problem. Demo:

>>> import numpy as np
>>> from sklearn import datasets
>>> from sklearn import linear_model
>>> clf = linear_model.LinearRegression()
>>> iris=datasets.load_iris()
>>> X=iris.data[:,3]
>>> Y=iris.target
>>> clf.fit(X,Y)  # This will throw an error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/sklearn/linear_model/base.py", line 363, in fit
    X, y, self.fit_intercept, self.normalize, self.copy_X)
  File "/usr/lib/python2.7/dist-packages/sklearn/linear_model/base.py", line 103, in center_data
    X_std = np.ones(X.shape[1])
IndexError: tuple index out of range
>>> clf.fit(X[:,np.newaxis],Y)  # This will work properly
LinearRegression(copy_X=True, fit_intercept=True, normalize=False)
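The shape issue behind the traceback can be reproduced with NumPy alone. The following is a minimal sketch on a made-up four-element array (the values and variable names are illustrative assumptions, not the asker's data): indexing shape[1] on a 1-D array raises the same IndexError, and both x[:, np.newaxis] and x.reshape(-1, 1) produce the n×1 column that fit() needs.

```python
import numpy as np

# Hypothetical 1-D data standing in for the x column read from the file
x = np.array([1.0, 2.0, 3.0, 4.0])

# A 1-D array has a one-element shape tuple, so shape[1] does not exist
try:
    x.shape[1]
    message = None
except IndexError as exc:
    message = str(exc)

# Two equivalent ways to turn the 1-D array into an (n, 1) column
col_a = x[:, np.newaxis]
col_b = x.reshape(-1, 1)

print(x.shape, col_a.shape, col_b.shape)
print(message)
```

Either form should work as the first argument to fit(); reshape(-1, 1) is just another spelling of the same conversion.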

To plot the regression line, use the code below:

>>> from matplotlib import pyplot as plt
>>> plt.scatter(X, Y, color='red')
<matplotlib.collections.PathCollection object at 0x7f76640e97d0>
>>> plt.plot(X, clf.predict(X[:,np.newaxis]), color='blue')
<matplotlib.lines.Line2D object at 0x7f7663f9eb90>
>>> plt.show()

[Figure: scatter plot of the data (red) with the fitted regression line (blue)]

Josselyn answered 24/11, 2014 at 16:34 Comment(2)
Thank you very much for the help! Another question: is it normal that I now get only one coefficient from the linear regression? How can I plot its line? – Lac
@JackLametta, it's absolutely normal. With a single feature there is one coefficient (the slope), and together with the intercept it is used to predict the Y value for a given X value. I've added the code to plot the line. – Josselyn
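To make the single coefficient concrete, here is a small sketch on synthetic, noise-free data (y = 2x + 1 is an assumption chosen for illustration, not the asker's file). With one feature, coef_ holds one slope, intercept_ holds the offset, and the line can be rebuilt by hand; it should agree with predict().

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (illustrative assumption): an exact line y = 2x + 1
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0

regr = LinearRegression()
regr.fit(x.reshape(-1, 1), y)  # X must be 2-D: (n_samples, n_features)

slope = regr.coef_[0]        # one feature, so one coefficient
intercept = regr.intercept_  # the constant term of the fitted line

# Rebuild the regression line manually from slope and intercept
line = slope * x + intercept
print(slope, intercept)
```

The values in line are the same ones predict() returns for x.reshape(-1, 1), so either can be passed to plt.plot to draw the fitted line.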
