Expected 2D array, got 1D array instead error

Asked 24/10, 2018 at 5:52 Answered 5/2, 2022 at 5:12

Solved python machine-learning data-science

I am getting the following error:

"ValueError: Expected 2D array, got 1D array instead: array=[ 45000. 50000. 60000. 80000. 110000. 150000. 200000. 300000. 500000. 1000000.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample."

while executing the following code:

# SVR

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('Position_S.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)

# Fitting SVR to the dataset
from sklearn.svm import SVR
regressor = SVR(kernel = 'rbf')
regressor.fit(X, y)

# Visualising the SVR results
plt.scatter(X, y, color = 'red')
plt.plot(X, regressor.predict(X), color = 'blue')
plt.title('Truth or Bluff (SVR)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

# Visualising the SVR results (for higher resolution and smoother curve)
X_grid = np.arange(min(X), max(X), 0.01)
X_grid = X_grid.reshape((len(X_grid), 1))
plt.scatter(X, y, color = 'red')
plt.plot(X_grid, regressor.predict(X_grid), color = 'blue')
plt.title('Truth or Bluff (SVR)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()

Lessen answered 24/10, 2018 at 5:52 Comment(2)

sklearn requires 2D input. Simply use fit(X[:,None], y) – Haemostasis 24/10, 2018 at 5:56

Thanks, ZislsNotZis – Lessen 24/10, 2018 at 6:27

It seems the array has the wrong number of dimensions. Could you try:

regressor = SVR(kernel='rbf')
regressor.fit(X.reshape(-1, 1), y)

Embayment answered 24/10, 2018 at 6:26 Comment(3)

Could you please explain what X.reshape(-1, 1) actually does? It also solves the problem. I actually solved it by changing the line to y = dataset.iloc[:, 2:3].values, even though the dataset has only 3 columns. – Lessen 25/10, 2018 at 2:20

According to the error message, your input data has the form [45000, 50000, 60000, ...], but the model expects the form [[45000], [50000], [60000], ...], i.e. a list of lists. So reshape(-1, 1) only changes the shape, turning the 1D array into a single column; the values stay the same. – Embayment 25/10, 2018 at 6:56

Note that calling reshape() directly on a pandas Series is deprecated (NumPy's reshape is not). Use df.values.reshape() instead. – Westbrooke 27/5, 2021 at 8:41
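To illustrate the reshape(-1, 1) call discussed in the comments above, here is a minimal sketch using only NumPy. The array holds the first few salary values from the error message. The variable names col and row are made up for this example.

```python
import numpy as np

# First few salary values from the error message in the question.
y = np.array([45000., 50000., 60000., 80000., 110000.])

print(y.shape)            # (5,)   one-dimensional

# -1 tells NumPy to infer that dimension from the array length.
col = y.reshape(-1, 1)    # one feature, many samples (a column)
row = y.reshape(1, -1)    # one sample, many features (a row)

print(col.shape)          # (5, 1)
print(row.shape)          # (1, 5)

# y[:, None] from the first comment adds a trailing axis the same way.
print(y[:, None].shape)   # (5, 1)
```

Neither call changes the data. Both return the same values arranged with a second axis, which is what scikit-learn estimators and transformers check for.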

The problem is that y.ndim is 1, while X.ndim is 2.

To solve this, y has to become a 2D array as well, so that y.ndim returns 2.

For this, use the reshape function from NumPy.

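import numpy as np
import pandas as pd

data = pd.read_csv("Position_Salaries.csv")
X = data.iloc[:, 1:2].values   # slice 1:2 keeps X two-dimensional
y = data.iloc[:, 2].values     # a single integer index gives a 1D array
y = np.reshape(y, (-1, 1))     # -1 infers the row count (10 for this dataset)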

This should resolve the dimension error. Apply the usual feature scaling after the code above.
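The ndim difference described above can be checked directly. This is a small sketch, assuming a three-column layout like the one in the question. The DataFrame and its column names and values are hypothetical stand-ins for the CSV file.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: three columns, as in the question.
data = pd.DataFrame({
    "Position": ["Analyst", "Consultant", "Manager"],
    "Level": [1, 2, 3],
    "Salary": [45000, 50000, 60000],
})

y_1d = data.iloc[:, 2].values     # integer index -> Series -> 1D array
y_2d = data.iloc[:, 2:3].values   # slice -> DataFrame -> 2D array

print(y_1d.ndim, y_1d.shape)      # 1 (3,)
print(y_2d.ndim, y_2d.shape)      # 2 (3, 1)
```

This is also why changing the line to iloc[:, 2:3], as mentioned in the comments, makes the error go away: a slice keeps the column axis even when it selects only one column.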


Architectonics answered 25/7, 2020 at 4:54 Comment(0)

from sklearn.preprocessing import StandardScaler  

# Create separate scalers for the independent (X) and dependent (y) variables
ss_X = StandardScaler()
ss_y = StandardScaler()

X = ss_X.fit_transform(X)
y = ss_y.fit_transform(y.reshape(-1,1))

After reshaping y, the scaling step works fine.
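For completeness, here is a self-contained sketch of the whole flow without the CSV file. The salaries are the values from the error message. The position levels 1 to 10 and the query level 6.5 are assumptions made for this example. The ravel() call is an addition not found in the answers: SVR expects a 1D target, so the scaled column is flattened again before fitting.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Assumed position levels 1..10; salaries are taken from the error message.
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.array([45000., 50000., 60000., 80000., 110000.,
              150000., 200000., 300000., 500000., 1000000.])

sc_X = StandardScaler()
sc_y = StandardScaler()
X_scaled = sc_X.fit_transform(X)
# StandardScaler needs a 2D array, so y is reshaped into one column.
y_scaled = sc_y.fit_transform(y.reshape(-1, 1))

regressor = SVR(kernel="rbf")
# SVR expects a 1D target; ravel() flattens the scaled column again.
regressor.fit(X_scaled, y_scaled.ravel())

# Predict for an assumed level of 6.5 and map back to the salary scale.
pred_scaled = regressor.predict(sc_X.transform([[6.5]]))
pred = sc_y.inverse_transform(pred_scaled.reshape(-1, 1))
print(pred.shape)   # (1, 1)
```

The same rule applies in both directions: anything passed to a scaler (fit_transform, transform, inverse_transform) must be 2D, while the target passed to SVR.fit is 1D.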

Statued answered 5/2, 2022 at 5:12 Comment(0)
