ValueError: operands could not be broadcast together with shapes - inverse_transform - Python
I know this ValueError has been asked about many times, but I am still struggling to find an answer because my case involves inverse_transform.

Say I have an array a

a.shape
> (100,20)

and another array b

b.shape
> (100,3)

I concatenate them with np.concatenate:

hat = np.concatenate((a, b), axis=1)

Now the shape of hat is

hat.shape
> (100,23)

After this, I tried to do this,

inversed_hat = scaler.inverse_transform(hat)

When I do this, I get an error:

ValueError: operands could not be broadcast together with shapes (100,23) (25,) (100,23)

Is this a broadcasting error inside inverse_transform? Any suggestions would be helpful. Thanks in advance!

Pirandello answered 23/8, 2017 at 18:29 Comment(0)
A
9

Although you didn't specify, I'm assuming you are using inverse_transform() from one of scikit-learn's scalers, such as StandardScaler or MinMaxScaler (the example below uses MinMaxScaler). You need to fit the scaler on the data first.

import numpy as np
from sklearn.preprocessing import MinMaxScaler


In [1]: arr_a = np.random.randn(5*3).reshape((5, 3))

In [2]: arr_b = np.random.randn(5*2).reshape((5, 2))

In [3]: arr = np.concatenate((arr_a, arr_b), axis=1)

In [4]: scaler = MinMaxScaler(feature_range=(0, 1)).fit(arr)

In [5]: scaler.inverse_transform(arr)
Out[5]:
array([[ 0.19981115,  0.34855509, -1.02999482, -1.61848816, -0.26005923],
       [-0.81813499,  0.09873672,  1.53824716, -0.61643731, -0.70210801],
       [-0.45077786,  0.31584348,  0.98219019, -1.51364126,  0.69791054],
       [ 0.43664741, -0.16763207, -0.26148908, -2.13395823,  0.48079204],
       [-0.37367434, -0.16067958, -3.20451107, -0.76465428,  1.09761543]])

In [6]: new_arr = scaler.inverse_transform(arr)

In [7]: new_arr.shape == arr.shape
Out[7]: True
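
The session above calls inverse_transform() on the unscaled array. A fuller round trip might look like the following sketch, which assumes a small random array (the seed and shape are arbitrary choices, not taken from the question) and checks that scaling followed by the inverse recovers the input:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)   # fixed seed, chosen only for repeatability
arr = rng.normal(size=(5, 5))    # stand-in for the concatenated array

# fit() learns the per-column minimum and maximum.
scaler = MinMaxScaler(feature_range=(0, 1)).fit(arr)

# transform() maps every column into [0, 1].
scaled = scaler.transform(arr)

# inverse_transform() maps the scaled values back to the original units.
restored = scaler.inverse_transform(scaled)

print(scaled.min(axis=0), scaled.max(axis=0))
print(np.allclose(restored, arr))
```

The key point is that the array passed to inverse_transform() has the same number of columns as the array passed to fit().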
Atrocity answered 23/8, 2017 at 18:52 Comment(7)
Thank you for your response. I should have mentioned that I used MinMaxScaler, for example: scaler = MinMaxScaler(feature_range=(0, 1)). - Pirandello
I tried your answer. It works when I use fit, but when I use fit_transform it gives an error: AttributeError: 'numpy.ndarray' object has no attribute 'inverse_transform'. Do you know why this is happening? I am looking into it. - Pirandello
Yes, fit_transform() returns the transformed dataset, while fit() returns the fitted scaler object, on which you can call other methods. If you read the docs, you can see that fit() returns the scaler itself, whereas fit_transform() returns a NumPy array. - Atrocity
Thank you, that is helpful! I am having an issue similar to this one datascience.stackexchange.com/questions/22488/… and this one tutel.me/c/programming/questions/42997228/… - Pirandello
@Jesse Yes, in both of those questions you need to do something like scaler = MinMaxScaler().fit(dataset), then scale your dataset with scaled_data = scaler.transform(dataset), and at the end, when you want to do an inverse_transform, call scaler.inverse_transform(inv_yhat). - Atrocity
Let us continue this discussion in chat. - Atrocity
Why would you use an array with the same shape as the input for the inverse transformation? It should have the number of features selected. This does not make sense at all. Inverse transform should project back from PCA space to original space... - Mohun
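
The fit() versus fit_transform() distinction from the comments can be sketched as follows. The dataset here is a hypothetical 4x3 array made up for illustration; only the scikit-learn calls themselves come from the thread:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

dataset = np.arange(12, dtype=float).reshape(4, 3)  # hypothetical data

# fit() returns the fitted scaler itself, so the result has inverse_transform().
scaler = MinMaxScaler().fit(dataset)
print(type(scaler).__name__)

# fit_transform() returns the scaled data as a NumPy array, not a scaler.
scaled_data = MinMaxScaler().fit_transform(dataset)
print(type(scaled_data).__name__)
print(hasattr(scaled_data, "inverse_transform"))

# Keep a reference to the scaler and call inverse_transform() on that object.
restored = scaler.inverse_transform(scaler.transform(dataset))
```

Assigning the result of fit_transform() to a variable named scaler is what would lead to the AttributeError quoted in the comments.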
L
0

The problem here is that the scaler holds the statistics of your 25-column df, but your df now has 23 columns, so it cannot apply the inverse transform.

To fix the problem, fit the scaler on the original 23-column dataframe, and then call inverse_transform on the 23-column dataframe you want to convert back.

More info: the scaler object keeps track of the information needed to perform the inverse transformation. When you fit a scaler to a dataset using the fit() method, the scaler computes the statistics of the data (such as mean and variance for StandardScaler, or minimum and maximum for MinMaxScaler) and stores them in its internal state.
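
To make the stored state concrete, here is a minimal sketch that assumes the scaler was fit on a 25-column array and is then given a 23-column one, as in the question. The random arrays are placeholders, not the asker's data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
full = rng.normal(size=(100, 25))   # hypothetical 25-column data the scaler saw
hat = rng.normal(size=(100, 23))    # 23-column array, as in the question

scaler = MinMaxScaler(feature_range=(0, 1)).fit(full)

# The scaler stores one statistic per fitted column.
print(scaler.data_min_.shape, scaler.data_max_.shape, scaler.scale_.shape)

error = None
try:
    scaler.inverse_transform(hat)   # 23 columns against 25 stored statistics
except ValueError as exc:
    error = exc
    print("ValueError:", exc)
```

The per-column statistics have length 25, which is where the (25,) in the error message comes from.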

Lowell answered 20/3, 2023 at 5:26 Comment(0)
M
-1

It seems you are using a pre-fit scaler object from sklearn.preprocessing. If so, I believe the data used for fitting had shape (x, 25) whereas your data has shape (x, 23), and that is why you are getting this error.
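
If refitting is not an option, one possible workaround (not something the answers here propose) is to invert only the columns you have, using the scaler's stored min_ and scale_. The sketch below assumes, purely for illustration, that the 23 columns are the first 23 of the 25 fitted columns; the real mapping depends on how hat was built:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
full = rng.normal(size=(100, 25))       # hypothetical data used to fit the scaler
scaler = MinMaxScaler().fit(full)

cols = np.arange(23)                    # ASSUMPTION: hat holds the first 23 fitted columns
hat = scaler.transform(full)[:, cols]   # scaled 23-column array standing in for hat

# Compare the fitted column count with the array's column count before inverting.
n_fitted = scaler.scale_.shape[0]
print(n_fitted, hat.shape[1])

# MinMaxScaler computes X_scaled = X * scale_ + min_, so the inverse can be
# applied column-wise with only the statistics of the selected columns.
inversed_hat = (hat - scaler.min_[cols]) / scaler.scale_[cols]
```

Checking the column counts first turns the broadcasting error into an explicit, easier-to-read condition.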

Madly answered 1/5, 2019 at 5:12 Comment(0)
