How can I scale the test data with MinMaxScaler when I am loading a saved model?
I am building an autoencoder model. Before training and saving the model, I scaled the data using MinMaxScaler:

X_train = df.values
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

After this, I fitted the model and saved it as an 'h5' file. Now, when I give it test data after loading the saved model, the test data naturally has to be scaled as well.

So when I load the model and scale the test data using

X_test_scaled  = scaler.transform(X_test)

It gives the error

NotFittedError: This MinMaxScaler instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

So I ran X_test_scaled = scaler.fit_transform(X_test) (which I had a hunch was foolish). It did give a result after loading the saved model and testing, but the result was different from the one I got when I trained and tested in the same session. I have saved around 4000 models for my purpose, so I can't train and save them all again because it costs a lot of time. I want a way out.

Is there a way to scale the test data by transforming it the same way the training data was transformed (maybe by saving the scaling values, I do not know)? Or maybe a way to de-scale the model so that I can test it on non-scaled data?

If I under-emphasized or over-emphasized any point, please let me know in the comments!

Larvicide answered 28/2, 2019 at 7:44
X_test_scaled  = scaler.fit_transform(X_test)

will scale X_test given the minimum and maximum values of features in X_test and not X_train.

Your original code probably did not work because you did not save scaler after fitting it to X_train, or you overwrote it somehow (e.g., by re-initializing it). The error was thrown because the scaler in your test session had not been fitted to any data.

When you then call X_test_scaled = scaler.fit_transform(X_test), you are fitting scaler to X_test and simultaneously transforming X_test. That is why the code was able to run, but the step is incorrect, as you already surmised.
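To make the difference concrete, here is a minimal sketch on made-up toy data (the arrays below are hypothetical and not from the question). It compares transforming the test set with the scaler fitted on the training data against fitting a fresh scaler on the test set.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: one feature, training range 0..10, test range 2..4.
X_train = np.array([[0.0], [5.0], [10.0]])
X_test = np.array([[2.0], [4.0]])

scaler = MinMaxScaler()
scaler.fit(X_train)  # learns min=0 and max=10 from the training data

# Correct: reuse the min/max learned from the training data.
correct = scaler.transform(X_test)

# Incorrect: a fresh fit on the test set learns min=2 and max=4 instead.
wrong = MinMaxScaler().fit_transform(X_test)

print(correct.ravel())  # [0.2 0.4]
print(wrong.ravel())    # [0. 1.]
```

The same test values map to different inputs for the model, which would explain why the results differed from those obtained when training and testing in one session.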

What you want is

import pickle as pkl
from sklearn.preprocessing import MinMaxScaler

X_train = df.values
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Save the fitted scaler to disk
with open("scaler.pkl", "wb") as outfile:
    pkl.dump(scaler, outfile)

# Some other code for training your autoencoder
# ...

Then in your test script

# During test time
import pickle as pkl

# Load the scaler that was fitted on the training data
with open("scaler.pkl", "rb") as infile:
    scaler = pkl.load(infile)

X_test_scaled = scaler.transform(X_test)  # Note: transform, not fit_transform.

Note that you don't have to re-fit the scaler object after loading it back from disk. It contains all the information (the scaling factors etc.) obtained from the training data. You just call its transform method on X_test.
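For the models that were already trained without saving the scaler, one possible way out is to refit a new MinMaxScaler on the same training data, without retraining the model. This assumes the original training data is still available and unchanged. It works because MinMaxScaler only stores per-feature statistics computed from the data, so an identical fit gives an identical transform. The sketch below uses randomly generated data as a stand-in for df.values and checks both the pickle round trip and the refit.

```python
import pickle
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the original training and test data.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))
X_test = rng.normal(size=(10, 3))

# Scaler as it was fitted at training time.
original = MinMaxScaler().fit(X_train)

# Round trip through pickle (in memory here; a file behaves the same way).
restored = pickle.loads(pickle.dumps(original))

# Brand-new scaler refitted on the same training data.
refit = MinMaxScaler().fit(X_train)

expected = original.transform(X_test)
same_after_pickle = np.allclose(expected, restored.transform(X_test))
same_after_refit = np.allclose(expected, refit.transform(X_test))
print(same_after_pickle, same_after_refit)
```

If both checks hold for your data, the refitted scaler can be pickled once and reused for every saved model that was trained on that same data.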

Cadena answered 28/2, 2019 at 7:50