Does normalizing images by dividing by 255 leak information between train and test set?
I've seen division by 255 used many times as normalization in CNN tutorials online, and it is applied across the entire dataset before the train/test split.

I was under the impression that the test set should be normalized according to the mean/std/min-max etc. of the training set. By applying /255 across the whole dataset, it seems we are letting statistics of the test set influence the training data. Is that true?

What's the right approach here?

This:

x_train = (x_train - x_train_mean)/x_train_std
x_test = (x_test - x_test_mean)/x_test_std

or this:

x_train = (x_train - x_train_mean)/x_train_std
x_test = (x_test - x_train_mean)/x_train_std

or this:

data/255
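
For reference, by x_train_mean and x_train_std above I mean global scalar statistics over the whole training batch, e.g. with NumPy (made-up shapes, just to pin down what the variables mean):

import numpy as np

# hypothetical 8-bit image batch
x_train = np.random.randint(0, 256, size=(100, 28, 28)).astype(float)

x_train_mean = x_train.mean()  # one global mean over the whole training batch
x_train_std = x_train.std()    # one global std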

Thanks

I've been asked to provide background to what I've tried: This seems to be ungoogleable, I haven't found any discussion on it.

edit: Just another thought.

Because both train and test set are already on the same scale (ie. each pixel from 0-255) I assume that dividing by 255 doesn't make a difference, now they're on the same scale, but from 0-1.

Geomancer answered 26/4, 2019 at 1:41

Comment: It's most likely because the images are stored as 8-bit integers (values from 0 to 255), but we'd like them to be floats in the range 0 to 1. It's just a mathematical convenience. E.g. see stats.stackexchange.com/questions/305262/… and https://mcmap.net/q/738730/-normalizing-to-0-1-vs-1-1 – Sauerkraut

Your guess is correct: dividing an image by 255 simply rescales it from 0-255 to 0-1 (converting it from int to float also makes computation more convenient). Neither is strictly required, however. If you do zero-center the data, the mean must be computed on the training set alone, so that nothing about the test set leaks into training (http://cs231n.github.io/neural-networks-2/#datapre):

x_train = x_train - x_train_mean

x_test = x_test - x_train_mean  # note: the training-set mean, not the test-set mean

Moreover, you can use sklearn's Pipeline class (https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html) with its fit() and fit_transform() methods to simplify the process. If you're using Keras, there's a wrapper for it.
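
Here is a minimal sketch of that fit/transform split using sklearn's StandardScaler on flattened images (made-up data; note that StandardScaler standardizes each flattened pixel position separately, which is one common choice, rather than using one global mean/std):

import numpy as np
from sklearn.preprocessing import StandardScaler

# hypothetical 8-bit image batches
x_train = np.random.randint(0, 256, size=(100, 28, 28)).astype(float)
x_test = np.random.randint(0, 256, size=(20, 28, 28)).astype(float)

# StandardScaler expects 2-D input, so flatten each image to one row
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train.reshape(len(x_train), -1))  # fit on train only
x_test_scaled = scaler.transform(x_test.reshape(len(x_test), -1))         # reuse train statistics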

Undertrick answered 9/5, 2019 at 16:18

I will just speculate a bit.

The pixel values in a grayscale image are in [0, 255]. However, many images occupy only a narrow part of that range; for example, an image's values might all lie in [100, 150].

When you divide this image by 255.0, its range becomes roughly [0.4, 0.6]. When you instead compute (im - mean(im))/std(im), the range gets stretched out nicely.

I tested something very simple in Python.

import numpy as np
np.set_printoptions(precision=3)  # print rounded values, as shown below

def get_zero_mean_std(a):
    a = (a - np.mean(a)) / np.std(a)  # standardize: zero mean, unit std
    print(a)

get_zero_mean_std(np.array([3, 2, 1, 6]))

[ 0. -0.535 -1.069 1.604]

get_zero_mean_std(np.array([3, 2, 1, 15]))

[-0.397 -0.573 -0.749 1.719]

get_zero_mean_std(np.array([3, 2, 1, 3, 1, 2, 1, 1, 2]))

[ 1.556 0.283 -0.99 1.556 -0.99 0.283 -0.99 -0.99 0.283]

As you can see, it is putting the values in a nice range.

If I had normalized by 255.0 (or by the maximum value), the first three values of the second array would have been squeezed into a very narrow band while the last value sat far above them.

So, long story short, one reason might be that (im - mean(im))/std(im) is a better normalizer than regular division.
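
To make that concrete, here is a small sketch with a made-up narrow-range image, comparing the two normalizations:

import numpy as np
np.set_printoptions(precision=3)

im = np.array([100., 120., 135., 150.])  # hypothetical narrow-range image

print(im / 255.0)                       # [0.392 0.471 0.529 0.588] -- still squeezed together
print((im - np.mean(im)) / np.std(im))  # [-1.419 -0.338 0.473 1.284] -- nicely spread out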

Jumbled answered 31/10, 2020 at 16:1
