Batch normalization instead of input normalization
Can I use batch normalization layer right after input layer and not normalize my data? May I expect to get similar effect/performance?

In keras functional it would be something like this:

inputs = Input(...)
x = BatchNormalization()(inputs)
...
Arose answered 16/10, 2017 at 13:48 Comment(0)
You can do it. But the nice thing about batchnorm, in addition to stabilizing the activation distributions, is that the mean and standard deviation are likely to migrate as the network learns.

Effectively, setting a batchnorm right after the input layer is a fancy data-preprocessing step. It helps, sometimes a lot (e.g. in linear regression). But it's easier and more efficient to compute the mean and variance of the whole training sample once than to learn them per batch. Note also that batchnorm isn't free in terms of performance, so you shouldn't abuse it.
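The trade-off can be sketched in plain NumPy (the data and shapes here are made up for illustration): computing dataset-wide statistics once gives stable normalization, while a batchnorm at the input has to re-estimate those statistics from every batch, which is noisier.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))  # stand-in training data

# Precompute dataset-wide statistics once (the StandardScaler approach)
mean, std = X.mean(axis=0), X.std(axis=0)
X_norm = (X - mean) / std  # zero mean, unit variance over the whole set

# A batchnorm placed at the input instead estimates statistics per batch;
# for a small batch these estimates deviate from the dataset-wide values
batch = X[:32]
batch_norm = (batch - batch.mean(axis=0)) / batch.std(axis=0)
```

The per-batch estimates drift around the true dataset statistics, which is exactly the noise the answer above alludes to.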


Holton answered 16/10, 2017 at 14:0 Comment(5)
What about standardizing the entire dataset in advance (e.g. with StandardScaler) and using batch normalization as well? – Clearance
Does the BN layer always need to be added AFTER the Input layer if one wants to standardize inputs? What happens if I add the BN before the Input layer? – Infinite
@Infinite Check out kaggle.com/ryanholbrook/… – you can add BN right after the input and it acts as sklearn.StandardScaler(). – Impearl
You should keep in mind that the mean and std for the train and val sets should be calculated separately. – Alasteir
@КонстантинПисаный Are you sure? I would think one would want to use the train mean/std for val, or perhaps an updated mean/std that includes both train and val samples. Thinking in terms of single-sample batches having zero standard deviation: it makes sense why you can't have these for training, but it seems odd that one should need multiple samples just to validate. – Rebekahrebekkah
Yes, this is possible, and I have used it very successfully for vision models. There are some pros and cons to this approach, though. The main advantages are:

  1. You can’t forget the normalization step when deploying the model to production, since it’s part of the model itself (this happens more often than you think).
  2. The normalization is data-augmentation aware this way.

The main drawbacks are:

  1. Added runtime cost if normalized inputs are already available anyway.

I’ve also written about this subject in detail here: Replace Manual Normalization with Batch Normalization in Vision AI Models. https://towardsdatascience.com/replace-manual-normalization-with-batch-normalization-in-vision-ai-models-e7782e82193c
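The "data-augmentation aware" point above can be illustrated in plain NumPy (the constant brightness shift is a made-up augmentation): statistics precomputed on the clean training set no longer center an augmented batch, whereas batch statistics adapt to whatever actually flows into the network.

```python
import numpy as np

rng = np.random.default_rng(1)
images = rng.uniform(0.0, 1.0, size=(256, 8, 8))  # stand-in image data

# Statistics precomputed on the clean (un-augmented) training set
mean, std = images.mean(), images.std()

# Augmentation (a hypothetical brightness shift) changes the distribution,
# so the precomputed stats no longer center the batch at zero...
augmented = images[:32] + 0.3
off_center = ((augmented - mean) / std).mean()

# ...while batch statistics adapt to the augmented inputs automatically
centered = ((augmented - augmented.mean()) / augmented.std()).mean()
```

Here `off_center` ends up roughly one standard deviation away from zero, while `centered` stays at zero regardless of the augmentation.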

Lallans answered 10/7, 2023 at 15:48 Comment(0)
