BatchNorm2d in PyTorch - why pass the number of channels to batchnorm?

Why do I need to pass the number of channels of the previous layer to the batchnorm? The batchnorm should normalize over each datapoint in the batch, so why does it need the number of channels?

Tuscany answered 27/5, 2020 at 11:12

Batch normalisation has learnable parameters, because it includes an affine transformation.

From the documentation of nn.BatchNorm2d:

y = (x - E[x]) / √(Var[x] + ε) * γ + β

The mean and standard-deviation are calculated per-dimension over the mini-batches and γ and β are learnable parameter vectors of size C (where C is the input size). By default, the elements of γ are set to 1 and the elements of β are set to 0.

Since the norm is calculated per channel, the parameters γ and β are vectors of size num_channels (one element per channel), which results in an individual scale and shift per channel. As with any other learnable parameter in PyTorch, they need to be created with a fixed size, hence you need to specify the number of channels:

import torch.nn as nn

batch_norm = nn.BatchNorm2d(10)

# γ
batch_norm.weight.size()
# => torch.Size([10])

# β
batch_norm.bias.size()
# => torch.Size([10])
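
For a hands-on check, here is a minimal sketch (arbitrary tensor shapes, default settings) that applies the formula above by hand, with one mean and variance per channel, and compares the result to nn.BatchNorm2d in training mode:

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 5, 5)  # (batch, channels, height, width)

bn = nn.BatchNorm2d(3)  # freshly created, so γ = 1 and β = 0
out = bn(x)  # training mode by default, i.e. batch statistics are used

# Manual computation: one mean/variance per channel, taken over the
# batch, height and width dimensions (0, 2, 3)
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + bn.eps)
# γ and β have shape (C,) and are broadcast to (1, C, 1, 1)
manual = manual * bn.weight.view(1, -1, 1, 1) + bn.bias.view(1, -1, 1, 1)

print(torch.allclose(out, manual, atol=1e-6))
# => True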

Note: Setting affine=False creates no learnable parameters, in which case the number of channels wouldn't be needed for them, but the argument is still required in order to keep the interface consistent.
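
As a small illustrative check of that (with track_running_stats left at its default of True):

import torch.nn as nn

bn_no_affine = nn.BatchNorm2d(10, affine=False)

# No learnable γ/β are created
print(bn_no_affine.weight)  # => None
print(bn_no_affine.bias)    # => None

# The running-statistics buffers still hold one entry per channel, which is
# another reason num_features is required (see the comment below)
print(bn_no_affine.running_mean.size())
# => torch.Size([10])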

Dilate answered 27/5, 2020 at 12:04
Ahh okay! Thank you for this well-explained enlightenment :D – Tuscany
Just a tiny addition: if you set affine=False, you would actually still need the number of channels, since it is also used to initialize the buffers that store the running stats. That can be seen in the code here: pytorch.org/docs/stable/_modules/torch/nn/modules/… Now, if you also set track_running_stats=False, then I agree that you would not need the number of channels as a parameter. – Kehr