BatchNorm2d in PyTorch - why pass the number of channels to batchnorm?

Why do I need to pass the number of channels of the previous layer to the batchnorm? The batchnorm should normalize over each datapoint in the batch, so why does it need the number of channels?

Tuscany answered 27/5, 2020 at 11:12

Batch normalisation has learnable parameters, because it includes an affine transformation.

From the documentation of nn.BatchNorm2d:

y = (x - E[x]) / √(Var[x] + ε) * γ + β

The mean and standard-deviation are calculated per-dimension over the mini-batches and γ and β are learnable parameter vectors of size C (where C is the input size). By default, the elements of γ are set to 1 and the elements of β are set to 0.

Since the norm is calculated per channel, the parameters γ and β are vectors of size num_channels (one element per channel), which results in an individual scale and shift per channel. As with any other learnable parameter in PyTorch, they need to be created with a fixed size, hence you need to specify the number of channels:

import torch.nn as nn

batch_norm = nn.BatchNorm2d(10)

# γ
batch_norm.weight.size()
# => torch.Size([10])

# β
batch_norm.bias.size()
# => torch.Size([10])
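
For a hands-on check, here is a minimal sketch (arbitrary tensor shapes, default settings) that applies the formula above by hand, with one mean and variance per channel, and compares the result to nn.BatchNorm2d in training mode:

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 5, 5)  # (batch, channels, height, width)

bn = nn.BatchNorm2d(3)  # freshly created, so γ = 1 and β = 0
out = bn(x)  # training mode by default, i.e. batch statistics are used

# Manual computation: one mean/variance per channel, taken over the
# batch, height and width dimensions (0, 2, 3)
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + bn.eps)
# γ and β have shape (C,) and are broadcast to (1, C, 1, 1)
manual = manual * bn.weight.view(1, -1, 1, 1) + bn.bias.view(1, -1, 1, 1)

print(torch.allclose(out, manual, atol=1e-6))
# => True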

Note: Setting affine=False creates no learnable parameters, in which case the number of channels wouldn't be needed for them, but the argument is still required in order to keep the interface consistent.
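
As a small illustrative check of that (with track_running_stats left at its default of True):

import torch.nn as nn

bn_no_affine = nn.BatchNorm2d(10, affine=False)

# No learnable γ/β are created
print(bn_no_affine.weight)  # => None
print(bn_no_affine.bias)    # => None

# The running-statistics buffers still hold one entry per channel, which is
# another reason num_features is required (see the comment below)
print(bn_no_affine.running_mean.size())
# => torch.Size([10])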

Dilate answered 27/5, 2020 at 12:04
Ahh okay! Thank you for this well-explained enlightenment :D – Tuscany
Just a tiny addition: if you set affine=False, you would actually still need the number of channels, since it is also used to initialize the buffers that store the running stats. That can be seen in the code here: pytorch.org/docs/stable/_modules/torch/nn/modules/… Now, if you also set track_running_stats=False, then I agree that you would not need the number of channels as a parameter. – Kehr