What does it mean to "break symmetry" in the context of neural network programming? [duplicate]

I have heard a lot about "breaking the symmetry" within the context of neural network programming and initialization. Can somebody please explain what this means? As far as I can tell, it has something to do with neurons performing identically during forward and backward propagation if the weight matrix is filled with identical values during initialization. Asymmetric behavior is achieved with random initialization, i.e., by not using identical values throughout the matrix.

Intranuclear answered 8/1, 2020 at 2:32
@gman Good catch. This should be a dup of that, then. – Floozy

Your understanding is correct.

When all initial values are identical, for example when every weight is initialized to 0, then during backpropagation all the weights in a layer receive the same gradient, and hence the same update. This is what is referred to as symmetry.

Intuitively, that means all nodes will learn the same thing, and we don't want that, because we want the network to learn different kinds of features. This is achieved by random initialization, since then each node's gradient will be different, and each node will grow to be distinct from the other nodes, enabling diverse feature extraction. This is what is referred to as breaking the symmetry.
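Here is a minimal NumPy sketch that makes this visible. The tiny 3-input, 4-hidden-unit, 1-output network, the constant initial value 0.5, and the squared-error loss are all illustrative assumptions, not anything specific from the question: with identical initial weights every hidden neuron receives the same gradient row, whereas random initialization gives each neuron a different update.

```python
import numpy as np

# Toy network: 3 inputs -> 4 hidden units (tanh) -> 1 linear output.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))   # one input sample (made up)
y = np.array([[1.0]])         # its target (made up)

def hidden_gradient(W1, W2):
    """One forward/backward pass; returns dLoss/dW1 for squared-error loss."""
    h = np.tanh(W1 @ x)                  # hidden activations, shape (4, 1)
    y_hat = W2 @ h                       # output, shape (1, 1)
    d_out = y_hat - y                    # dLoss/dy_hat
    d_h = (W2.T @ d_out) * (1 - h**2)    # backprop through tanh
    return d_h @ x.T                     # dLoss/dW1, shape (4, 3)

# Symmetric initialization: every weight gets the same constant.
# (With all-zero weights the W1 gradient would be exactly zero, so a
# non-zero constant is used here just to make the symmetry visible.)
W1_sym = np.full((4, 3), 0.5)
W2_sym = np.full((1, 4), 0.5)
print(hidden_gradient(W1_sym, W2_sym))   # every row is identical: the neurons stay clones

# Random initialization breaks the symmetry: each hidden neuron gets a
# different gradient row, so each one can learn a different feature.
W1_rnd = rng.normal(scale=0.1, size=(4, 3))
W2_rnd = rng.normal(scale=0.1, size=(1, 4))
print(hidden_gradient(W1_rnd, W2_rnd))   # rows differ
```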

Floozy answered 8/1, 2020 at 2:55
