How to choose the window size of CNN in deep learning?
V

1

9

In Convolutional Neural Network (CNN), a filter is select for weights sharing. For example, in the following pictures, a 3x3 window with the stride (distance between adjacent neurons) 1 is chosen.

So my question is: How to choose the window size? If I use 4x4 with the stride being 2, how much difference will it cause? Thanks a lot in advance!

Vaughan answered 31/10, 2017 at 6:50 Comment(0)
A
14

There's no definite answer to this: filter size is one of hyperparameters you generally need to tune. However, there're some useful observations, that may help you. It's often preferred to choose smaller filters, but have greater number of those.

Example: four 5x5 filters have 100 parameters (ignoring bias), while 10 3x3 filters have 90 parameters. Through the larger of filters you still can capture the variety of features in the image, but with fewer parameters. More on this here.

Modern CNNs go even further with this idea and choose consecutive 3x1 and 1x3 convolutional layers. This reduces the number of parameters even more, but doesn't affect the performance. See the evolution of inception network.

The choice of stride is also important, but it affects the tensor shape after the convolution, hence the whole network. The general rule is to use stride=1 in usual convolutions and preserve the spatial size with padding, and use stride=2 when you want to downsample the image.

Anodic answered 1/11, 2017 at 9:3 Comment(1)
in my experience, downsampling is better achieved with maxpooling because the gradients are larger compared to strides (you may face gradient vanishing otherwise). Also, downsampling with strides is not selecting the interesting areas.Tutankhamen

© 2022 - 2024 — McMap. All rights reserved.