Why use fixed padding when building resnet model in tensorflow

TensorFlow has an official implementation of ResNet on GitHub. It uses fixed padding instead of a plain tf.layers.conv2d.

Something like this:

def conv2d_fixed_padding(inputs, filters, kernel_size, strides, data_format):
  """Strided 2-D convolution with explicit padding."""
  # The padding is consistent and is based only on `kernel_size`, not on the
  # dimensions of `inputs` (as opposed to using `tf.layers.conv2d` alone).
  if strides > 1:
    inputs = fixed_padding(inputs, kernel_size, data_format)

  return tf.layers.conv2d(
      inputs=inputs, filters=filters, kernel_size=kernel_size, strides=strides,
      padding=('SAME' if strides == 1 else 'VALID'), use_bias=False,
      kernel_initializer=tf.variance_scaling_initializer(),
      data_format=data_format)

What's the purpose of doing this? We get a 16x16 feature map if we input an image of size 32x32 and call tf.layers.conv2d with padding set to SAME and a stride of 2. But the code above pads zeros on both sides of the image and then uses VALID padding.

Yb answered 11/12, 2017 at 1:42 Comment(1)
For dimension matching during convolution, details below. - Ronnieronny

Let's assume we have a stride of 2 and a kernel size of 3.

Using tf.layers.conv2d with padding SAME:

Case 1:

                   pad|              |pad
       inputs:      0 |1  2  3  4  5 |0 
                   |_______|
                         |_______|
                               |_______|

Case 2:

                                     |pad
       inputs:      1  2  3  4  5  6 |0 
                   |_______|
                         |_______|
                               |_______|

You can see that the padding depends on the input size. With SAME, the padding is chosen so that the output size is ceil(input_size / stride); when the total padding is odd, the extra zero goes on the right, as in Case 2.
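
The two SAME cases above can be reproduced with a few lines of plain Python. This is a minimal sketch for a single spatial dimension, assuming the SAME rule as described in the TensorFlow documentation; the helper name same_padding_1d is hypothetical and is not a TensorFlow function.

```python
import math

def same_padding_1d(input_size, kernel_size, stride):
    """Return (pad_left, pad_right) that SAME padding adds along one dimension.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    # SAME targets an output size of ceil(input_size / stride).
    output_size = math.ceil(input_size / stride)
    # Total zeros needed so that the last window still fits.
    pad_total = max((output_size - 1) * stride + kernel_size - input_size, 0)
    # When pad_total is odd, the extra zero goes on the right.
    pad_left = pad_total // 2
    pad_right = pad_total - pad_left
    return pad_left, pad_right

print(same_padding_1d(5, 3, 2))  # Case 1: (1, 1)
print(same_padding_1d(6, 3, 2))  # Case 2: (0, 1)
```

The padding changes from (1, 1) to (0, 1) when the input grows from 5 to 6, which is the input-size dependence shown in the diagrams.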

Using the fixed padding implementation of resnet:

Case 1:

                   pad|              |pad
       inputs:      0 |1  2  3  4  5 |0 
                   |_______|
                         |_______|
                               |_______|

Case 2:

                   pad|                 |pad
       inputs:      0 |1  2  3  4  5  6 |0 
                   |_______|
                         |_______|
                               |_______|

Padding is uniquely defined by the kernel size and stays independent of the input size.
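
The fixed-padding arithmetic can be sketched the same way, without TensorFlow. This assumes that fixed_padding adds kernel_size - 1 zeros in total, split between the two sides, which matches the diagrams above; the helper names below are hypothetical.

```python
def fixed_padding_1d(kernel_size):
    """Return (pad_beg, pad_end) derived from the kernel size alone.

    Hypothetical helper for illustration; assumes a total of kernel_size - 1 zeros.
    """
    pad_total = kernel_size - 1
    pad_beg = pad_total // 2
    pad_end = pad_total - pad_beg
    return pad_beg, pad_end

def valid_output_size(padded_size, kernel_size, stride):
    """Output size of a VALID convolution over an already padded input."""
    return (padded_size - kernel_size) // stride + 1

for input_size in (5, 6):
    pad_beg, pad_end = fixed_padding_1d(3)
    out = valid_output_size(input_size + pad_beg + pad_end, 3, 2)
    print(input_size, (pad_beg, pad_end), out)
```

For both input sizes the padding is (1, 1) and the output size is 3, so the output still equals ceil(input_size / stride) while the padding no longer depends on the input.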

Edibles answered 26/3, 2018 at 16:19 Comment(0)

As you know, ResNet has skip connections, so the network looks like the following: [image: network with a skip connection]

and the equation becomes the following:

F(x) + x   // Here 'x' is the input to the residual block, carried over by the skip connection.

So with this addition we assume that F(x) and x have the same dimensions. If they do not, we must pad the convolution inputs so that the dimensions match.

This is why you will see padding="SAME" used for the convolutions in the TF ResNet model.

Ronnieronny answered 17/3, 2018 at 12:6 Comment(2)
I think you misunderstood the question. The question asks why the official TF model writes its own padding operation instead of using the padding='SAME' parameter of tf.layers.conv2d when strides > 1. - Fleece
You don't need the conv2d_fixed_padding function at all if padding='SAME' is used instead of padding=('SAME' if strides == 1 else 'VALID') to get the same dimension. - Fleece
