In Keras what is the difference between Conv2DTranspose and Conv2D
I'm currently building a GAN with Tensorflow 2 and Keras and noticed a lot of the existing Neural Networks for the generator and discriminator use Conv2D and Conv2DTranspose in Keras.

I'm struggling to find something that functionally explains the difference between the two. Can anyone explain what these two different options for making a NN in Keras mean?

Gatewood answered 29/8, 2021 at 20:42 Comment(0)

Conv2D applies a convolution to the input. Conv2DTranspose applies a transposed convolution (often loosely called a "deconvolution"), which goes in the opposite direction spatially.

  • Conv2D is mainly used when you want to detect features, e.g., in the encoder part of an autoencoder model, and it may shrink your input shape.
  • Conversely, Conv2DTranspose is used for creating features, for example, in the decoder part of an autoencoder model for constructing an image. As you can see in the code below, it makes the input shape larger.
import tensorflow as tf

x = tf.random.uniform((1,3,3,1))
conv2d = tf.keras.layers.Conv2D(1,2)(x)
print(conv2d.shape)
# (1, 2, 2, 1)
conv2dTranspose = tf.keras.layers.Conv2DTranspose(1,2)(x)
print(conv2dTranspose.shape)
# (1, 4, 4, 1)
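The shape changes above follow directly from the standard `padding='valid'` size formulas. A small sketch (the helper names are mine, not part of the Keras API) that also shows the strided case common in GANs:

```python
def conv2d_out_size(n, kernel, stride=1):
    # Conv2D with padding='valid': output = floor((n - kernel) / stride) + 1
    return (n - kernel) // stride + 1

def conv2d_transpose_out_size(n, kernel, stride=1):
    # Conv2DTranspose with padding='valid' inverts the formula above:
    # output = (n - 1) * stride + kernel
    return (n - 1) * stride + kernel

print(conv2d_out_size(3, 2))                       # 2, matches Conv2D(1, 2) above
print(conv2d_transpose_out_size(3, 2))             # 4, matches Conv2DTranspose(1, 2) above

# With stride 2 the two layers halve/double the spatial size, which is
# why GAN discriminators use strided Conv2D and generators strided Conv2DTranspose:
print(conv2d_out_size(28, 2, stride=2))            # 14
print(conv2d_transpose_out_size(14, 2, stride=2))  # 28
```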

To sum up:

  • Conv2D:
    • May shrink your input
    • For detecting features
  • Conv2DTranspose:
    • Enlarges your input
    • For constructing features


And if you want to know how Conv2DTranspose enlarges the input: it multiplies each input value by the kernel, places the resulting patch at the corresponding (stride-spaced) position in the output, and sums the overlapping patches.

For example:

import numpy as np
import tensorflow as tf

kernel = tf.constant_initializer(1.)
x = tf.ones((1,3,3,1))
conv = tf.keras.layers.Conv2D(1,2, kernel_initializer=kernel)
y = tf.ones((1,2,2,1))
de_conv = tf.keras.layers.Conv2DTranspose(1,2, kernel_initializer=kernel)

conv_output = conv(x)
print("Convolution\n---------")
print("input  shape:",x.shape)
print("output shape:",conv_output.shape)
print("input  tensor:",np.squeeze(x.numpy()).tolist())
print("output tensor:",np.around(np.squeeze(conv_output.numpy())).tolist())
'''
Convolution
---------
input  shape: (1, 3, 3, 1)
output shape: (1, 2, 2, 1)
input  tensor: [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
output tensor: [[4.0, 4.0], [4.0, 4.0]]
'''
de_conv_output = de_conv(y)
print("De-Convolution\n------------")
print("input  shape:",y.shape)
print("output shape:",de_conv_output.shape)
print("input  tensor:",np.squeeze(y.numpy()).tolist())
print("output tensor:",np.around(np.squeeze(de_conv_output.numpy())).tolist())
'''
De-Convolution
------------
input  shape: (1, 2, 2, 1)
output shape: (1, 3, 3, 1)
input  tensor: [[1.0, 1.0], [1.0, 1.0]]
output tensor: [[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]]
'''
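The 3×3 output above can be reproduced in plain NumPy: the transposed convolution stamps a copy of the kernel, scaled by each input value, at each (strided) output position and sums the overlaps. A minimal single-channel sketch (the function name is mine, for illustration only):

```python
import numpy as np

def conv2d_transpose(x, kernel, stride=1):
    # For each input element, add a kernel-sized patch, scaled by that
    # element, into the output at the stride-spaced offset ("overlap-add").
    h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw))
    for i in range(h):
        for j in range(w):
            out[i*stride:i*stride+kh, j*stride:j*stride+kw] += x[i, j] * kernel
    return out

print(conv2d_transpose(np.ones((2, 2)), np.ones((2, 2))))
# [[1. 2. 1.]
#  [2. 4. 2.]
#  [1. 2. 1.]]
```

The center value is 4 because all four shifted kernel patches overlap there, which matches the De-Convolution output tensor above.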
Obstinacy answered 30/8, 2021 at 7:45 Comment(2)
It's common to see code with several Conv2D layers followed by several Conv2DTranspose layers. The latter are supposed to revert the effect of the former. Then why do we use them if we are just getting back the original input? – Copulative
@Copulative Hi, thanks for your question. There are many applications of such an architecture, for example an encoder-decoder, or segmentation with a U-Net-like architecture. – Obstinacy
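On the comment above: stacking Conv2DTranspose after Conv2D only restores the original *shape*, not the original values; the network has to learn to reconstruct the content from the compressed representation in between. A rough shape walk-through of a symmetric stack (hypothetical layer sizes, `padding='valid'`, stride 1):

```python
def conv_out(n, k):            # Conv2D, padding='valid', stride 1
    return n - k + 1

def conv_transpose_out(n, k):  # Conv2DTranspose, padding='valid', stride 1
    return n + k - 1

n = 28                       # e.g. a 28x28 input image
for k in (3, 3):             # encoder: two Conv2D layers shrink the feature map
    n = conv_out(n, k)
bottleneck = n               # 24: the compressed spatial size
for k in (3, 3):             # decoder: two Conv2DTranspose layers restore it
    n = conv_transpose_out(n, k)
print(bottleneck, n)         # 24 28
```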
