Using dilated convolution in Keras

In WaveNet, dilated convolution is used to increase the receptive field of the layers above.

[Animation: dilated convolution layers in WaveNet]

From the illustration, you can see that layers of dilated convolution with kernel size 2 and dilation rates that are powers of 2 create a tree-like structure of receptive fields. I tried to (very simply) replicate the above in Keras.

import tensorflow.keras as keras
nn = input_layer = keras.layers.Input(shape=(200, 2))
nn = keras.layers.Conv1D(5, 5, padding='causal', dilation_rate=2)(nn)
nn = keras.layers.Conv1D(5, 5, padding='causal', dilation_rate=4)(nn)
nn = keras.layers.Dense(1)(nn)
model = keras.Model(input_layer, nn)
opt = keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss='mse', optimizer=opt)
model.summary()

And the output:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_4 (InputLayer)         [(None, 200, 2)]          0
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 200, 5)            55
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 200, 5)            130
_________________________________________________________________
dense_2 (Dense)              (None, 200, 1)            6
=================================================================
Total params: 191
Trainable params: 191
Non-trainable params: 0
_________________________________________________________________

I was expecting axis=1 to shrink after each conv1d layer, similar to the gif. Why is this not the case?

Citric answered 24 Jul 2020 at 12:39

The model summary is as expected. As you note, using dilated convolutions increases the receptive field. However, a dilated convolution preserves the output shape of the input image/activation, because only the convolutional kernel changes. A regular kernel could be the following:

0 1 0
1 1 1
0 1 0

A kernel with a dilation rate of 2 inserts a zero between each pair of entries in the original kernel (in general, a dilation rate of d inserts d - 1 zeros), as below.

0 0 1 0 0
0 0 0 0 0
1 0 1 0 1
0 0 0 0 0
0 0 1 0 0

In fact, you can see that our original kernel is simply a dilated kernel with a dilation rate of 1. Alternative ways to increase the receptive field downsize the input image instead; max pooling and strided convolution are two such methods.
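
To make the zero-insertion concrete, here is a minimal NumPy sketch. The dilate_kernel helper is purely illustrative (Keras applies dilation implicitly and never materializes the zeros):

import numpy as np

def dilate_kernel(kernel, rate):
    # Spread a 2-D kernel onto a strided grid of zeros (illustration only).
    k = kernel.shape[0]
    size = rate * (k - 1) + 1            # effective kernel size
    dilated = np.zeros((size, size), dtype=kernel.dtype)
    dilated[::rate, ::rate] = kernel     # original weights land every `rate` steps
    return dilated

cross = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]])
print(dilate_kernel(cross, 2))           # reproduces the 5x5 kernel above
print(dilate_kernel(cross, 1))           # rate 1 returns the original kernel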

For example, if you want to increase the receptive field by decreasing the size of your output shape, you could use strided convolution, as below. I replace each dilated convolution with a strided convolution; you will see that the output shape shrinks at every layer.

import tensorflow.keras as keras
nn = input_layer = keras.layers.Input(shape=(200, 2))
nn = keras.layers.Conv1D(5, 5, padding='causal', strides=2)(nn)
nn = keras.layers.Conv1D(5, 5, padding='causal', strides=4)(nn)
nn = keras.layers.Dense(1)(nn)
model = keras.Model(input_layer, nn)
opt = keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss='mse', optimizer=opt)
model.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, 200, 2)]          0
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 100, 5)            55
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 25, 5)             130
_________________________________________________________________
dense_1 (Dense)              (None, 25, 1)             6
=================================================================
Total params: 191
Trainable params: 191
Non-trainable params: 0
_________________________________________________________________

To summarize, dilated convolution is just another way to increase the receptive field of your model. It has the benefit of preserving the output shape of your input image.
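
You can verify the receptive fields of both models directly. Below is a minimal sketch; receptive_field is our own helper, applying the standard recurrence rf += (kernel - 1) * dilation * jump, where jump is the product of the strides of all earlier layers:

def receptive_field(layer_specs):
    # layer_specs: list of (kernel_size, dilation_rate, stride) triples
    rf, jump = 1, 1
    for kernel, dilation, stride in layer_specs:
        rf += (kernel - 1) * dilation * jump  # each layer widens the field
        jump *= stride                        # strides compound for later layers
    return rf

print(receptive_field([(5, 2, 1), (5, 4, 1)]))  # dilated model above: 25
print(receptive_field([(5, 1, 2), (5, 1, 4)]))  # strided model above: 13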

Frampton answered 24 Jul 2020 at 14:14
It should probably be added that dilated convolution is usually used together with stride. Dilated convolutions change the receptive field of a kernel, whereas stride changes the output shape so that the next layer has a bigger receptive field. Dilation alone doesn't change the receptive field a whole lot when used across multiple layers without stride. But unfortunately, Keras doesn't support dilated convolution with stride in the same layer. – Disposition
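
One workaround (a sketch of our own, not from the comment) is to keep dilation and downsampling in separate layers, since Conv1D raises an error when strides > 1 is combined with dilation_rate > 1:

import tensorflow.keras as keras

nn = input_layer = keras.layers.Input(shape=(200, 2))
nn = keras.layers.Conv1D(5, 5, padding='causal', dilation_rate=2)(nn)
nn = keras.layers.MaxPooling1D(pool_size=2)(nn)  # downsample axis=1: 200 -> 100
nn = keras.layers.Conv1D(5, 5, padding='causal', dilation_rate=4)(nn)
nn = keras.layers.MaxPooling1D(pool_size=2)(nn)  # downsample axis=1: 100 -> 50
model = keras.Model(input_layer, nn)
model.summary()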

Here's an example of this dilation with 1-D convolutional layers (each layer has 14 output channels):

https://github.com/jwallbridge/translob/blob/master/python/LobFeatures.py

from tensorflow.keras import layers

def lob_dilated(x):
  """
  TransLOB dilated 1-D convolution module
  """
  x = layers.Conv1D(14, kernel_size=2, strides=1, activation='relu', padding='causal')(x)
  x = layers.Conv1D(14, kernel_size=2, dilation_rate=2, activation='relu', padding='causal')(x)
  x = layers.Conv1D(14, kernel_size=2, dilation_rate=4, activation='relu', padding='causal')(x)
  x = layers.Conv1D(14, kernel_size=2, dilation_rate=8, activation='relu', padding='causal')(x)
  y = layers.Conv1D(14, kernel_size=2, dilation_rate=16, activation='relu', padding='causal')(x)

  return y
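
For completeness, here is a minimal usage sketch; the (100, 40) input shape is hypothetical (see the linked repository for the real preprocessing). Causal padding keeps all 100 timesteps at every layer, while the stacked dilations give a receptive field of 1 + 1*(1 + 2 + 4 + 8 + 16) = 32 timesteps:

import tensorflow.keras as keras

inputs = keras.Input(shape=(100, 40))  # hypothetical (timesteps, features) shape
outputs = lob_dilated(inputs)
model = keras.Model(inputs, outputs)
model.summary()  # every Conv1D outputs (None, 100, 14)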
Ermentrude answered 4 Jul 2022 at 15:14
