Difference between bidirectional_dynamic_rnn and stack_bidirectional_dynamic_rnn in Tensorflow
I am building a dynamic RNN network by stacking multiple LSTMs, and I see there are two options:

# cells_fw and cells_bw are lists of cells, e.g. LSTM cells
stacked_cell_fw = tf.contrib.rnn.MultiRNNCell(cells_fw)
stacked_cell_bw = tf.contrib.rnn.MultiRNNCell(cells_bw)

output = tf.nn.bidirectional_dynamic_rnn(
          stacked_cell_fw, stacked_cell_bw, INPUT,
          sequence_length=LENGTHS, dtype=tf.float32)

vs

output = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
          cells_fw, cells_bw, INPUT,
          sequence_length=LENGTHS, dtype=tf.float32)

What is the difference between the two approaches, and is one better than the other?

Conjugate answered 12/3, 2018 at 18:35 Comment(1)
I would rephrase the title to Difference between bidirectional_dynamic_rnn and stack_bidirectional_dynamic_rnn in Tensorflow – Sumatra
If you want to have multiple layers that pass information backward or forward in time, there are two ways to design this. Assume the forward layer consists of two layers F1, F2 and the backward layer consists of two layers B1, B2.

If you use tf.nn.bidirectional_dynamic_rnn the model will look like this (time flows from left to right):

[Figure: model built with tf.nn.bidirectional_dynamic_rnn, time flowing from left to right]

If you use tf.contrib.rnn.stack_bidirectional_dynamic_rnn the model will look like this:

[Figure: model built with tf.contrib.rnn.stack_bidirectional_dynamic_rnn, time flowing from left to right]

Here the black dot between the first and second layer represents a concatenation: the outputs of the forward and backward cells are concatenated and fed to both the forward and backward layers of the next layer up. This means F2 and B2 receive exactly the same input, and there is an explicit connection between backward and forward layers. In "Speech Recognition with Deep Recurrent Neural Networks", Graves et al. summarize this as follows:

... every hidden layer receives input from both the forward and backward layers at the level below.

This connection only happens implicitly in the unstacked BiRNN (first image), namely when mapping back to the output. The stacked BiRNN usually performed better for my purposes, but I guess that depends on your problem setting. In any case, it is worthwhile to try both.
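To make the wiring concrete, here is a rough sketch (not the library implementation) of what the stacked variant effectively does for two layers; the shapes, hidden size, and cell choice are only illustrative:

import tensorflow as tf

# illustrative shapes (assumed, not from the question)
batch_size, max_time, input_dim, hidden = 8, 20, 32, 64
INPUT = tf.placeholder(tf.float32, [batch_size, max_time, input_dim])
LENGTHS = tf.placeholder(tf.int32, [batch_size])

# layer 1: F1 and B1 both read the raw input
(out_fw1, out_bw1), _ = tf.nn.bidirectional_dynamic_rnn(
    tf.contrib.rnn.LSTMCell(hidden), tf.contrib.rnn.LSTMCell(hidden),
    INPUT, sequence_length=LENGTHS, dtype=tf.float32, scope="layer1")

# the "black dot": concatenate the forward and backward outputs
layer1_out = tf.concat([out_fw1, out_bw1], axis=-1)

# layer 2: F2 and B2 both receive the same concatenated input
(out_fw2, out_bw2), _ = tf.nn.bidirectional_dynamic_rnn(
    tf.contrib.rnn.LSTMCell(hidden), tf.contrib.rnn.LSTMCell(hidden),
    layer1_out, sequence_length=LENGTHS, dtype=tf.float32, scope="layer2")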

EDIT

In response to your comment: I base my answer on the documentation of tf.contrib.rnn.stack_bidirectional_dynamic_rnn, which says:

Stacks several bidirectional rnn layers. The combined forward and backward layer outputs are used as input of the next layer. tf.bidirectional_rnn does not allow to share forward and backward information between layers.

Also, I looked at the implementation available under this link.
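A further practical difference, as far as I can tell from the docs (so double-check against your TF version), is how the results are packaged:

# bidirectional_dynamic_rnn returns the outputs as a (forward, backward) tuple;
# you concatenate them yourself if you want a single tensor
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    stacked_cell_fw, stacked_cell_bw, INPUT,
    sequence_length=LENGTHS, dtype=tf.float32)
combined = tf.concat([out_fw, out_bw], axis=-1)

# stack_bidirectional_dynamic_rnn returns the last layer's outputs
# already concatenated along the last axis
outputs, state_fw, state_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
    cells_fw, cells_bw, INPUT, sequence_length=LENGTHS, dtype=tf.float32)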

Recor answered 27/5, 2018 at 13:14 Comment(5)
Can I ask: how did you figure this out? – Conjugate
Also, how did you draw the pics? :) Very clean! – Cloying
@CiprianTomoiagă Thanks :) this was done with Google Slides. – Recor
@kaufmanu, thanks for your elaborate answer. I am unsure whether your second diagram is correct. Whereas F2 is fed by F1 and B1 (i.e. the combined forward and backward layer outputs are used as inputs, as the definition says), this does not hold for B2, which is only fed by B1 but not by F1. Hence, I suggest adding arrows from F1 to B2 to your diagram to make it correct. In case I am wrong, please point me to my mistake. If you agree, you could also pass me the Google Slides, and I could edit your post with the updated version. – Sumatra
@Sumatra You are right, the figure was a bit misleading in this regard. I updated it and added some comments; I hope it's clear now. – Recor
