Tensorflow LSTM Dropout Implementation
  • How specifically does TensorFlow apply dropout when calling tf.nn.rnn_cell.DropoutWrapper()?

Everything I read about applying dropout to RNNs references this paper by Zaremba et al., which says not to apply dropout on the recurrent connections. Neurons should be dropped out randomly before or after the LSTM layers, but not on the recurrent connections inside a layer. OK.
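For context, here is a minimal sketch of what that scheme looks like in TF 1.x-style code. The layer size, keep probability, and two-layer stack are made up for illustration; only the DropoutWrapper call itself matches the API named above:

    import tensorflow as tf  # TF 1.x-style API, matching the question

    keep_prob = tf.placeholder(tf.float32)  # hypothetical: fed as e.g. 0.5 for training, 1.0 for eval

    # Dropout is applied on the non-recurrent connections (the inputs/outputs of
    # each LSTM layer); the hidden state carried across timesteps is left untouched.
    cells = []
    for _ in range(2):  # hypothetical two-layer stack
        cell = tf.nn.rnn_cell.LSTMCell(num_units=128)
        cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)
        cells.append(cell)
    stacked_cell = tf.nn.rnn_cell.MultiRNNCell(cells)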

  • The question I have is: how are the neurons turned off with respect to time?

In the paper that everyone cites, it seems that a random 'dropout mask' is applied at each timestep, rather than one random 'dropout mask' being generated and reused across all the timesteps of the layer being dropped out, with a new mask then generated for the next batch.
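To make the distinction concrete, here is a small NumPy sketch (not TensorFlow code, and the sizes are made up) contrasting a fresh mask per timestep with a single mask reused across all timesteps of a sequence:

    import numpy as np

    rng = np.random.default_rng(0)
    T, H = 5, 4                      # hypothetical: 5 timesteps, hidden size 4
    keep_prob = 0.8
    acts = np.ones((T, H))           # stand-in for one layer's activations over time

    # Option A: sample a fresh dropout mask at every timestep.
    per_step = np.stack([acts[t] * (rng.random(H) < keep_prob) / keep_prob
                         for t in range(T)])

    # Option B: sample one mask per sequence and reuse it at every timestep,
    # so the same units are dropped for the whole sequence.
    mask = (rng.random(H) < keep_prob) / keep_prob
    reused = acts * mask             # broadcasts the same mask across all timesteps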

Further, and probably what matters more right now: how does TensorFlow do it? I've checked the TensorFlow API docs and searched around for a detailed explanation, but have yet to find one.

  • Is there a way to dig into the actual TensorFlow source code?
Tamishatamma asked 27/2, 2017 at 14:40 Comment(1)
All the source code is available on GitHub. – Shivery

You can check the implementation here.

It applies the dropout op to the input going into the RNNCell, and then to its output, with the keep probabilities you specify.

It seems like each sequence you feed in gets a new mask for the input and another for the output, with no change to the mask within the sequence.
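To illustrate what that description amounts to, here is a rough sketch of the wrapping logic (this is not the actual TensorFlow source; the helper function is hypothetical):

    import tensorflow as tf  # TF 1.x-style API

    def dropout_wrapped_step(cell, inputs, state,
                             input_keep_prob=1.0, output_keep_prob=1.0):
        """Hypothetical helper mirroring the behaviour described above: dropout
        on the input going into the wrapped cell, then on its output. The
        recurrent state passed from one timestep to the next is not touched."""
        if input_keep_prob < 1.0:
            inputs = tf.nn.dropout(inputs, keep_prob=input_keep_prob)
        outputs, new_state = cell(inputs, state)
        if output_keep_prob < 1.0:
            outputs = tf.nn.dropout(outputs, keep_prob=output_keep_prob)
        return outputs, new_state

In other words, only the vertical (input/output) connections are dropped; the state that flows along the time axis is passed through unchanged, which is consistent with the Zaremba et al. recommendation in the question.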

Inefficacy answered 27/2, 2017 at 17:19 Comment(1)
Thanks. That indeed led me to the answer: dropout isn't applied recurrently when using it with RNNs in TensorFlow. For anyone wanting this feature, I opened an issue on GitHub, Issue #7927. – Tamishatamma
