What exactly does tf.contrib.rnn.DropoutWrapper in TensorFlow do? (three critical questions)
As far as I know, DropoutWrapper is used as follows:

__init__(
cell,
input_keep_prob=1.0,
output_keep_prob=1.0,
state_keep_prob=1.0,
variational_recurrent=False,
input_size=None,
dtype=None,
seed=None
)


cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=0.5)
cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers, state_is_tuple=True)

The only thing I know is that it is used for dropout during training. Here are my three questions:

  1. What are input_keep_prob, output_keep_prob, and state_keep_prob, respectively? (I guess they define the dropout probability of each part of the RNN, but where exactly?)

  2. Is dropout in this context applied to the RNN not only during training but also during prediction? If so, is there any way to control whether or not dropout is used at prediction time?

  3. According to the API documentation on the TensorFlow web page, if variational_recurrent=True, dropout works according to the method in the paper "Y. Gal, Z. Ghahramani. 'A Theoretically Grounded Application of Dropout in Recurrent Neural Networks'. https://arxiv.org/abs/1512.05287". I understood this paper roughly. When I train an RNN, I use a batch, not a single time series. In this case, does TensorFlow automatically assign a different dropout mask to each time series in the batch?
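For what it's worth, the distinction behind question 3 can be simulated with a small NumPy sketch (illustrative only, not the TensorFlow internals; the shapes and names here are assumptions). In the variational scheme, each sequence in the batch gets its own mask, but that mask is held fixed across time steps:

```python
import numpy as np

rng = np.random.default_rng(0)
num_steps, batch_size, hidden = 5, 4, 3
keep_prob = 0.5

# Standard dropout: an independent mask is sampled at every time step.
standard_masks = rng.random((num_steps, batch_size, hidden)) < keep_prob

# Variational dropout (Gal & Ghahramani): one mask per sequence in the
# batch, sampled once and then reused at every time step.
per_sequence_mask = rng.random((batch_size, hidden)) < keep_prob
variational_masks = np.broadcast_to(per_sequence_mask,
                                    (num_steps, batch_size, hidden))

# Within one sequence the variational mask is identical across time steps,
# while each sequence in the batch still gets its own independently
# sampled mask.
assert (variational_masks[0] == variational_masks[-1]).all()
```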
Ammoniac answered 4/8, 2017 at 12:50 Comment(1)
During prediction you want to use a keep_prob of 1.0. That's why it's normally easier to feed that value via a placeholder rather than a constant. – Whereof
  1. input_keep_prob is the keep (inclusion) probability for dropout applied to the cell's inputs. output_keep_prob is the keep probability applied to each RNN unit's output. state_keep_prob is the keep probability applied to the recurrent hidden state that is carried forward.
  2. You can initialize each of the above-mentioned parameters as follows:

import tensorflow as tf

# Defaults to 1.0 (no dropout); feed a different value during training.
dropout_placeholder = tf.placeholder_with_default(tf.constant(1.0, tf.float32),
                                                  shape=())

cell = tf.nn.rnn_cell.DropoutWrapper(tf.nn.rnn_cell.BasicRNNCell(n_hidden_rnn),
                                     input_keep_prob=dropout_placeholder,
                                     output_keep_prob=dropout_placeholder,
                                     state_keep_prob=dropout_placeholder)

The keep probability defaults to 1.0 during prediction; during training we can feed any other value through the placeholder.

  3. The masking is applied to the fitted weights rather than to the individual sequences included in the batch. As far as I know, it is applied across the entire batch.
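As a side note on why keep_prob = 1.0 is all you need at prediction time: DropoutWrapper relies on TensorFlow's standard "inverted dropout", which rescales the kept activations by 1/keep_prob during training so that the expected value of each unit is unchanged. A hedged NumPy sketch of that behaviour (illustrative names, not the TF implementation):

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Inverted dropout: keep each unit with probability keep_prob and
    scale the survivors by 1/keep_prob so the expected value is unchanged."""
    if keep_prob == 1.0:          # prediction time: no masking, no rescaling
        return x
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)
x = np.ones(100_000)
y = dropout(x, 0.5, rng)

# The mean of the activations survives dropout, so setting keep_prob to
# 1.0 at prediction time requires no compensating rescaling.
assert abs(y.mean() - 1.0) < 0.05
```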
Bemis answered 12/5, 2019 at 15:49 Comment(0)
# dropOut is a boolean placeholder: feed True during training, False at prediction.
dropOut = tf.placeholder(tf.bool)
keep_prob = tf.cond(dropOut, lambda: tf.constant(0.9), lambda: tf.constant(1.0))

cells = rnn.DropoutWrapper(cells, output_keep_prob=keep_prob)
Gona answered 31/10, 2017 at 2:13 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.