As far as I know, DropoutWrapper is constructed and used as follows:
__init__(
    cell,
    input_keep_prob=1.0,
    output_keep_prob=1.0,
    state_keep_prob=1.0,
    variational_recurrent=False,
    input_size=None,
    dtype=None,
    seed=None
)
For example:
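# Build one wrapped cell per layer; [cell] * num_layers would repeat the same cell object.
def make_cell():
    cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
    return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=0.5)
cell = tf.nn.rnn_cell.MultiRNNCell([make_cell() for _ in range(num_layers)], state_is_tuple=True)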
The only thing I know is that it is used for dropout during training. Here are my three questions:
- What exactly are input_keep_prob, output_keep_prob and state_keep_prob? I guess they set the keep probability (one minus the dropout probability) for each part of the RNN, but where exactly is each one applied?
- Is dropout in this context applied not only during training but also during prediction? If so, is there a way to choose whether or not dropout is used at prediction time?
- According to the TensorFlow API documentation, if variational_recurrent=True, dropout follows the method in the paper Y. Gal, Z. Ghahramani, "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks", https://arxiv.org/abs/1512.05287. I understand this paper only roughly. When I train an RNN, I use a batch rather than a single time series. In this case, does TensorFlow automatically assign a different dropout mask to each time series in the batch?
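One way to picture what these questions are asking is the minimal plain-Python sketch below. It is not TensorFlow's implementation: toy_cell, make_mask, apply_mask and run_sequence are hypothetical names, the recurrence is a stand-in for a real cell, and the mask placement is only an assumption read off the three parameter names. It handles a single sequence, so it says nothing about how masks are shared across a batch.

```python
import random


def make_mask(size, keep_prob, rng):
    """Inverted-dropout mask: 1/keep_prob where a unit is kept, 0.0 where dropped."""
    return [(1.0 / keep_prob) if rng.random() < keep_prob else 0.0
            for _ in range(size)]


def apply_mask(vec, mask):
    """Element-wise product of a vector and a mask."""
    return [v * m for v, m in zip(vec, mask)]


def toy_cell(x, h):
    """Hypothetical stand-in for an RNN cell; returns (output, new_state)."""
    new_h = [0.5 * hi + 0.5 * xi for hi, xi in zip(h, x)]
    return new_h, new_h


def run_sequence(inputs, input_keep_prob=1.0, output_keep_prob=1.0,
                 state_keep_prob=1.0, variational_recurrent=False, seed=0):
    """Run toy_cell over one sequence, recording the masks used at each step."""
    rng = random.Random(seed)
    size = len(inputs[0])  # assumption: input size equals state size
    probs = {"in": input_keep_prob, "out": output_keep_prob,
             "state": state_keep_prob}

    def sample_masks():
        return {name: make_mask(size, p, rng) for name, p in probs.items()}

    # Variational variant (assumed): sample the masks once, reuse at every step.
    fixed = sample_masks() if variational_recurrent else None

    h = [0.0] * size
    outputs, masks_used = [], []
    for x in inputs:
        masks = fixed if fixed is not None else sample_masks()
        x = apply_mask(x, masks["in"])        # input_keep_prob: cell input
        out, h = toy_cell(x, h)
        out = apply_mask(out, masks["out"])   # output_keep_prob: cell output
        h = apply_mask(h, masks["state"])     # state_keep_prob: state carried on
        outputs.append(out)
        masks_used.append(masks)
    return outputs, masks_used
```

With every keep probability left at 1.0 the masks are all ones and the loop reduces to the plain recurrence, which is the behaviour one would want at prediction time. Whether the real wrapper needs the keep probabilities to be switched to 1.0 by hand at prediction time is exactly what the second question asks.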