This is the API I am looking at: https://pytorch.org/docs/stable/nn.html#gru
It outputs:
output of shape (seq_len, batch, num_directions * hidden_size)
h_n of shape (num_layers * num_directions, batch, hidden_size)
For a GRU with more than one layer, how do I fetch the hidden state of the last layer? Should it be h_n[0] or h_n[-1]?
And if the GRU is bidirectional, how should I slice h_n to obtain the last layer's hidden states for both directions?
h_n[-1]. Just confirmed it myself. – Armenian
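
A minimal sketch for checking this yourself, assuming the default batch_first=False layout and that h_n is ordered by layer first and direction second, as the PyTorch docs describe. The sizes and variable names are made up for illustration. Note that for a bidirectional GRU, h_n[-1] is only the backward direction of the last layer; the forward direction is h_n[-2].

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not from the question)
seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 4, 6, 2
x = torch.randn(seq_len, batch, input_size)

# Unidirectional multi-layer GRU: the last layer's final state is h_n[-1]
gru = nn.GRU(input_size, hidden_size, num_layers=num_layers)
output, h_n = gru(x)
last_layer_hidden = h_n[-1]  # shape (batch, hidden_size)
# `output` holds the last layer's state at every time step,
# so its final time step should equal h_n[-1]
uni_match = torch.allclose(last_layer_hidden, output[-1])

# Bidirectional multi-layer GRU: separate the layer and direction axes
bigru = nn.GRU(input_size, hidden_size, num_layers=num_layers, bidirectional=True)
output_bi, h_n_bi = bigru(x)
h_n_split = h_n_bi.reshape(num_layers, 2, batch, hidden_size)
forward_last = h_n_split[-1, 0]   # same tensor values as h_n_bi[-2]
backward_last = h_n_split[-1, 1]  # same tensor values as h_n_bi[-1]

# The forward direction finishes at the last time step,
# the backward direction finishes at the first time step
fwd_match = torch.allclose(forward_last, output_bi[-1, :, :hidden_size])
bwd_match = torch.allclose(backward_last, output_bi[0, :, hidden_size:])

# Both directions of the last layer, concatenated per batch element
last_both = torch.cat([forward_last, backward_last], dim=1)  # (batch, 2 * hidden_size)

print(uni_match, fwd_match, bwd_match, last_both.shape)
```

If all three checks print True, then h_n[-1] is the right slice for the unidirectional case, and h_n[-2:] (forward, then backward) covers the last layer of the bidirectional case.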