In Keras, the Bidirectional wrapper for RNNs also supports stateful=True. I don't really understand how this is supposed to work:
In a stateful unidirectional model, the final state of one batch is carried over as the initial state of the next batch. I guess it works the same way for the forward layer in the bidirectional model.
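To make concrete what I mean by carrying state over, here is a minimal sketch in plain Python. It is not Keras code: `rnn_step`, its weights `w` and `u`, and `run_forward` are hypothetical stand-ins for a real recurrent cell. It only shows that a forward pass over two consecutive batches matches a pass over the whole sequence when the final state of the first batch is reused as the initial state of the second.

```python
import math

def rnn_step(x, h, w=0.5, u=0.8):
    # Toy scalar recurrent cell with made-up weights (not Keras internals).
    return math.tanh(w * x + u * h)

def run_forward(xs, h0=0.0):
    # Process the inputs left to right and return the final state.
    h = h0
    for x in xs:
        h = rnn_step(x, h)
    return h

sequence = [0.1, -0.3, 0.7, 0.2, -0.5, 0.4]
batch_1, batch_2 = sequence[:3], sequence[3:]

# Whole sequence in one go.
h_full = run_forward(sequence)

# Stateful-style processing: the state left after batch_1 seeds batch_2.
h_after_1 = run_forward(batch_1)
h_chunked = run_forward(batch_2, h0=h_after_1)

# Without carry-over, batch_2 starts from a zero state instead.
h_reset = run_forward(batch_2)
```

Here `h_chunked` agrees with `h_full`, while `h_reset` does not, which is the behaviour I expect from the forward direction.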
But where is the backward layer getting its states from? If I understand everything correctly, it should technically receive its state from the "next" batch. But obviously the "next" batch has not been computed yet, so how does it work?
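The following sketch illustrates why I think the backward direction is a problem. Again, this is plain Python with a hypothetical toy cell (`rnn_step`, `run_backward` and the weights are my own stand-ins), and it says nothing about what Keras actually does internally. It compares a backward pass over the whole sequence with two ways of chunking it: seeding the first batch with the state from the "next" batch, and naively carrying the state over in the order the batches arrive.

```python
import math

def rnn_step(x, h, w=0.5, u=0.8):
    # Toy scalar recurrent cell with made-up weights (not Keras internals).
    return math.tanh(w * x + u * h)

def run_backward(xs, h0=0.0):
    # Process the inputs right to left and return the final state.
    h = h0
    for x in reversed(xs):
        h = rnn_step(x, h)
    return h

sequence = [0.1, -0.3, 0.7, 0.2, -0.5, 0.4]
batch_1, batch_2 = sequence[:3], sequence[3:]

# Backward pass over the whole sequence.
h_full = run_backward(sequence)

# What a true backward pass needs: batch_1 is seeded by the "next" batch.
h_from_next = run_backward(batch_1, h0=run_backward(batch_2))

# Naive carry-over in arrival order: the state left after batch_1 seeds batch_2.
h_naive = run_backward(batch_2, h0=run_backward(batch_1))
```

Only `h_from_next` reproduces `h_full`; the arrival-order carry-over in `h_naive` gives a different result, so I don't see which state a stateful backward layer could sensibly reuse.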