What is the connection between two stacked LSTM layers?
This question is similar to What's the input of each LSTM layer in a stacked LSTM network?, but goes further into implementation details.

For simplicity, consider a structure of 4 units followed by 2 units, like the following:

model.add(LSTM(4, input_shape=input_shape, return_sequences=True))
model.add(LSTM(2))  # input_shape is only needed on the first layer

So I know the output of LSTM_1 is of length 4, but how do the next 2 units handle these 4 inputs? Are they fully connected to the nodes of the next layer?

I guess they are fully connected, like in the following figure, but I am not sure; it is not stated in the Keras documentation.

LSTM Connections

Thanks!

Puke asked 16/5, 2020 at 23:29
Does this answer your question? Understanding Keras LSTMs (Hambrick)
@ZabirAlNazi Thanks, I checked it, but it still does not explain the stacked case. It says "You can, of course, stack many layers on top of each other, not necessarily all following the same pattern, and create your own models." That is exactly what I am interested in: how the results are passed between layers. Thanks! (Puke)

It's not length 4, it's 4 "features".

The length comes from the input shape and it never changes; there is absolutely no difference between feeding a regular input to an LSTM and feeding the output of one LSTM into another LSTM.

You can just look at the model's summary to see the shapes and understand what is going on. You never change the length using LSTMs.
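As a concrete illustration of the summary, here is a minimal sketch, assuming TensorFlow/Keras is installed and a hypothetical input of 10 timesteps with 3 features per step (the thread's `input_shape` is not specified, so these numbers are made up):

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import LSTM

# Hypothetical input: 10 timesteps, 3 features each.
model = Sequential([
    Input(shape=(10, 3)),
    LSTM(4, return_sequences=True),  # output shape: (None, 10, 4) -- length 10 is unchanged
    LSTM(2),                         # output shape: (None, 2) -- last step only
])
model.summary()
```

Note that the time dimension (10) survives the first LSTM untouched; only the feature dimension changes from 3 to 4, and the second LSTM then collapses the sequence to its final state of 2 features because `return_sequences` defaults to False.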

They don't communicate at all. Each one takes the length dimension and processes it recurrently, independently of the other. When one finishes and outputs a tensor, the next one takes that tensor and processes it alone, following the same rules.
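To make "the next one gets the tensor and processes it alone" concrete, here is a minimal NumPy sketch (a toy forward pass with random stand-in weights, not Keras internals; gate naming and ordering are simplified). The second layer treats the first layer's output sequence exactly like any input: each timestep's 4 features are mixed through the input kernel `W`, which is a fully connected projection inside every gate.

```python
import numpy as np

def lstm_forward(x, units, rng):
    """Toy LSTM over x of shape (timesteps, features); returns all hidden states."""
    t_steps, feats = x.shape
    W = rng.standard_normal((feats, 4 * units))  # input kernel: fully connected per gate
    U = rng.standard_normal((units, 4 * units))  # recurrent kernel
    b = np.zeros(4 * units)
    h = np.zeros(units)   # hidden state, starts at zero
    c = np.zeros(units)   # cell state, starts at zero
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    outputs = []
    for t in range(t_steps):
        z = x[t] @ W + h @ U + b          # one affine map covering all four gates
        i, f, g, o = np.split(z, 4)       # input, forget, candidate, output pre-activations
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)              # (timesteps, units)

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 3))          # 10 steps, 3 features (hypothetical input)
seq1 = lstm_forward(x, units=4, rng=rng)  # (10, 4) -- like return_sequences=True
seq2 = lstm_forward(seq1, units=2, rng=rng)  # (10, 2) -- layer 2 runs on its own
last = seq2[-1]                           # (2,) -- what Keras returns by default
```

The key point is that `lstm_forward` is called twice with no shared state: layer 2 starts from its own zero `h` and `c`, and the only thing it receives from layer 1 is the output tensor `seq1`.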

Urogenital answered 17/5, 2020 at 17:00
Thanks, this helps! It is features, not time steps. (Puke)
Can you please answer this question of mine: https://mcmap.net/q/83096/-steps_per_epoch-and-validation_steps-for-infinite-dataset-in-keras-model/7344164 (Austerlitz)
Thanks! As a follow-up question: what if you stack several LSTM layers followed by other layers that reshape the data? Then, thinking about a time series, how does the previous state (which has now been reshaped as part of the sequence) cycle back to the start of the LSTM chain for the next timestep? Do the states cycle internally within each of the LSTM cells? (Driftwood)
@mfgeng, each LSTM is self-contained; the states are not outputs and are not affected by other layers. (Among)
@DanielMöller, what will be the input to the second LSTM layer if no communication is happening? Suppose I have input_data of shape (1, 3), i.e. 1 timestep and 3 features, the first LSTM layer has 2 units, and the second has 5 units. If the hidden state from the 1st layer does not go to the next layer, what does? Does the output of the 1st layer (not the hidden state) go as input_data to the 2nd layer, which has its own hidden state of dim 5 (initialized to zeros)? I am trying to understand LSTMs. (Fatma)
@DanielMöller, could you please answer this question: [#77427727 (Fatma)
