What is the connection between two stacked LSTM layers?
This question is similar to What's the input of each LSTM layer in a stacked LSTM network?, but goes further into implementation details.

For simplicity, consider a structure of 4 units followed by 2 units, like the following:

model.add(LSTM(4, input_shape=input_shape, return_sequences=True))
model.add(LSTM(2))  # input_shape is only needed on the first layer

So I know the output of LSTM_1 is of length 4, but how do the next 2 units handle these 4 inputs? Are they fully connected to the nodes of the next layer?

I guess they are fully connected, like in the following figure, but I am not sure; it is not stated in the Keras documentation.

LSTM Connections

Thanks!

Puke asked 16/5, 2020 at 23:29
Does this answer your question? Understanding Keras LSTMs (Hambrick)
@ZabirAlNazi Thanks, I checked it, but it still does not explain the stacked case. It says "You can, of course, stack many layers on top of each other, not necessarily all following the same pattern, and create your own models." That is exactly what I am interested in: how the results are passed between layers. Thanks! (Puke)

It's not length 4, it's 4 "features".

The length comes from the input shape and it never changes; there is absolutely no difference between feeding a regular input to an LSTM and feeding the output of one LSTM into another LSTM.

You can just look at the model's summary to see the shapes and understand what is going on. You never change the length using LSTMs.
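As a concrete illustration of the summary, here is a minimal sketch, assuming TensorFlow/Keras is installed and a hypothetical input of 10 timesteps with 3 features per step (the thread's `input_shape` is not specified, so these numbers are made up):

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import LSTM

# Hypothetical input: 10 timesteps, 3 features each.
model = Sequential([
    Input(shape=(10, 3)),
    LSTM(4, return_sequences=True),  # output shape: (None, 10, 4) -- length 10 is unchanged
    LSTM(2),                         # output shape: (None, 2) -- last step only
])
model.summary()
```

Note that the time dimension (10) survives the first LSTM untouched; only the feature dimension changes from 3 to 4, and the second LSTM then collapses the sequence to its final state of 2 features because `return_sequences` defaults to False.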

They don't communicate at all. Each one takes the length dimension and processes it recurrently, independently of the other. When one finishes and outputs a tensor, the next one takes that tensor and processes it alone, following the same rules.
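To make "the next one gets the tensor and processes it alone" concrete, here is a minimal NumPy sketch (a toy forward pass with random stand-in weights, not Keras internals; gate naming and ordering are simplified). The second layer treats the first layer's output sequence exactly like any input: each timestep's 4 features are mixed through the input kernel `W`, which is a fully connected projection inside every gate.

```python
import numpy as np

def lstm_forward(x, units, rng):
    """Toy LSTM over x of shape (timesteps, features); returns all hidden states."""
    t_steps, feats = x.shape
    W = rng.standard_normal((feats, 4 * units))  # input kernel: fully connected per gate
    U = rng.standard_normal((units, 4 * units))  # recurrent kernel
    b = np.zeros(4 * units)
    h = np.zeros(units)   # hidden state, starts at zero
    c = np.zeros(units)   # cell state, starts at zero
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    outputs = []
    for t in range(t_steps):
        z = x[t] @ W + h @ U + b          # one affine map covering all four gates
        i, f, g, o = np.split(z, 4)       # input, forget, candidate, output pre-activations
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)              # (timesteps, units)

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 3))          # 10 steps, 3 features (hypothetical input)
seq1 = lstm_forward(x, units=4, rng=rng)  # (10, 4) -- like return_sequences=True
seq2 = lstm_forward(seq1, units=2, rng=rng)  # (10, 2) -- layer 2 runs on its own
last = seq2[-1]                           # (2,) -- what Keras returns by default
```

The key point is that `lstm_forward` is called twice with no shared state: layer 2 starts from its own zero `h` and `c`, and the only thing it receives from layer 1 is the output tensor `seq1`.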

Urogenital answered 17/5, 2020 at 17:00
Thanks, this helps! It is features, not time steps. (Puke)
Can you please answer this question of mine: https://mcmap.net/q/83096/-steps_per_epoch-and-validation_steps-for-infinite-dataset-in-keras-model/7344164 (Austerlitz)
Thanks! As a follow-up question: what if you stack several LSTM layers followed by other layers that reshape the data? Then, thinking about a time series, how does the previous state (which has now been reshaped as part of the sequence) cycle back to the start of the LSTM chain for the next timestep? Do the states cycle internally within each of the LSTM cells? (Driftwood)
@mfgeng, each LSTM is self-contained; the states are not outputs and are not affected by other layers. (Among)
@DanielMöller, what will be the input to the second LSTM layer if no communication is happening? Suppose I have input_data of shape (1, 3), i.e. 1 timestep and 3 features, the first LSTM layer has 2 units, and the second has 5 units. If the hidden state from the 1st layer does not go to the next layer, what does? Does the output of the 1st layer (not the hidden state) go as input_data to the 2nd layer, which has its own hidden state of dim 5 (initialized to zeros)? I am trying to understand LSTMs. (Fatma)
@DanielMöller, could you please answer this question: [#77427727 (Fatma)
