Many-to-one and many-to-many LSTM examples in Keras

I am trying to understand LSTMs and how to build them with Keras. I found out that there are principally four modes to run an RNN (the four rightmost ones in the picture):

[Image: RNN sequence-processing modes. Image source: Andrej Karpathy]

Now I wonder what a minimalistic code snippet for each of them would look like in Keras. So something like

model = Sequential()
model.add(LSTM(128, input_shape=(timesteps, data_dim)))
model.add(Dense(1))

for each of the 4 tasks, maybe with a little bit of explanation.

Wartburg answered 26/3, 2017 at 21:47 Comment(1)
For the diagram of the one-to-many architecture, the RNN units to the right of the first x input also require inputs. These can typically be set to the outputs (o or y) of the previous unit, or to a default zero vector. – Geothermal

So:

  1. One-to-one: you could use a Dense layer as you are not processing sequences:

    model.add(Dense(output_size, input_shape=input_shape))
    
  2. One-to-many: this option is not well supported, as chaining models is not easy in Keras, so the following version is the easiest one:

    model.add(RepeatVector(number_of_times, input_shape=input_shape))
    model.add(LSTM(output_size, return_sequences=True))
    
  3. Many-to-one: actually, your code snippet is (almost) an example of this approach:

    model = Sequential()
    model.add(LSTM(1, input_shape=(timesteps, data_dim)))
    
  4. Many-to-many: this is the easiest snippet, when the length of the input and output matches the number of recurrent steps (see the consolidated sketch after this list):

    model = Sequential()
    model.add(LSTM(1, input_shape=(timesteps, data_dim), return_sequences=True))
    
  5. Many-to-many when the number of steps differs from the input/output length: this is freakishly hard in Keras. There are no easy code snippets for it.

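For a quick sanity check of points 2-4, here is a consolidated, runnable sketch; the sizes timesteps = 6, data_dim = 8 and output_size = 4 are toy values assumed purely for illustration:

from keras.models import Sequential
from keras.layers import LSTM, RepeatVector

timesteps, data_dim, output_size = 6, 8, 4  # toy sizes (assumptions)

# 2. one-to-many: repeat a single input vector, then unroll an LSTM over it
one_to_many = Sequential()
one_to_many.add(RepeatVector(timesteps, input_shape=(data_dim,)))
one_to_many.add(LSTM(output_size, return_sequences=True))
print(one_to_many.output_shape)   # (None, 6, 4)

# 3. many-to-one: only the last hidden state is returned
many_to_one = Sequential()
many_to_one.add(LSTM(output_size, input_shape=(timesteps, data_dim)))
print(many_to_one.output_shape)   # (None, 4)

# 4. many-to-many (equal lengths): one output per input step
many_to_many = Sequential()
many_to_many.add(LSTM(output_size, input_shape=(timesteps, data_dim),
                      return_sequences=True))
print(many_to_many.output_shape)  # (None, 6, 4)
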
EDIT: Regarding point 5

In one of my recent applications, we implemented something similar to the many-to-many case from the fourth image. If you want a network with the following architecture (where the input is longer than the output):

                                        O O O
                                        | | |
                                  O O O O O O
                                  | | | | | | 
                                  O O O O O O

You could achieve this in the following manner:

from keras.models import Sequential
from keras.layers import LSTM, Lambda

model = Sequential()
model.add(LSTM(1, input_shape=(timesteps, data_dim), return_sequences=True))
model.add(Lambda(lambda x: x[:, -N:, :]))  # select only the last N steps of the output

where N is the number of final steps you want to keep (in the image, N = 3).
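
To see the slicing at work, here is a quick shape check; the values timesteps = 6, data_dim = 2 and N = 3 are assumptions for illustration (with the TensorFlow backend the Lambda output shape is inferred automatically):

from keras.models import Sequential
from keras.layers import LSTM, Lambda

timesteps, data_dim, N = 6, 2, 3  # toy values; N = number of trailing steps to keep

model = Sequential()
model.add(LSTM(1, input_shape=(timesteps, data_dim), return_sequences=True))
model.add(Lambda(lambda x: x[:, -N:, :]))  # keep only the last N time steps

print(model.output_shape)  # (None, 3, 1): N steps, one unit each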

From this point getting to:

                                        O O O
                                        | | |
                                  O O O O O O
                                  | | | 
                                  O O O 

is as simple as artificially padding the input sequence with e.g. zero vectors, in order to bring it up to the appropriate length.
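
A minimal sketch of that padding step in plain NumPy, reusing the toy sizes assumed above:

import numpy as np

timesteps, data_dim, N = 6, 2, 3           # same toy sizes as above
x = np.random.random((10, N, data_dim))    # 10 samples with only N real time steps

# append zero vectors so every sample reaches the full `timesteps` length
pad = np.zeros((x.shape[0], timesteps - N, data_dim))
x_padded = np.concatenate([x, pad], axis=1)
print(x_padded.shape)                      # (10, 6, 2)

The real steps come first and the zeros are appended, so the last-N slice in the model above lines up with the diagram.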

Arundinaceous answered 27/3, 2017 at 13:19 Comment(12)
One clarification: for many-to-one, for example, you use LSTM(1, input_shape=(timesteps, data_dim)). I thought the 1 stood for the number of LSTM cells/hidden nodes, but apparently not. How would you then code a many-to-one with, let's say, 512 nodes? (Because I read something similar, I thought it would be done with model.add(LSTM(512, input_shape=...)) followed by model.add(Dense(1)); what is that used for then?) – Wartburg
In this case, your code (after correcting a typo) should be OK. – Flita
Why do we use the RepeatVector, and not a vector with the first entry equal to x and all the other entries equal to 0? (According to the picture above, there is no input at all at the later states, and not always the same input, which is what RepeatVector would do, in my understanding.) – Wartburg
If you think carefully about this picture, it's only a conceptual presentation of the idea of one-to-many. All of these hidden units must accept something as input. So they might accept an input whose first step equals x and whose remaining steps equal 0, but, on the other hand, they might just as well accept the same x repeated many times. A different approach is to chain models, which is hard in Keras. The option I provided is the easiest case of a one-to-many architecture in Keras. – Flita
Nice! I am thinking about using an N-to-N LSTM in a GAN architecture. I will have an LSTM-based generator. I will give this generator (as the "latent variable" is used in GANs) the first half of the time series, and the generator will produce the second half. Then I will combine the two halves (real and generated) to produce the "fake" input for the GAN. Do you think using point 4 of your solution will work? Or, in other words, is solution 4 the right way to do this? – Devotion
@MarcinMożejko In your one-to-many scenario, how are you connecting the RepeatVector with the LSTM layer? How should I set the value of number_of_times in RepeatVector? Doesn't Keras by itself find the number of time steps required to build the model, and then repeat the input vector that many times? – Sinker
@MarcinMożejko So you don't need to explicitly tell the model the length of the output sequence? You just use return_sequences=True and it infers the rest? – Dragonnade
In the many-to-many example, there are two cases (in the OP): one with and one without an "offset". How do the models compare in a sample implementation? What is the difference when implementing them? – Clement
How do you do the second many-to-many? Do you put a mask on the last timesteps? – Dermatologist
How is your many-to-one different from many-to-many? I mean, what difference does adding return_sequences=True make? – Analyst
Could you help with how to correctly feed multidimensional data to a many-to-many or autoencoder model? Let's say we have a total data set stored in an array with shape (45000, 100, 6) = (Nsample, Ntimesteps, Nfeatures), i.e. 45000 samples with 100 time steps and 6 features. – Senhor
In many-to-many with unequal input and output lengths, can we use number 4 with padding? – Mangosteen

Great answer by @Marcin Możejko.

I would add the following to No. 5 (many-to-many with different input/output lengths):

A) as Vanilla LSTM

from keras.models import Sequential
from keras.layers import Dense, LSTM

model = Sequential()
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
model.add(Dense(N_OUTPUTS))

B) as Encoder-Decoder LSTM

from keras.layers import Activation, RepeatVector, TimeDistributed

model = Sequential()
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
model.add(RepeatVector(N_OUTPUTS))
model.add(LSTM(N_BLOCKS, return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))
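
Since the roles of RepeatVector and TimeDistributed can be confusing, here is the same encoder-decoder sketch with the intermediate shapes spelled out; the sizes N_INPUTS = 100, N_FEATURES = 6, N_OUTPUTS = 10 and N_BLOCKS = 64 are toy values assumed for illustration:

from keras.models import Sequential
from keras.layers import Dense, LSTM, RepeatVector, TimeDistributed

N_INPUTS, N_FEATURES, N_OUTPUTS, N_BLOCKS = 100, 6, 10, 64  # toy sizes (assumptions)

model = Sequential()
# encoder: compresses the whole input sequence into a single state vector
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))  # -> (None, 64)
# RepeatVector copies that vector once per desired output step
model.add(RepeatVector(N_OUTPUTS))                             # -> (None, 10, 64)
# decoder: unrolls over the repeated vector, emitting one state per step
model.add(LSTM(N_BLOCKS, return_sequences=True))               # -> (None, 10, 64)
# TimeDistributed applies the same Dense layer at every time step
model.add(TimeDistributed(Dense(1)))                           # -> (None, 10, 1)
model.summary()

Note that the two variants expect differently shaped targets: A needs y of shape (batch, N_OUTPUTS), while B needs (batch, N_OUTPUTS, 1).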
Lavernalaverne answered 9/7, 2019 at 7:43 Comment(2)
Could you please explain the details of the B) encoder-decoder LSTM architecture? I'm having issues understanding the roles of the RepeatVector / TimeDistributed steps. – Samale
Could you please help with how to correctly feed multidimensional data to a many-to-many or encoder-decoder model? I'm mostly struggling with the shapes. Let's say we have a total data set stored in an array with shape (45000, 100, 6) = (Nsample, Ntimesteps, Nfeatures), i.e. 45000 samples with 100 time steps and 6 features. – Senhor
