Keras functional API: Combine a CNN model with an RNN to look at sequences of images

I was stuck on how to combine a CNN with an RNN in Keras. While I was posting the question, someone pointed out that my approach was the correct way to tackle the problem. Apparently I had just overlooked something in the original code, so I ended up answering my own question.

The original problem is as follows:

How do you create a model in Keras that takes sequences of images as input, with a CNN 'looking' at each individual image and the sequence of CNN outputs being fed into an RNN?

To make it clearer:

Model one: a CNN that looks at single images.
Model two: an RNN that looks at the sequence of outputs produced by the CNN from model one.

So for example the CNN should see 5 images and this sequence of 5 outputs from the CNN should be passed on to the RNN.

The input data is in the following format:
(number_of_images, width, height, channels) = (4000, 120, 60, 1)
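Since the model consumes sequences of 5 images, the flat array has to be regrouped first. Here is a minimal sketch of one way to do that with NumPy, assuming consecutive images belong to the same sequence and sequences do not overlap. `to_sequences` is a hypothetical helper name, and the zero-filled array only stands in for real data:

```python
import numpy as np


def to_sequences(images, seq_len):
    """Group consecutive images into non-overlapping sequences of seq_len."""
    n_images = images.shape[0]
    n_sequences = n_images // seq_len
    # Drop trailing images that do not fill a complete sequence.
    usable = images[: n_sequences * seq_len]
    return usable.reshape((n_sequences, seq_len) + images.shape[1:])


# Placeholder data with the shape from the question (not real images).
images = np.zeros((4000, 120, 60, 1), dtype=np.uint8)
sequences = to_sequences(images, seq_len=5)
print(sequences.shape)  # (800, 5, 120, 60, 1)
```

If the sequences should overlap (a sliding window), a plain reshape is not enough and the windows would have to be built explicitly.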

The answer to this question is as follows.

Take this oversimplified CNN model:

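from keras.models import Sequential, Model
from keras.layers import Input, Conv2D, Flatten, GRU, Dense, TimeDistributed

cnn = Sequential()
# padding='same' keeps the 120x60 frame size; with the default 'valid' padding
# a 50x50 kernel leaves only 71x11, which is too small for the 40x40 kernel below.
cnn.add(Conv2D(16, (50, 50), padding='same', input_shape=(120, 60, 1)))
cnn.add(Conv2D(16, (40, 40), padding='same'))
cnn.add(Flatten())  # Flatten each image's feature maps into one vector for the RNN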
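It can help to check the kernel sizes against the input before stacking convolutions. With stride 1 and no padding (Keras' default `padding='valid'`), each axis shrinks from n to n - k + 1 for a kernel of size k, while `padding='same'` keeps it at n. Here is a small sketch of that arithmetic in plain Python. `conv_output_size` is a hypothetical helper, not a Keras function:

```python
def conv_output_size(size, kernel, padding="valid"):
    """Output length along one axis for a stride-1 convolution."""
    if padding == "same":
        return size
    if padding == "valid":
        return size - kernel + 1
    raise ValueError(f"unknown padding: {padding!r}")


# A 120x60 input through a 50x50 kernel, then a 40x40 kernel, both 'valid'.
after_first = (conv_output_size(120, 50), conv_output_size(60, 50))
after_second = tuple(conv_output_size(s, 40) for s in after_first)
print(after_first)   # (71, 11)
print(after_second)  # (32, -28): a non-positive size means the kernel does not fit
```

A non-positive result means the layer cannot be built with those settings. A smaller kernel or `padding='same'` avoids that.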

Then there is this simple RNN model:

rnn = Sequential()
rnn.add(GRU(64, return_sequences=False))  # input shape is inferred from the CNN output

Which should be connected to a dense network:

dense = Sequential()
dense.add(Dense(128))
dense.add(Dense(64))

dense.add(Dense(1)) # Model output

Note that activation functions and other details have been left out for readability.

Now all that is left is combining these 3 main models.

main_input = Input(shape=(5, 120, 60, 1))  # Data has been reshaped to (800, 5, 120, 60, 1)

model = TimeDistributed(cnn)(main_input)  # applies the CNN to each of the 5 images in a sequence
model = rnn(model)  # feed the sequence of CNN outputs into the RNN
model = dense(model)  # add the dense network
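Conceptually, `TimeDistributed` applies the same wrapped model, with the same weights, to every time step of the input. Here is a rough NumPy sketch of that idea, assuming a stand-in `fake_cnn` function (hypothetical, it only reduces each image to 3 summary values) in place of a real CNN. It merges the batch and time axes, runs the per-image function once, then splits the axes again:

```python
import numpy as np


def fake_cnn(images):
    """Stand-in for a CNN: maps (n, 120, 60, 1) images to (n, 3) feature vectors."""
    flat = images.reshape(images.shape[0], -1)
    return np.stack([flat.mean(axis=1), flat.min(axis=1), flat.max(axis=1)], axis=1)


def time_distributed(fn, batch):
    """Apply fn to every time step of a (batch, time, ...) array."""
    n_batch, n_time = batch.shape[:2]
    merged = batch.reshape((n_batch * n_time,) + batch.shape[2:])
    features = fn(merged)  # (batch * time, n_features)
    return features.reshape((n_batch, n_time) + features.shape[1:])


rng = np.random.default_rng(0)
batch = rng.random((2, 5, 120, 60, 1))  # 2 sequences of 5 images each
out = time_distributed(fake_cnn, batch)
print(out.shape)  # (2, 5, 3)
```

The result has one feature vector per image, which is the (batch, timesteps, features) layout a recurrent layer such as GRU expects.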

Then finally

final_model = Model(inputs=main_input, outputs=model)

final_model.compile...
final_model.fit...
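For reference, here is a self-contained sketch of the whole pipeline that runs on random data. It is not the original model: the small strided kernels, the ReLU activations, the `adam` optimizer, the `mse` loss, and the dummy arrays are all assumptions chosen only to keep the example quick to run. Only the overall structure (TimeDistributed CNN, then GRU, then dense layers) follows the post.

```python
import numpy as np
import keras
from keras import layers

SEQ_LEN = 5  # images per sequence, as in the question

# Per-image CNN (kernel sizes and strides are assumptions, not the original values).
cnn = keras.Sequential([
    keras.Input(shape=(120, 60, 1)),
    layers.Conv2D(8, (5, 5), strides=(4, 4), activation="relu"),
    layers.Conv2D(8, (3, 3), strides=(2, 2), activation="relu"),
    layers.Flatten(),
])

main_input = keras.Input(shape=(SEQ_LEN, 120, 60, 1))
x = layers.TimeDistributed(cnn)(main_input)    # (batch, 5, features)
x = layers.GRU(64, return_sequences=False)(x)  # (batch, 64)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dense(64, activation="relu")(x)
output = layers.Dense(1)(x)                    # model output

final_model = keras.Model(inputs=main_input, outputs=output)
final_model.compile(optimizer="adam", loss="mse")  # assumed settings

# Random stand-in data: 8 sequences of 5 images, one target value each.
x_dummy = np.random.rand(8, SEQ_LEN, 120, 60, 1).astype("float32")
y_dummy = np.random.rand(8, 1).astype("float32")
final_model.fit(x_dummy, y_dummy, batch_size=4, epochs=1, verbose=0)
```

Calling `final_model.summary()` afterwards shows how the sequence axis is kept through the `TimeDistributed` layer and dropped by the GRU.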
