LSTM Followed by Mean Pooling

I'm using Keras 1.0. My problem is identical to this one (How to implement a Mean Pooling layer in Keras), but the answer there does not seem to be sufficient for me.

I want to implement this network: [image: Embedding → LSTM → mean pooling over time → Dense output]

The following code does not work:

from keras.layers import Input, Embedding, LSTM, AveragePooling1D, Dense

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
pool = AveragePooling1D()(lstm)
output = Dense(1, activation='sigmoid')(pool)

If I don't set return_sequences=True, I get this error when I call AveragePooling1D():

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/PATH/keras/engine/topology.py", line 462, in __call__
    self.assert_input_compatibility(x)
  File "/PATH/keras/engine/topology.py", line 382, in assert_input_compatibility
    str(K.ndim(x)))
Exception: ('Input 0 is incompatible with layer averagepooling1d_6: expected ndim=3', ' found ndim=2')

Otherwise, I get this error when I call Dense():

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/PATH/keras/engine/topology.py", line 456, in __call__
    self.build(input_shapes[0])
  File "/fs/clip-arqat/mossaab/trec/liveqa/cmu/venv/lib/python2.7/site-packages/keras/layers/core.py", line 512, in build
    assert len(input_shape) == 2
AssertionError
Contumelious answered 5/4, 2016 at 13:50

Adding TimeDistributed(Dense(1)) helped:

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
distributed = TimeDistributed(Dense(1))(lstm)
pool = AveragePooling1D()(distributed)
output = Dense(1, activation='sigmoid')(pool)
Contumelious answered 24/6, 2016 at 14:37

I just attempted to implement the same model as the original poster, using Keras 2.0.3. Mean pooling after the LSTM worked when I used GlobalAveragePooling1D; just make sure return_sequences=True is set in the LSTM layer. Give it a try!
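
For example, a minimal sketch of that approach, reusing the names from the question (the hyperparameter values below are placeholders of my own) and assuming Keras 2.x:

from keras.layers import Input, Embedding, LSTM, GlobalAveragePooling1D, Dense
from keras.models import Model

# example placeholder values for the question's hyperparameters
max_sent_len, vocab_size, word_embedding_size, hidden_state_size = 50, 10000, 100, 64

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)  # keep per-timestep outputs
pool = GlobalAveragePooling1D()(lstm)  # (batch, timesteps, features) -> (batch, features)
output = Dense(1, activation='sigmoid')(pool)
model = Model(sequence, output)
model.compile(loss='binary_crossentropy', optimizer='adam')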

Sumptuary answered 20/4, 2017 at 22:51

I think the accepted answer is basically wrong. A solution was posted at https://github.com/fchollet/keras/issues/2151, but it only works with the Theano backend. I have modified the code so that it supports both Theano and TensorFlow.

from keras.engine.topology import Layer, InputSpec
from keras import backend as T

class TemporalMeanPooling(Layer):
    """
    This is a custom Keras layer. This pooling layer accepts the temporal
    sequence output by a recurrent layer and performs temporal pooling,
    looking at only the non-masked portion of the sequence. The pooling
    layer converts the entire variable-length hidden vector sequence
    into a single hidden vector, and then feeds its output to the Dense
    layer.

    input shape: (nb_samples, nb_timesteps, nb_features)
    output shape: (nb_samples, nb_features)
    """
    def __init__(self, **kwargs):
        super(TemporalMeanPooling, self).__init__(**kwargs)
        self.supports_masking = True
        self.input_spec = [InputSpec(ndim=3)]

    def get_output_shape_for(self, input_shape):
        return (input_shape[0], input_shape[2])

    def call(self, x, mask=None):  # mask: (nb_samples, nb_timesteps)
        if mask is None:
            mask = T.mean(T.ones_like(x), axis=-1)
        ssum = T.sum(x, axis=-2)  # (nb_samples, nb_features)
        mask = T.cast(mask, T.floatx())
        rcnt = T.sum(mask, axis=-1, keepdims=True)  # (nb_samples, 1)
        return ssum / rcnt

    def compute_mask(self, input, mask):
        return None
Calorimeter answered 28/11, 2016 at 22:50

Thanks, I ran into the same problem. I don't think the TimeDistributed layer works the way you want; you can try Luke Guye's TemporalMeanPooling layer instead, which works for me. Here is an example:

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)
pool = TemporalMeanPooling()(lstm)
output = Dense(1, activation='sigmoid')(pool)
Mare answered 25/3, 2017 at 8:31

Quite late to the party, but tf.keras.layers.AveragePooling1D with a suitable pool_size parameter also seems to return the correct result.

Working from the example shared by bobchennan in a related GitHub issue.

import numpy as np

# create sample data
A = np.array([[1, 2, 3], [4, 5, 6], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
B = np.array([[1, 3, 0], [4, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]])
C = np.array([A, B]).astype("float32")
# expected answer (for temporal mean)
np.mean(C, axis=1)

The output is

array([[1. , 1.4, 1.8],
       [1. , 0.6, 0.2]], dtype=float32)

Now using AveragePooling1D,

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.AveragePooling1D(pool_size=5)
])
model.predict(C)

The output is

array([[[1. , 1.4, 1.8]],
       [[1. , 0.6, 0.2]]], dtype=float32)

Some points to consider:

  • The pool_size should be equal to the number of timesteps output by the recurrent layer.
  • The shape of the output is (batch_size, downsampled_steps, features), which contains an additional downsampled_steps dimension. This will always be 1 if you set pool_size equal to the number of timesteps, and the extra axis can then be dropped, as in the sketch below.
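
For example, a rough sketch of that idea (the Flatten layer is my addition, just to drop the length-1 axis so the result matches np.mean(C, axis=1)):

import numpy as np
import tensorflow as tf

# same sample data as above: shape (2, 5, 3)
C = np.array([[[1, 2, 3], [4, 5, 6], [0, 0, 0], [0, 0, 0], [0, 0, 0]],
              [[1, 3, 0], [4, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]]]).astype("float32")

model = tf.keras.models.Sequential([
    tf.keras.layers.AveragePooling1D(pool_size=5),  # (batch, 5, 3) -> (batch, 1, 3)
    tf.keras.layers.Flatten(),                      # (batch, 1, 3) -> (batch, 3)
])
print(model.predict(C))  # [[1.  1.4 1.8] [1.  0.6 0.2]]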
Oceanography answered 1/11, 2020 at 10:28
