Why does Keras' train_on_batch produce zero loss and accuracy at the second epoch?
I am working with a large dataset, so I'm trying to use train_on_batch (or fit with nb_epoch = 1):

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
model.add(LSTM(size, input_shape=input_shape, return_sequences=False))
model.add(Dense(output_dim))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

for e in range(nb_epoch):
    for batch_X, batch_y in batches:
        model.train_on_batch(batch_X, batch_y)
        # or, equivalently, a single-epoch fit on one batch:
        # model.fit(batch_X, batch_y, batch_size=batch_size, nb_epoch=1, verbose=1, shuffle=True)

But when training starts, this happens:

(0, 128)
Epoch 1/1
128/128 [==============================] - 2s - loss: 0.3262 - acc: 0.1130

(129, 257)
Epoch 1/1
128/128 [==============================] - 2s - loss: -0.0000e+00 - acc: 0.0000e+00

It doesn't matter how many epochs I wait; nothing changes. Even if I change the batch size, the same thing happens: the first batch has good values, and then the output just goes to "loss: -0.0000e+00 - acc: 0.0000e+00" again.

Can someone help me understand what's happening here?

Goodygoody asked 31/5, 2016 at 10:26 Comment(4)
This might happen if your training data contains a very small number of unique examples and your network learns all of them in its first few batches. Maybe you've accidentally created identical elements by storing an array reference instead of copies in your dataset-creation script. – Anuradhapura
Yeah, take a look at the predictions and labels and see whether the network is actually getting 0 accuracy. That will help you debug. – Mayers
@DmitryKostyaev Identical elements. It was a tiny mistake; I feel silly. Thank you for the help. – Goodygoody
Voted to close: (1) Keras has changed immensely since 4 years ago; (2) there are not enough debug details; (3) this is the only question on the OP's account, so it's unlikely (2) will ever be addressed. – Austen
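
The comments above identify the root cause, so here is a minimal sketch of that aliasing bug (the variable names are hypothetical, not from the OP's script): when a dataset-creation loop appends the same NumPy array object instead of a copy, every stored example silently becomes the last value written, and the network trivially learns the single repeated example within one batch.

import numpy as np

# Hypothetical dataset-creation loop illustrating the bug described above.
data = np.random.rand(100, 10)
window = np.zeros(10)

buggy, fixed = [], []
for row in data:
    window[:] = row              # overwrite the SAME array in place
    buggy.append(window)         # stores a reference: all 100 entries alias one array
    fixed.append(window.copy())  # stores an independent copy: entries stay distinct

print(len({tuple(x) for x in buggy}))  # 1   -> every "example" is identical
print(len({tuple(x) for x in fixed}))  # 100 -> distinct examples preserved

Inspecting predictions and labels, as also suggested above, exposes the same problem quickly: with an aliased dataset, model.predict returns identical outputs for supposedly different batches.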
This looks like the exploding/vanishing gradient problem. As the comments above suggest, try tuning your learning rate and/or the depth and width of your network's layers.
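
As a sketch of that advice in Keras (the specific values are illustrative assumptions, not tuned recommendations), lowering Adam's learning rate and clipping gradient norms at compile time looks like this:

from keras.optimizers import Adam

# Illustrative values; tune them for your own data.
opt = Adam(learning_rate=1e-4,  # smaller step size to stabilise updates
           clipnorm=1.0)        # rescale any gradient whose L2 norm exceeds 1.0
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])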

Snowstorm answered 23/5, 2020 at 16:42 Comment(0)

This can happen if there are too many categories or if your dataset is poor; when the optimizer cannot find a better result, it gets stuck at a local minimum and the printed loss stops changing. Try changing the learning rate:

import keras

opt = keras.optimizers.Adam(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=opt)
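
Note that learning_rate replaced the older lr argument name in Keras 2.3; on earlier releases the equivalent call is keras.optimizers.Adam(lr=0.01).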
Portraiture answered 13/5, 2020 at 16:32 Comment(0)
