Overfitting after one epoch

I am training a model using Keras.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
# 536 independent sigmoid outputs: one binary decision per label (multi-label setup)
model.add(LSTM(units=300, input_shape=(timestep, 103), use_bias=True,
               dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(units=536))
model.add(Activation("sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

while True:
    history = model.fit_generator(
        generator=data_generator(x_[train_indices], y_[train_indices],
                                 batch=batch, timestep=timestep),
        steps_per_epoch=int(train_indices.shape[0] / batch),
        epochs=1,
        verbose=1,
        validation_steps=int(validation_indices.shape[0] / batch),
        validation_data=data_generator(x_[validation_indices], y_[validation_indices],
                                       batch=batch, timestep=timestep))
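For reference, data_generator is not shown in the question; a minimal sketch of the kind of generator fit_generator expects (the sliding-window batching and the label alignment below are my assumptions, not the asker's actual code) could look like this:

import numpy as np

def data_generator(x, y, batch, timestep):
    """Hypothetical generator: yields (inputs, targets) batches indefinitely,
    as fit_generator requires. Shapes assumed from the model above:
    inputs (batch, timestep, 103), targets (batch, 536)."""
    n = x.shape[0] - timestep
    while True:  # Keras generators must loop forever
        idx = np.random.randint(0, n, size=batch)
        # stack a window of `timestep` consecutive rows for each sampled start index
        x_batch = np.stack([x[i:i + timestep] for i in idx])
        # align each target with the last row of its window
        y_batch = y[idx + timestep - 1]
        yield x_batch, y_batch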

It is a multioutput classification task according to the scikit-learn.org definition: "Multioutput regression assigns each sample a set of target values. This can be thought of as predicting several properties for each data-point, such as wind direction and magnitude at a certain location."

Since it is a recurrent neural network, I tried out different timestep sizes, but the result/problem is mostly the same.

After one epoch, my training loss is around 0.0x and my validation loss is around 0.6x, and these values stay stable for the next 10 epochs.

The dataset has around 680,000 rows; 9/10 is used for training and 1/10 for validation.

I am asking for intuition about this:

  • Is my model already overfitted after just one epoch?
  • Is 0.6x even a good value for a validation loss?

High-level question: since it is a multioutput classification task (not multi-class), I see sigmoid with binary_crossentropy as the only option. Do you suggest another approach?
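For reference, sigmoid with binary_crossentropy does treat each of the 536 outputs as an independent binary label. A minimal sketch of how predictions are then read out per label (the 0.5 threshold and the x_val_windows name are illustrative assumptions, not part of the question):

import numpy as np

# probs has shape (n_samples, 536): one independent probability per label
probs = model.predict(x_val_windows)

# multi-label decision: each output is thresholded on its own, unlike
# softmax/categorical_crossentropy, where exactly one class is picked per sample
predicted_labels = (probs > 0.5).astype(int)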

Chicle answered 22/5, 2017 at 12:35 Comment(5)
I'm not sure, but I think the maximal value of binary_crossentropy is equal to 0.7 in your case, so the validation loss might be pretty high. I would try to increase the dropout rate and check whether the phenomenon still occurs.Arose
Thank you, I will try to increase the dropout. Could you explain where that 0.7 max value comes from?Chicle
I increased dropout and recurrent_dropout to 0.4, but after one epoch I get a similar result: loss: 0.0347 - acc: 0.9885 - val_loss: 0.6998 - val_acc: 0.9193Chicle
Try 0.7, and then binary-search for the best value if it does not overfit.Arose
I even used 0.9 for each, same picture: loss: 0.1068 - acc: 0.9645 - val_loss: 0.4846 - val_acc: 0.9199. Thank you in advance.Chicle
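A quick check of where the ~0.7 figure mentioned in the comments might come from (my interpretation, not the commenter's derivation): a predictor that outputs 0.5 for every label, i.e. one that has learned nothing, has a binary crossentropy of -log(0.5) ≈ 0.693 regardless of the targets, so a validation loss near 0.7 is roughly the level of an uninformative model:

import numpy as np

# binary crossentropy of an uninformative prediction p = 0.5:
# -(y*log(p) + (1-y)*log(1-p)) = -log(0.5) for either y = 0 or y = 1
p = 0.5
print(-np.log(p))  # ≈ 0.6931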

I've experienced this issue and found that the learning rate and batch size have a huge impact on the learning process. In my case, I've done two things.

  • Reduce the learning rate (try 0.00005)
  • Reduce the batch size (8, 16, 32)

Moreover, you can try the basic steps for preventing overfitting.

  • Reduce the complexity of your model
  • Increase the amount of training data and balance the samples per class.
  • Add more regularization (Dropout, BatchNorm)
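Applied to the model in the question, those suggestions might look roughly like the sketch below; the layer sizes, dropout rates, and learning rate are illustrative values, not tuned recommendations:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation, Dropout, BatchNormalization
from keras.optimizers import Adam

model = Sequential()
# smaller LSTM = less capacity to memorize the training set
model.add(LSTM(units=128, input_shape=(timestep, 103),
               dropout=0.2, recurrent_dropout=0.2))
model.add(BatchNormalization())  # extra regularization, more stable activations
model.add(Dropout(0.3))
model.add(Dense(units=536))
model.add(Activation("sigmoid"))

# lower learning rate as suggested above (newer Keras versions use
# learning_rate= instead of lr=)
model.compile(loss="binary_crossentropy",
              optimizer=Adam(lr=0.00005),
              metrics=["accuracy"])

# a smaller batch size would then be passed to the data generator / fit call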
Illfated answered 6/6, 2019 at 11:54 Comment(1)
Generally I see your point, but in what way does the batch size influence this?Lenten
