Why is my val_accuracy stagnant at 0.0000e+00 while my val_loss is increasing from the start?

I am training a classification model to classify cells, and my model is based on this paper: https://www.nature.com/articles/s41598-019-50010-9. As my dataset consists of only 10 images, I performed image augmentation to artificially increase the size of the dataset to 3000 images, which were then split into 2400 training images and 600 validation images.
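
For reference, a rough sketch of what such an augmentation step could look like with tf.keras' ImageDataGenerator (the augmentation code actually used is not shown in the question; the arrays images and labels and the chosen transforms here are illustrative only):

from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np

# Expand a handful of labelled images into thousands of augmented variants.
# 'images' has shape (10, 512, 512, 1); 'labels' holds one class id per image.
datagen = ImageDataGenerator(rotation_range = 20,
                             width_shift_range = 0.1,
                             height_shift_range = 0.1,
                             horizontal_flip = True,
                             vertical_flip = True)

img_aug, img_aug_label = [], []
for x, y in zip(images, labels):
    flow = datagen.flow(x[np.newaxis, ...], batch_size = 1)   # batch of one image
    for _ in range(300):                                      # 300 variants per original image
        img_aug.append(next(flow)[0])
        img_aug_label.append(y)
img_aug = np.stack(img_aug)
img_aug_label = np.array(img_aug_label)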

However, while the training loss and accuracy improved over successive epochs, the validation loss increased rapidly and the validation accuracy remained stagnant at 0.0000e+00.

Is my model overfitting severely right from the start?

The code I used is as shown below:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Lambda, Conv2D, BatchNormalization,
                                     Activation, AveragePooling2D, Flatten, Dense)
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical

img_channel = 1
input_size = (512, 512, 1)
inputs = Input(shape = input_size)
initial_input = Lambda(lambda x: x)(inputs) #Identity pass-through; the images are scaled to [0, 1] before fit

kernel_size = (3, 3)
pad = 'same'
filters = 2

c1 = Conv2D(filters, kernel_size, padding = pad, kernel_initializer = 'he_normal')(initial_input)
b1 = BatchNormalization()(c1) 
a1 = Activation('elu')(b1)
p1 = AveragePooling2D()(a1)

c2 = Conv2D(filters, kernel_size, padding = pad, kernel_initializer = 'he_normal')(p1)
b2 = BatchNormalization()(c2) 
a2 = Activation('elu')(b2)
p2 = AveragePooling2D()(a2)

c3 = Conv2D(filters, kernel_size, padding = pad, kernel_initializer = 'he_normal')(p2)
b3 = BatchNormalization()(c3) 
a3 = Activation('elu')(b3)
p3 = AveragePooling2D()(a3)

c4 = Conv2D(filters, kernel_size, padding = pad, kernel_initializer = 'he_normal')(p3)
b4 = BatchNormalization()(c4) 
a4 = Activation('elu')(b4)
p4 = AveragePooling2D()(a4)

c5 = Conv2D(filters, kernel_size, padding = pad, kernel_initializer = 'he_normal')(p4)
b5 = BatchNormalization()(c5) 
a5 = Activation('elu')(b5)
p5 = AveragePooling2D()(a5)

f = Flatten()(p5)

d1 = Dense(128, activation = 'elu')(f)
d2 = Dense(no_of_img, activation = 'softmax')(d1) #no_of_img = number of classes, defined elsewhere in the script

model = Model(inputs = [inputs], outputs = [d2])
print(model.summary())

learning_rate = 0.001
decay_rate = 0.0001
model.compile(optimizer = SGD(lr = learning_rate, decay = decay_rate, momentum = 0.9, nesterov = False), 
      loss = 'categorical_crossentropy', metrics = ['accuracy'])

perf_lr_scheduler = ReduceLROnPlateau(monitor = 'val_loss', factor = 0.9, patience = 3,
        verbose = 1, min_delta = 0.01, min_lr = 0.000001)

model_earlystop = EarlyStopping(monitor = 'val_loss', min_delta = 0.001, patience = 10, restore_best_weights = True) #Note: not passed to the fit call below

#Convert labels to a one-hot (binary) class matrix
img_aug_label = to_categorical(img_aug_label, num_classes = no_of_img)

#Scale image values to floats in [0, 1]
img_aug = np.float32(img_aug)/255

plt.imshow(img_aug[0,:,:,0])
plt.show()

#Train on augmented images
model.fit(
  img_aug, 
  img_aug_label, 
  batch_size = 4,
  epochs = 100, 
  validation_split = 0.2,
  shuffle = True,
  callbacks = [perf_lr_scheduler], 
  verbose = 2)

The output of my model is as shown below:

Train on 2400 samples, validate on 600 samples
Epoch 1/100
2400/2400 - 12s - loss: 0.6474 - accuracy: 0.8071 - val_loss: 9.8161 - val_accuracy: 0.0000e+00
Epoch 2/100
2400/2400 - 10s - loss: 0.0306 - accuracy: 0.9921 - val_loss: 10.1733 - val_accuracy: 0.0000e+00
Epoch 3/100
2400/2400 - 10s - loss: 0.0058 - accuracy: 0.9996 - val_loss: 10.9820 - val_accuracy: 0.0000e+00
Epoch 4/100

Epoch 00004: ReduceLROnPlateau reducing learning rate to 0.0009000000427477062.
2400/2400 - 10s - loss: 0.0019 - accuracy: 1.0000 - val_loss: 11.3029 - val_accuracy: 0.0000e+00
Epoch 5/100
2400/2400 - 10s - loss: 0.0042 - accuracy: 0.9992 - val_loss: 11.9037 - val_accuracy: 0.0000e+00
Epoch 6/100
2400/2400 - 10s - loss: 0.0024 - accuracy: 0.9996 - val_loss: 11.5218 - val_accuracy: 0.0000e+00
Epoch 7/100

Epoch 00007: ReduceLROnPlateau reducing learning rate to 0.0008100000384729356.
2400/2400 - 10s - loss: 9.9053e-04 - accuracy: 1.0000 - val_loss: 11.7658 - val_accuracy: 0.0000e+00
Epoch 8/100
2400/2400 - 10s - loss: 0.0011 - accuracy: 1.0000 - val_loss: 12.0437 - val_accuracy: 0.0000e+00
Epoch 9/100
Rockwell answered 22/4, 2020 at 10:08. Comments (3):
Actinomycosis: The model is surely overfitting very early, as you have only 10 original images (3000 after augmentation). But I am not sure why the validation loss is so high and val_accuracy is 0. Can you replace d2 = Dense(no_of_img, activation = 'softmax')(d1) with d2 = Dense(no_of_img)(d1) and see whether the results change?
Rockwell: Hi @Janapati, thank you for your comment. I realised the error occurred because I had not shuffled the data manually before using it as training data. I thought validation_split and shuffle would only take effect during training, but the split actually happens before training. Manually shuffling the data before fitting it to the model solved the problem.
Actinomycosis: Maybe add your finding as an answer so that it will be useful to others. Thanks.

I realised the error occurred because I had not shuffled the data manually before passing it to the model. I thought the validation_split and shuffle arguments would only take effect during training, but the split actually happens before training: fit reserves the last validation_split fraction of the samples, in the order they are passed, as the validation set, and shuffle then only shuffles the training portion between epochs, never across the two sets. The quick check below illustrates the consequence.
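
For example (a hypothetical check, not part of the original code), you can inspect which classes fall into the tail 20% that fit would reserve for validation:

n_val = int(0.2 * len(img_aug_label))                 #size of the validation split
val_classes = img_aug_label[-n_val:].argmax(axis = 1) #class ids in the unshuffled tail
print(np.unique(val_classes))                         #with class-ordered data, only a few classes show up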

For my augmented dataset, the split fell in a position where the validation set contained classes of images that never appeared in the training set. Consequently, the model was validated on classes it had never seen during training, which explains the rising validation loss and the validation accuracy stuck at zero. Manually shuffling the data before fitting it to the model solved the problem; a sketch of the fix follows.
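
A minimal sketch of the fix (the fixed seed is only for reproducibility and is my addition; the variable names come from the code above): shuffle images and labels with the same permutation before calling fit, so that the last 20% reserved for validation contains a mix of all classes.

perm = np.random.default_rng(seed = 42).permutation(len(img_aug)) #one permutation keeps images and labels aligned
img_aug = img_aug[perm]
img_aug_label = img_aug_label[perm]

#validation_split = 0.2 now takes a random 20% instead of the tail of a class-ordered array
model.fit(img_aug, img_aug_label,
          batch_size = 4, epochs = 100,
          validation_split = 0.2, shuffle = True,
          callbacks = [perf_lr_scheduler], verbose = 2)

An alternative is sklearn.model_selection.train_test_split with its stratify argument, which guarantees that every class appears in both the training and validation sets.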

Rockwell answered 27/4, 2020 at 9:12.
