TensorFlow, Keras: In multi-class classification, accuracy is high, but precision, recall, and F1-score are zero for most classes

General Explanation: My code runs without errors, but the results are weird. I don't know whether the problem is with

the network structure,
or the way I feed the data to the network,
or anything else.

I have been struggling with this problem for several weeks. So far I have changed the loss function, optimizer, data generator, etc., but I could not solve it. I appreciate any help. If the following information is not enough, please let me know.

Field of study: I am using TensorFlow and Keras for multiclass classification. The dataset has 36 binary human attributes. I use ResNet50, and for each part of the body (head, upper body, lower body, shoes, accessories) I have added a separate branch to the network. The network has 1 input image with 36 labels and 36 output nodes (36 dense layers with sigmoid activation).

Problem: The accuracy that Keras reports is high, but the F1-score is very low or zero for most of the outputs (even when I use F1-score as a metric when compiling the network, the validation F1-score is very bad).

After training, when I use the network in prediction mode, it always returns one (or always zero) for some classes. This suggests that the network is not learning those classes (even when I use a weighted loss function or a focal loss function).
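A minimal sketch of how high accuracy and zero F1 can coexist, assuming a hypothetical attribute that is positive in only 5 of 100 samples and a network that always outputs zero for it. The data and the helper `binary_scores` are made up for illustration and are not part of the training code.

```python
# Hypothetical labels for one rare attribute: 5 positives out of 100 samples.
y_true = [1] * 5 + [0] * 95
# A degenerate classifier that always outputs 0 for this attribute.
y_pred = [0] * 100


def binary_scores(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Each ratio is defined as 0 when its denominator is 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


accuracy, precision, recall, f1 = binary_scores(y_true, y_pred)
print(accuracy, precision, recall, f1)  # 0.95 0.0 0.0 0.0
```

Accuracy counts the 95 true negatives as correct, while precision, recall, and F1 only look at the positive class, so a constant output on a rare attribute would produce exactly this pattern.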

Why is it weird? Because state-of-the-art methods report a high F1-score even after the first epoch (e.g. https://github.com/chufengt/iccv19_attribute, which I ran on my PC and got good results after one epoch).

Parts of the code:

        print("setup model ...")
        input_image = KL.Input(args.img_input_shape, name= "input_1")
        C1, C2, C3, C4, C5 = resnet_graph(input_image, architecture="resnet50", stage5=False, train_bn=True)
        output_layers = merged_model (input_features=C4)
        model = Model(inputs=input_image, outputs=output_layers, name='SoftBiometrics_Model')

...

        print("model compiling ...")
        OPTIM = optimizers.Adadelta(lr=args.learning_rate, rho=0.95)
        model.compile(optimizer=OPTIM, loss=binary_focal_loss(alpha=.25, gamma=2), metrics=['acc',get_f1])
        plot_model(model, to_file='model.png')

...

        img_datagen = ImageDataGenerator(rotation_range=6, width_shift_range=0.03, height_shift_range=0.03, brightness_range=[0.85,1.15], shear_range=0.06, zoom_range=0.09, horizontal_flip=True, preprocessing_function=preprocess_input_resnet, rescale=1/255.)
        img_datagen_test = ImageDataGenerator(preprocessing_function=preprocess_input_resnet, rescale=1/255.)

        def multiple_outputs(generator, dataframe, batch_size, x_col):
          Gen = generator.flow_from_dataframe(dataframe=dataframe,
                                               directory=None,
                                               x_col = x_col,
                                               y_col = args.Categories,
                                               target_size = (args.img_input_shape[0],args.img_input_shape[1]),
                                               class_mode = "multi_output",
                                               classes=None,
                                               batch_size = batch_size,
                                               shuffle = True)
          while True:
            gnext = Gen.next()
            # return image batch and 36 sets of labels
            labels = gnext[1]
            output_dict = {"{}_output".format(Category): np.array(labels[index]) for index, Category in enumerate(args.Categories)}
            yield {'input_1':gnext[0]}, output_dict

    trainGen = multiple_outputs (generator = img_datagen, dataframe=Train_df_img, batch_size=args.BATCH_SIZE, x_col="Train_Filenames")
    testGen = multiple_outputs (generator = img_datagen_test, dataframe=Test_df_img, batch_size=args.BATCH_SIZE, x_col="Test_Filenames")

    STEP_SIZE_TRAIN = len(Train_df_img["Train_Filenames"]) // args.BATCH_SIZE
    STEP_SIZE_VALID = len(Test_df_img["Test_Filenames"]) // args.BATCH_SIZE

    ...

    print("Fitting the model to the data ...")
    history = model.fit_generator(generator=trainGen,
                                  epochs=args.Number_of_epochs,
                                  steps_per_epoch=STEP_SIZE_TRAIN,
                                  validation_data=testGen,
                                  validation_steps=STEP_SIZE_VALID,
                                  callbacks= [chekpont],
                                  verbose=1)

Homunculus answered 18/3, 2020 at 11:24 Comment(1)

Is your dataset imbalanced? – Ilanailangilang 30/3, 2020 at 18:34
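One way to answer this is to compute the fraction of positive samples per attribute. The sketch below uses plain Python and made-up attribute names and rows; `positive_rates` is a hypothetical helper, not something from the question's code.

```python
# Hypothetical label rows: one dict per image, 1 = attribute present.
rows = [
    {"hat": 0, "backpack": 1, "long_hair": 0},
    {"hat": 0, "backpack": 1, "long_hair": 1},
    {"hat": 0, "backpack": 0, "long_hair": 0},
    {"hat": 1, "backpack": 1, "long_hair": 0},
]


def positive_rates(rows):
    """Return the fraction of positive samples for every attribute."""
    attributes = rows[0].keys()
    return {name: sum(row[name] for row in rows) / len(rows)
            for name in attributes}


rates = positive_rates(rows)
# List the rarest attributes first; these are the ones most likely
# to collapse to a constant prediction.
for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{name}: {rate:.2f} positive")
```

Assuming the label columns of the question's DataFrame hold numeric 0/1 values, `Train_df_img[args.Categories].mean()` should give the same per-attribute rates in pandas.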

You may be passing a binary F1-score to the compile function. Using a multi-class F1 metric should fix the problem:

pip install tensorflow-addons

...

import tensorflow_addons as tfa 

f1 = tfa.metrics.F1Score(num_classes=36, average='micro')  # or average='macro'

model.compile(...,metrics=[f1])

You can read more about how micro-F1 and macro-F1 are calculated, and which one is useful in which case, here.
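As a rough illustration of the difference: micro-F1 pools the true positives, false positives, and false negatives of all classes before computing F1, while macro-F1 computes F1 per class and averages the results. The sketch below is plain Python with made-up data; `f1_from_counts` and `micro_macro_f1` are hypothetical helpers, not the tensorflow-addons implementation.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """y_true and y_pred are lists of rows with one 0/1 entry per class."""
    n_classes = len(y_true[0])
    counts = []
    for c in range(n_classes):
        pairs = [(t[c], p[c]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        counts.append((tp, fp, fn))
    # Macro: average of the per-class F1 values (every class weighs the same).
    macro = sum(f1_from_counts(*c) for c in counts) / n_classes
    # Micro: F1 of the counts summed over all classes (frequent classes dominate).
    micro = f1_from_counts(sum(c[0] for c in counts),
                           sum(c[1] for c in counts),
                           sum(c[2] for c in counts))
    return micro, macro


# Class 0 is frequent and predicted perfectly; class 1 is rare and never predicted.
y_true = [[1, 0], [1, 0], [1, 1], [0, 0]]
y_pred = [[1, 0], [1, 0], [1, 0], [0, 0]]
micro, macro = micro_macro_f1(y_true, y_pred)
print(round(micro, 3), macro)  # 0.857 0.5
```

With imbalanced attributes, macro-F1 exposes classes the model never predicts, whereas micro-F1 can still look good because the frequent classes dominate the pooled counts.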

Berryman answered 20/3, 2020 at 4:23 Comment(1)

Unfortunately tfa is being discontinued, and it looks like this didn't make it into Keras. – Dockery 1/3 at 14:31
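If tensorflow-addons is not an option, F1 can also be computed outside the training loop from thresholded predictions. The sketch below assumes a hypothetical helper class `RunningF1` for a single sigmoid output and made-up batches; it accumulates counts over all batches, because the mean of per-batch F1 values (which is what a plain function metric averaged over batches reports) can differ from F1 over the whole set.

```python
class RunningF1:
    """Hypothetical helper: accumulates TP/FP/FN over batches for one binary output."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.tp = 0
        self.fp = 0
        self.fn = 0

    def update(self, y_true, y_prob):
        """Add one batch of 0/1 labels and sigmoid outputs to the counts."""
        for t, prob in zip(y_true, y_prob):
            p = 1 if prob > self.threshold else 0
            if p == 1 and t == 1:
                self.tp += 1
            elif p == 1 and t == 0:
                self.fp += 1
            elif p == 0 and t == 1:
                self.fn += 1

    def result(self):
        """F1 over everything seen so far; 0 when there are no positives at all."""
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else 0.0


# Two hypothetical batches of (labels, sigmoid outputs).
batches = [
    ([1, 0, 0, 0], [0.9, 0.2, 0.1, 0.3]),
    ([1, 1, 0, 0], [0.4, 0.3, 0.6, 0.1]),
]

metric = RunningF1()
per_batch = []
for y_true, y_prob in batches:
    metric.update(y_true, y_prob)
    single = RunningF1()          # F1 of this batch alone
    single.update(y_true, y_prob)
    per_batch.append(single.result())

global_f1 = metric.result()
mean_of_batches = sum(per_batch) / len(per_batch)
print(global_f1, mean_of_batches)  # 0.4 0.5
```

For the 36-output model in the question, one such accumulator per output (or per-class counts as in a macro average) would be needed; that extension is left out here.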

Somehow, the predict_generator() method of the Keras model does not work as expected. I would rather loop through all test images one by one and get the prediction for each image in each iteration. I am using PlaidML Keras as my backend, and I use the following code to get predictions.

import os
from PIL import Image
import keras
import numpy

# `model` is assumed to be an already trained or loaded Keras model.
print("Prediction result:")
test_dir = "/path/to/test/images"
files = os.listdir(test_dir)
correct = 0
total = 0
# Dictionary mapping each class index to a readable label.
classes = {
    0:'This is Cat',
    1:'This is Dog',
}
for file_name in files:
    total += 1
    image = Image.open(os.path.join(test_dir, file_name)).convert('RGB')
    image = image.resize((100,100))
    # Convert to an array scaled to [0, 1], then add the batch dimension.
    image = numpy.array(image) / 255
    image = numpy.expand_dims(image, axis=0)
    pred = model.predict_classes([image])[0]
    sign = classes[pred]
    # Compare in lower case: the labels are capitalised ("Cat", "Dog"),
    # so a case-sensitive check against "cat"/"dog" would never match.
    if ("cat" in file_name) and ("cat" in sign.lower()):
        print(correct,". ", file_name, sign)
        correct+=1
    elif ("dog" in file_name) and ("dog" in sign.lower()):
        print(correct,". ", file_name, sign)
        correct+=1
print("accuracy: ", (correct/total))

Fluorescence answered 17/11, 2022 at 16:38 Comment(0)
