Keras: apply different weights to different misclassifications
I am trying to implement a classification problem with three classes, 'A', 'B' and 'C', where I would like to incorporate penalties for different types of misclassification in my model's loss function (somewhat like weighted cross-entropy). Class weights are not suited, since they apply to all data belonging to a class. E.g. the true label 'B' being misclassified as 'C' should incur a higher loss than being misclassified as 'A'. The weight table (rows = true class, columns = predicted class) is as follows:

   A  B  C  
A  1  1  1  
B  1  1  1.2 
C  1  1  1    

With the current categorical_crossentropy loss, for true class 'B', if I have prediction softmax outputs of

0.5 0.4 0.1  vs 0.1 0.4 0.5 

the categorical_crossentropy is the same; it doesn't matter whether 'B' is misclassified as 'A' or as 'C'. I want the second prediction to incur a higher loss than the first.
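For reference, a minimal numpy sketch (my own illustration, not from the original post) showing that both predictions yield the same cross-entropy, because the loss only depends on the probability assigned to the true class:

import numpy as np

y_true = np.array([0., 1., 0.])  # one-hot for true class 'B'
for y_pred in ([0.5, 0.4, 0.1], [0.1, 0.4, 0.5]):
    loss = -np.sum(y_true * np.log(y_pred))
    print(loss)  # both print ~0.916 (= -log 0.4)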

I have tried https://github.com/keras-team/keras/issues/2115, but none of the code there works with Keras v2. Any help on directly enforcing the weight matrix in the Keras loss function would be highly appreciated.

Eterne answered 21/6, 2019 at 2:20 Comment(0)

Building on issue #2115, I've coded the following solution and posted it there too.
I have only tested it in TensorFlow 1.14, but I expect it should work with Keras v2 as well.


Adding to the class solution in #2115 (comment), here's a more robust and vectorized implementation:

import tensorflow.keras.backend as K
from tensorflow.keras.losses import CategoricalCrossentropy


class WeightedCategoricalCrossentropy(CategoricalCrossentropy):

    def __init__(self, cost_mat, name='weighted_categorical_crossentropy', **kwargs):
        # cost_mat is a square array (e.g. a numpy array) with
        # rows = true class and columns = predicted class.
        assert cost_mat.ndim == 2
        assert cost_mat.shape[0] == cost_mat.shape[1]

        super().__init__(name=name, **kwargs)
        self.cost_mat = K.cast_to_floatx(cost_mat)

    def __call__(self, y_true, y_pred, sample_weight=None):
        # The per-sample weight is derived from the cost matrix;
        # any externally supplied sample_weight is ignored.
        return super().__call__(
            y_true=y_true,
            y_pred=y_pred,
            sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
        )


def get_sample_weights(y_true, y_pred, cost_m):
    num_classes = len(cost_m)

    y_pred.shape.assert_has_rank(2)
    # Shape checks written against TensorShape, so they work with both TF1 and TF2.
    y_pred.shape.assert_is_compatible_with([None, num_classes])
    y_pred.shape.assert_is_compatible_with(y_true.shape)

    # Collapse the softmax output to a one-hot vector for the predicted class.
    y_pred = K.one_hot(K.argmax(y_pred), num_classes)

    # The outer product of the one-hot true and predicted labels, multiplied by
    # the cost matrix, has exactly one nonzero entry per sample:
    # cost_m[true_class, predicted_class]. Summing it out gives that weight.
    y_true_nk1 = K.expand_dims(y_true, 2)
    y_pred_n1k = K.expand_dims(y_pred, 1)
    cost_m_1kk = K.expand_dims(cost_m, 0)

    sample_weights_nkk = cost_m_1kk * y_true_nk1 * y_pred_n1k
    sample_weights_n = K.sum(sample_weights_nkk, axis=[1, 2])

    return sample_weights_n
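To see what this produces, here's a small standalone sketch (my own illustration, using the three-class cost table from the question) that evaluates get_sample_weights on two samples whose true label is 'B':

import numpy as np
import tensorflow as tf

cost_matrix = np.array([[1.0, 1.0, 1.0],   # true 'A'
                        [1.0, 1.0, 1.2],   # true 'B'
                        [1.0, 1.0, 1.0]],  # true 'C'
                       dtype='float32')

y_true = tf.constant([[0., 1., 0.],        # true class 'B'
                      [0., 1., 0.]])
y_pred = tf.constant([[0.5, 0.4, 0.1],     # predicted 'A'
                      [0.1, 0.4, 0.5]])    # predicted 'C'

print(get_sample_weights(y_true, y_pred, cost_matrix))  # -> [1.0, 1.2]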

Usage:

model.compile(loss=WeightedCategoricalCrossentropy(cost_matrix), ...)

Similarly, this can be applied to the CategoricalAccuracy metric:

from tensorflow.keras.metrics import CategoricalAccuracy


class WeightedCategoricalAccuracy(CategoricalAccuracy):

    def __init__(self, cost_mat, name='weighted_categorical_accuracy', **kwargs):
        assert cost_mat.ndim == 2
        assert cost_mat.shape[0] == cost_mat.shape[1]

        super().__init__(name=name, **kwargs)
        self.cost_mat = K.cast_to_floatx(cost_mat)

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Weight each sample by cost_mat[true_class, predicted_class],
        # mirroring the weighted loss above.
        return super().update_state(
            y_true=y_true,
            y_pred=y_pred,
            sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
        )

Usage:

model.compile(metrics=[WeightedCategoricalAccuracy(cost_matrix), ...], ...)
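For the three-class example from the question, compiling with both the weighted loss and the weighted metric could then look like this (a sketch that assumes cost_matrix is the numpy array from the snippet above and that the labels are one-hot encoded):

model.compile(
    optimizer='adam',
    loss=WeightedCategoricalCrossentropy(cost_matrix),
    metrics=[WeightedCategoricalAccuracy(cost_matrix)],
)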
Dome answered 12/9, 2019 at 12:30 Comment(1)
How would this look for SparseCategoricalCrossentropy? - Socle

You could change the loss function to one that multiplies the loss values by the appropriate weight from your matrix.

So, by way of example, consider the TensorFlow MNIST example:

import tensorflow as tf

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

If we wanted to change this so that the losses are weighted according to the following matrix:

# Note: the custom loss below indexes this as weights[predicted_class, true_class].
weights = tf.constant([
       [1., 1.2, 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1.2, 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 10.9, 1.2, 1., 1., 1., 1., 1., 1.],
       [1., 0.9, 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

then we could wrap the existing sparse_categorical_crossentropy in a new custom loss function that multiplies the loss by the appropriate weighting. Something like this:

def custom_loss(y_true, y_pred):
  # get the prediction from the final softmax layer:
  pred_idx = tf.argmax(y_pred, axis=1, output_type=tf.int32)

  # stack these so we have a tensor of [[predicted_i, actual_i], ...] for each i in batch
  indices = tf.stack([tf.reshape(pred_idx, (-1,)),
                      tf.reshape(tf.cast(y_true, tf.int32), (-1,))],
                     axis=1)

  # use tf.gather_nd() to look up the appropriate weight from our matrix, giving [w_i, ...] for each i in batch
  batch_weights = tf.gather_nd(weights, indices)

  return batch_weights * tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)


We can then use this new custom loss function in the model:

model.compile(optimizer='adam',
              loss=custom_loss,
              metrics=['accuracy'])
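
As a quick sanity check (a minimal sketch, not part of the original answer), you can call custom_loss directly on a toy batch and confirm that it equals the unweighted sparse cross-entropy scaled by the matrix entry for each (predicted, true) pair:

y_true_demo = tf.constant([3, 0])  # true classes 3 and 0
y_pred_demo = tf.constant([[0.0, 0.0, 0.6, 0.4, 0., 0., 0., 0., 0., 0.],   # predicts class 2
                           [0.9, 0.1, 0.0, 0.0, 0., 0., 0., 0., 0., 0.]])  # predicts class 0

weighted = custom_loss(y_true_demo, y_pred_demo)
plain = tf.keras.losses.sparse_categorical_crossentropy(y_true_demo, y_pred_demo)
print(weighted / plain)  # -> [1.2, 1.0], i.e. weights[2, 3] and weights[0, 0]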
Redstart answered 23/6, 2019 at 11:16 Comment(0)
