Create a weighted MSE loss function in TensorFlow

I want to train a recurrent neural network using TensorFlow. My model outputs a 1-by-100 vector for each training sample. Assume that y = [y_1, y_2, ..., y_100] is my output for training sample x and the expected output is y' = [y'_1, y'_2, ..., y'_100].

I wish to write a custom loss function that calculates the loss of this specific sample as follows:

Loss =  1/sum(weights) * sqrt(w_1*(y_1-y'_1)^2 + ... + w_100*(y_100-y'_100)^2)

where weights = [w_1, ..., w_100] is a given weight array.
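In plain NumPy (with dummy values), the per-sample loss I have in mind would look like this:

import numpy as np

# dummy values for a single sample; the weights are given, not learned
y_pred  = np.random.uniform(0, 1, 100)   # model output y
y_true  = np.random.uniform(0, 1, 100)   # expected output y'
weights = np.random.uniform(0, 1, 100)   # w_1, ..., w_100

loss = np.sqrt(np.sum(weights * (y_pred - y_true) ** 2)) / np.sum(weights)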

Could someone help me with implementing such a custom loss function? (I also use mini-batches while training)

Porphyria asked 7/5, 2021 at 15:21

I want to emphasize that you have two possibilities, depending on your problem:

[1] If the weights are equal for all your samples:

You can build a loss wrapper. Here is a dummy example:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# dummy data: 200 samples, 10 features, 100 outputs, one weight per output
n_sample = 200
X = np.random.uniform(0, 1, (n_sample, 10))
y = np.random.uniform(0, 1, (n_sample, 100))
W = np.random.uniform(0, 1, (100,)).astype('float32')

def custom_loss_wrapper(weights):
    def loss(true, pred):
        # total weight: sum(weights) per sample, scaled by the batch size
        sum_weights = tf.reduce_sum(weights) * tf.cast(tf.shape(pred)[0], tf.float32)
        resid = tf.sqrt(tf.reduce_sum(weights * tf.square(true - pred)))
        return resid / sum_weights
    return loss

inp = Input((10,))
x = Dense(256)(inp)
pred = Dense(100)(x)

model = Model(inp, pred)
model.compile('adam', loss=custom_loss_wrapper(W))

model.fit(X, y, epochs=3)

[2] If the weights differ between samples:

You should build your model using add_loss in order to dynamically take the per-sample weights into account. Here is a dummy example:

# (same imports as above)
n_sample = 200
X = np.random.uniform(0, 1, (n_sample, 10))
y = np.random.uniform(0, 1, (n_sample, 100))
W = np.random.uniform(0, 1, (n_sample, 100))  # one weight per sample and per output

def custom_loss(true, pred, weights):
    sum_weights = tf.reduce_sum(weights)
    resid = tf.sqrt(tf.reduce_sum(weights * tf.square(true - pred)))
    return resid / sum_weights

# targets and weights are fed as additional inputs
inp = Input((10,))
true = Input((100,))
weights = Input((100,))
x = Dense(256)(inp)
pred = Dense(100)(x)

model = Model([inp, true, weights], pred)
model.add_loss(custom_loss(true, pred, weights))
model.compile('adam', loss=None)

model.fit([X, y, W], y=None, epochs=3)

When using add_loss you should pass all the tensors involved in the loss as input layers and hand them to the loss function for the computation.

At inference time you can compute predictions as usual, simply removing true and weights from the inputs:

final_model = Model(model.input[0], model.output)
final_model.predict(X)
Stonyhearted answered 10/5, 2021 at 17:03

You can implement a custom weighted MSE in the following way:

import numpy as np
from tensorflow.keras import backend as K

def custom_mse(class_weights):
    def weighted_mse(gt, pred):
        # Formula:
        # (w_1*(y_1-y'_1)^2 + ... + w_100*(y_100-y'_100)^2) / sum(weights)
        return K.sum(class_weights * K.square(gt - pred)) / K.sum(class_weights)
    return weighted_mse

y_true  = np.array([[0., 1., 1., 0.], [0., 0., 1., 1.]])
y_pred  = np.array([[0., 1., 0., 1.], [1., 0., 1., 1.]])
weights = np.array([0.25, 0.50, 1., 0.75])

print(y_true.shape, y_pred.shape, weights.shape)
# (2, 4) (2, 4) (4,)

loss = custom_mse(class_weights=weights)
loss(y_true, y_pred).numpy()
# 0.8

Using it when compiling a model:

model.compile(loss=custom_mse(weights))

This will compute the MSE with the provided weight vector. However, in your question you use sqrt(...), so I presume you want the root of the weighted sum (an RMSE-style loss). To do that, you can use K.sqrt(K.sum(...)) / K.sum(...) inside weighted_mse, for example:
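Here is a minimal sketch of that RMSE-style variant, reusing the same wrapper pattern (the name custom_rmse is just for illustration):

def custom_rmse(class_weights):
    def weighted_rmse(gt, pred):
        # sqrt(w_1*(y_1-y'_1)^2 + ... + w_100*(y_100-y'_100)^2) / sum(weights)
        return K.sqrt(K.sum(class_weights * K.square(gt - pred))) / K.sum(class_weights)
    return weighted_rmse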


FYI, you may also be interested in the class_weight and sample_weight arguments of Model.fit; a short sample_weight sketch follows the quoted descriptions below. From the docs:

  • class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.

  • sample_weight: Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. This argument is not supported when x is a dataset, generator, or keras.utils.Sequence instance; instead, provide the sample_weights as the third element of x.
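As a rough, self-contained sketch of sample_weight (all names, shapes, and values here are dummies for illustration):

import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

X = np.random.uniform(0, 1, (200, 10))
y = np.random.uniform(0, 1, (200, 100))
sample_weight = np.random.uniform(0, 1, (200,))  # one scalar weight per sample

inp = Input((10,))
out = Dense(100)(inp)
m = Model(inp, out)
m.compile('adam', loss='mse')
m.fit(X, y, sample_weight=sample_weight, epochs=1)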

There is also loss_weights in Model.compile. From the docs:

loss_weights: Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model's outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.
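For instance, a minimal two-output sketch (the layer names out_a and out_b are made up for illustration):

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inp = Input((10,))
out_a = Dense(100, name='out_a')(inp)  # first output head
out_b = Dense(1, name='out_b')(inp)    # second output head

multi = Model(inp, [out_a, out_b])
multi.compile('adam',
              loss={'out_a': 'mse', 'out_b': 'mae'},
              loss_weights={'out_a': 1.0, 'out_b': 0.5})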

Thagard answered 10/5, 2021 at 6:45

A class version of the weighted mean squared error loss function:

import tensorflow as tf

class WeightedMSE(object):
    """Weighted mean squared error, normalized by the sum of the weights."""

    def __call__(self, y_true, y_pred, weights):
        sum_weights = tf.reduce_sum(weights)
        resid = tf.reduce_sum(weights * tf.square(y_true - y_pred))
        return resid / sum_weights
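A quick usage sketch with dummy tensors (the shapes are just for illustration):

wmse = WeightedMSE()
y_true  = tf.random.uniform((8, 100))  # batch of 8 targets
y_pred  = tf.random.uniform((8, 100))  # batch of 8 predictions
weights = tf.random.uniform((100,))    # one weight per output dimension
print(wmse(y_true, y_pred, weights).numpy())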
Plagio answered 9/8, 2022 at 7:17
