What is y_true and y_pred when creating a custom metric in Keras?

I want to implement my custom metric in Keras. According to the documentation, my custom metric should be defined as a function that takes as input two tensors, y_pred and y_true, and returns a single tensor value.

However, I'm confused about what exactly these tensors y_pred and y_true will contain while the optimization is running. Is it just one data point? Is it the whole batch? The whole epoch (probably not)? Is there a way to obtain these tensors' shapes?

Can someone point to a trustworthy place where I can get this information? Any help would be appreciated. Not sure if relevant, but I'm using TensorFlow backend.


Things I tried so far, in order to answer this:

  • Checking the Keras metrics documentation (no explanation there about what these tensors are).
  • Checking the source code for the Keras metrics and trying to understand these tensors by looking at the Keras implementations of other metrics (this seems to suggest that y_true and y_pred hold the labels for an entire batch, but I'm not sure).
  • Reading these stackoverflow questions: 1, 2, 3, and others (none answer my question, most are centered on the OP not clearly understanding the difference between a tensor and the values computed using that tensor during the session).
  • Printing the values of y_true and y_pred during the optimization, by defining a metric like this:
    from keras import backend as K

    def test_metric(y_true, y_pred):
        y_true = K.print_tensor(y_true)
        y_pred = K.print_tensor(y_pred)
        return y_true - y_pred

(unfortunately these don't print anything during the optimization).

Digitize answered 10/10, 2017 at 9:18 Comment(3)
This one might help: #43577422 – Weisler
@AlexOzerov: Thanks. I read it, but it's not clear to me how that helps; can you elaborate? – Digitize
Thank you! I had exactly this question! – Akers

y_true and y_pred

The tensor y_true is the true data (or target, ground truth) you pass to the fit method.
It's a conversion of the numpy array y_train into a tensor.

The tensor y_pred is the data predicted (calculated, output) by your model.

Usually, both y_true and y_pred have exactly the same shape. A few of the losses, such as the sparse ones, may accept them with different shapes.
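For illustration, here is a minimal sketch of a custom metric (the name mean_abs_diff and the compile arguments below are made up for this sketch): it receives one batch of each tensor, with matching shapes, and returns a value computed from them:

    from keras import backend as K

    def mean_abs_diff(y_true, y_pred):
        # y_true and y_pred each hold one batch: shape (batch_size, ...) matching the model output
        return K.mean(K.abs(y_pred - y_true))

    # hypothetical usage:
    # model.compile(optimizer='adam', loss='mse', metrics=[mean_abs_diff])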


The shape of y_true

It contains an entire batch. Its first dimension is always the batch size, and it must exist, even if the batch has only one element.

Two very easy ways to find the shape of y_true are:

  • check your true/target data: print(Y_train.shape)
  • check your model.summary() and see the last output

In both cases, though, the first dimension will be the batch size (shown as None in the summary).

So, if your last layer outputs (None, 1), the shape of y_true is (batch, 1). If the last layer outputs (None, 200, 200, 3), then y_true will be (batch, 200, 200, 3).
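As a rough sketch (the toy model and data below are invented for illustration, not taken from the question), both checks can be done before training:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    X_train = np.random.rand(100, 8)   # hypothetical inputs
    Y_train = np.random.rand(100, 1)   # hypothetical targets

    model = Sequential([Dense(16, activation='relu', input_shape=(8,)),
                        Dense(1)])
    model.compile(optimizer='adam', loss='mse')

    print(Y_train.shape)   # (100, 1) -> inside a metric, y_true will be (batch, 1)
    model.summary()        # last output shape: (None, 1), where None stands for the batch size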


Custom metrics and loss functions

Unfortunately, printing inside custom metrics will not reveal their content (unless you are running in eager mode and have actually passed data through every step of the model).
You can see their shapes with print(K.int_shape(y_pred)), for instance.

Remember that these libraries first "compile a graph" and only later "run it with data". When you define your loss, you're in the compile phase, and asking for actual data values requires the model to run.

But even if the result of your metric is multidimensional, Keras will automatically reduce it to a single scalar for that metric. (Not sure what the operation is, but very probably a K.mean() hidden under the hood. It is actually useful to return the entire batch, because that lets Keras apply other operations, such as sample weights, before reducing.)
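A small sketch of both points (the metric name below is invented): the print runs once, while the graph is being built, so it can show static shapes but not actual values, and the metric may return a per-sample tensor, which Keras then reduces to the single number shown in the progress bar:

    from keras import backend as K

    def per_sample_error(y_true, y_pred):
        # executed at compile/graph-construction time, not once per batch
        print(K.int_shape(y_true), K.int_shape(y_pred))
        # returning one value per sample is fine; Keras reduces it
        # (presumably with a mean) and can apply sample weights first
        return K.abs(y_true - y_pred)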

Sources: after you get used to Keras, this understanding becomes natural simply from reading this part of the documentation:

y_true: True labels. Theano/TensorFlow tensor.
y_pred: Predictions. Theano/TensorFlow tensor of the same shape as y_true.

"True labels" means the true/target data. "Labels" is a badly chosen word here; it is only really labels in classification models.
"Predictions" means the output of your model.

Matrilateral answered 10/10, 2017 at 12:52 Comment(11)
I have a question regarding y_true. My training data (a numpy array) has shape (100,). However, inside a metric, e.g. accuracy, it has shape TensorShape([Dimension(None), Dimension(None)]). Then, in the Keras accuracy metric, they compute K.max(y_true, axis=-1). What is the second dimension? Why do they take the argmax over this dimension instead of the first one? – Carob
If y_train is (100,), it was probably changed to (100, 1). This accuracy metric assumes you are using one-hot classes. – Cornwall
Okay. So when not using one-hot classes, I would have to change the accuracy computation to K.max(y_true, axis=0)? – Carob
We need to understand what your data is to answer that. Is it a binary (0 or 1) result? If so, you can use binary_crossentropy as the loss function, and Keras will automatically use a suitable accuracy for that, based on K.round(y_pred): github.com/fchollet/keras/blob/master/keras/metrics.py (see the sketch after these comments). – Cornwall
Can anyone tell us how to implement a custom metric that calculates mean(y_pred - y_true)? I just want the average difference between the predicted and true values. – Saddlebag
Use metrics=['mae'] (mean absolute error), or use def metr(true, pred): return K.mean(pred - true) with metrics=[metr]. – Cornwall
@DanielMöller I want to be sure whether y_pred has the same shape as y_true, from the last section of your answer. E.g. if my model output y_pred has shape [None, seq_length, feature_size], then y_true is also a 3-D tensor (verified), even though I pass only a 2-D tensor to the fit method. So the statement should be read as: y_true has the same shape as y_pred. – Grady
"Both y_true and y_pred have exactly the same shape, always." Across all dimensions? E.g., in this question (https://mcmap.net/q/409970/-keras-model-using-tensorflow-distribution-for-loss-fails-with-batch-size-gt-1/829332), y_pred is (None, 6) but (I assumed, perhaps wrongly) y_true is (None, 1). – Shaynashayne
Well, there are losses that accept a different y_true, especially the "sparse" types. The usual case is an exact shape match, though. – Cornwall
@DanielMöller: I am passing y_true as a 2-D array and y_pred is a 3-D array. But within the custom loss, y_true becomes 3-D rather than y_pred becoming 2-D, which is contrary to your last comment (y_pred's shape should change to y_true's shape). Why is this? Should y_true's shape change to match y_pred's, or vice versa? – Grady
If it's a custom loss, you control it the way you want. – Cornwall
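To make the accuracy discussion in these comments concrete, here is a paraphrased sketch of the built-in accuracies (an approximation of the linked keras/metrics.py from that era, not a verbatim copy): binary accuracy rounds the prediction, while categorical accuracy compares the argmax over the last (class) axis, which is why it assumes one-hot targets.

    from keras import backend as K

    def binary_accuracy_sketch(y_true, y_pred):
        # binary targets: round each prediction and compare element-wise
        return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)

    def categorical_accuracy_sketch(y_true, y_pred):
        # one-hot targets: compare the predicted class index with the true one
        return K.cast(K.equal(K.argmax(y_true, axis=-1),
                              K.argmax(y_pred, axis=-1)),
                      K.floatx())

If the targets are integer class indices rather than one-hot vectors, the "sparse" variants are the ones intended for that case.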

y_true holds the true values (labels), and y_pred holds the values that your NN model predicted.

The size (shape) of these tensors is determined by the size of the batches (the batch size).

Fissionable answered 10/10, 2017 at 11:13 Comment(2)
Could you elaborate your answer a bit more? Let's say that the output of my classifier network is N-dimensional (i.e. a pmf over N classes), and my batch size is B. Then would the shape of, e.g., y_true be (N, B) or (B, N)? Or something else? – Digitize
Also, can you point to any references that support your statement? – Digitize

y_true is the target values and y_pred is the values predicted by the model. The position of the parameters in the function also matters. You can check this by training on only one example and observing the value of the function when it is used as a metric. Note: while checking this, avoid using a validation split (there are not enough examples for a split to happen) and avoid scaling the examples, for easier inspection.
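A sketch of that check (the toy model and numbers below are invented here): a metric that returns K.mean(y_true) should display the known target value during fit, which confirms which argument is which:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense
    from keras import backend as K

    def show_true(y_true, y_pred):
        return K.mean(y_true)   # should show ~5.0, i.e. the known target below

    def show_pred(y_true, y_pred):
        return K.mean(y_pred)   # should show whatever the model currently outputs

    model = Sequential([Dense(1, input_shape=(3,))])
    model.compile(optimizer='sgd', loss='mse', metrics=[show_true, show_pred])

    # a single example with a known target of 5.0; no validation split, no scaling
    model.fit(np.array([[1.0, 2.0, 3.0]]), np.array([[5.0]]), epochs=1)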

Pokelogan answered 18/5, 2020 at 5:51 Comment(0)
