What does axis=[1,2,3] mean in K.sum in keras backend?

I'm trying to implement a custom loss function for my CNN model. I found an IPython notebook that implements a custom loss function named Dice, as follows:

from keras import backend as K
from keras.losses import binary_crossentropy
smooth = 1.

def dice_coef(y_true, y_pred, smooth=1):
    intersection = K.sum(y_true * y_pred, axis=[1,2,3])
    union = K.sum(y_true, axis=[1,2,3]) + K.sum(y_pred, axis=[1,2,3])
    return K.mean( (2. * intersection + smooth) / (union + smooth), axis=0)

def bce_dice(y_true, y_pred):
    return binary_crossentropy(y_true, y_pred)-K.log(dice_coef(y_true, y_pred))

def true_positive_rate(y_true, y_pred):
    return K.sum(K.flatten(y_true)*K.flatten(K.round(y_pred)))/K.sum(y_true)

seg_model.compile(optimizer = 'adam', 
              loss = bce_dice, 
              metrics = ['binary_accuracy', dice_coef, true_positive_rate])

I have never used the Keras backend before and am confused by its matrix calculations. So, I created some tensors to see what's happening in the code:

import numpy as np

val1 = np.arange(24).reshape((4, 6))
y_true = K.variable(value=val1)

val2 = np.arange(10,34).reshape((4, 6))
y_pred = K.variable(value=val2)

Now I run the dice_coef function:

result = K.eval(dice_coef(y_true=y_true, y_pred=y_pred))
print('result is:', result)

But it gives me this error:

ValueError: Invalid reduction dimension 2 for input with 2 dimensions. for 'Sum_32' (op: 'Sum') with input shapes: [4,6], [3] and with computed input tensors: input[1] = <1 2 3>.

Then I changed all of [1,2,3] to -1 just like below:

def dice_coef(y_true, y_pred, smooth=1):
    intersection = K.sum(y_true * y_pred, axis=-1)
    # intersection = K.sum(y_true * y_pred, axis=[1,2,3])
    # union = K.sum(y_true, axis=[1,2,3]) + K.sum(y_pred, axis=[1,2,3])
    union = K.sum(y_true, axis=-1) + K.sum(y_pred, axis=-1)
    return K.mean( (2. * intersection + smooth) / (union + smooth), axis=0)

Now it gives me a value.

result is: 14.7911625


  1. What is [1,2,3]?
  2. Why does the code work when I change [1,2,3] to -1?
  3. What does this dice_coef function do?
Quackenbush answered 10/1, 2019 at 10:18 Comment(2)
[1,2,3] means that the sum (seen as an aggregation operation) runs through the second to fourth axes of the tensor. Babita
Likewise, axis=-1 in sum means that the sum runs over the last axis. Babita

Just like in NumPy, you can specify the axis along which you want to perform a certain operation. For example, for a 4-D array, we can sum along a specific axis like this:

>>> a = np.arange(150).reshape((2, 3, 5, 5))
>>> a.sum(axis=0).shape
(3, 5, 5)
>>> a.sum(axis=0, keepdims=True).shape
(1, 3, 5, 5)
>>> a.sum(axis=1, keepdims=True).shape
(2, 1, 5, 5)

If we feed a tuple, we can perform this operation along multiple axes.

>>> a.sum(axis=(1, 2, 3), keepdims=True).shape
(2, 1, 1, 1)

If the argument is -1, the operation is performed over the last axis, regardless of how many axes there are.

>>> a.sum(axis=-1, keepdims=True).shape
(2, 3, 5, 1)

This should have clarified points 1 and 2. Since the axis argument is (1, 2, 3), the tensor needs at least 4 axes for the operation to be valid; your 4 x 6 tensors have only 2, hence the error. Try changing your variables to something like val1 = np.arange(24).reshape((2, 2, 2, 3)) and it all works.
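
A minimal NumPy sketch of the same reduction, assuming inputs shaped (batch, height, width, channels). Here dice_coef_np is a hypothetical stand-in for the Keras function, not the notebook's code. Summing over axes (1, 2, 3) leaves one value per sample, and a 2-D input is rejected for the same reason as in the question.

```python
import numpy as np

def dice_coef_np(y_true, y_pred, smooth=1.0):
    # Hypothetical NumPy stand-in for the Keras dice_coef in the question.
    # Axes 1, 2, 3 are reduced, so one value per sample (axis 0) remains.
    intersection = np.sum(y_true * y_pred, axis=(1, 2, 3))
    union = np.sum(y_true, axis=(1, 2, 3)) + np.sum(y_pred, axis=(1, 2, 3))
    per_sample = (2.0 * intersection + smooth) / (union + smooth)
    return per_sample, per_sample.mean(axis=0)

# A batch of 2 binary masks, each 2 x 2 with 1 channel.
y_true = np.zeros((2, 2, 2, 1))
y_true[0, 0, 0, 0] = 1.0
y_true[1, :, :, 0] = 1.0
y_pred = y_true.copy()           # a perfect prediction

per_sample, mean_dice = dice_coef_np(y_true, y_pred)
print(per_sample.shape)          # one Dice value per sample
print(mean_dice)

# A 2-D input has no axes 2 and 3, so the reduction is rejected.
try:
    dice_coef_np(np.ones((4, 6)), np.ones((4, 6)))
    failed = False
except ValueError as err:        # NumPy's AxisError subclasses ValueError
    failed = True
    print("error:", err)
```

The per-sample array is what K.mean(..., axis=0) averages in the original function.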

The model seems to use a combined binary cross-entropy and Dice loss, and dice_coef(), as the name suggests, calculates the Dice coefficient. I'm not sure what the purpose of smooth is, but if it were meant to avoid division by 0, you'd expect a small number, like 1e-6.
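
One way to get a feel for smooth is to try it on tiny masks. The sketch below uses NumPy rather than Keras, and dice is a hypothetical helper. It suggests that smooth decides the score when both masks are empty, and that a value of 1 noticeably inflates the score of small masks that do not overlap.

```python
import numpy as np

def dice(y_true, y_pred, smooth):
    # Hypothetical helper: Dice over all elements of a single mask pair.
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred)
    return (2.0 * intersection + smooth) / (union + smooth)

empty = np.zeros((4, 4))

a = np.zeros((4, 4))
a[0, 0] = 1.0                    # one foreground pixel

b = np.zeros((4, 4))
b[3, 3] = 1.0                    # one foreground pixel elsewhere

# Both masks empty: without smooth this would be 0 / 0.
both_empty = dice(empty, empty, smooth=1.0)

# No overlap at all: the score should ideally be close to 0.
miss_smooth_1 = dice(a, b, smooth=1.0)      # (0 + 1) / (2 + 1)
miss_smooth_tiny = dice(a, b, smooth=1e-6)  # close to 0

print(both_empty, miss_smooth_1, miss_smooth_tiny)
```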

Embryology answered 10/1, 2019 at 11:13 Comment(0)

What is [1,2,3]?

These numbers specify the dimensions over which we want to sum. The smallest number refers to the outermost dimension and the largest to the innermost. See the example:

import tensorflow as tf

a = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(tf.reduce_sum(a, axis=2).numpy())
# [[ 3  7]
#  [11 15]]
print(tf.reduce_sum(a, axis=1).numpy())
# [[ 4  6]
#  [12 14]]
print(tf.reduce_sum(a, axis=0).numpy())
# [[ 6  8]
#  [10 12]]

In the above example, axis = 2 refers to the innermost entries, which are [1,2], [3,4], [5,6], and [7,8]. Summing each of them gives the tensor [[3, 7], [11, 15]]. The same idea applies to the other axes.

Why does the code work when I change [1,2,3] to -1?

When we do not specify any axis, or when we specify all axes, the sum runs over all tensor elements, which reduces the tensor to a single scalar. See the example:

a = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(tf.reduce_sum(a).numpy()) # 36
print(tf.reduce_sum(a, axis=[0,1,2]).numpy()) # 36

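If the tensor has 3 dimensions [0, 1, 2], axis = -1 is equivalent to axis = 2. Your test tensors have 2 dimensions, so axis = -1 means axis = 1, which exists, whereas axes 2 and 3 do not. See here for a complete tutorial on Python indexing.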
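
Applied to the 4 x 6 tensors from the question, this can be checked with a short NumPy sketch (using NumPy instead of the Keras backend is a substitution made here for illustration). With axis=-1 the sum yields one value per row, and the mean of the four ratios should come out at about 14.79, the value reported in the question. A result above 1 is a hint that these test tensors are not masks with values in [0, 1].

```python
import numpy as np

y_true = np.arange(24, dtype=np.float64).reshape((4, 6))
y_pred = np.arange(10, 34, dtype=np.float64).reshape((4, 6))
smooth = 1.0

# For a 2-D array, axis=-1 and axis=1 name the same axis.
assert np.array_equal(y_true.sum(axis=-1), y_true.sum(axis=1))

intersection = np.sum(y_true * y_pred, axis=-1)             # shape (4,)
union = np.sum(y_true, axis=-1) + np.sum(y_pred, axis=-1)   # shape (4,)
result = np.mean((2.0 * intersection + smooth) / (union + smooth), axis=0)

print(intersection.shape)
print(result)
```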

What does this dice_coef function do?

See here for a complete explanation about dice_coef.

Heligoland answered 10/1, 2019 at 11:15 Comment(0)

The first two parts of your question have already been explained. I'll explain the last one.

  • What does this dice_coef function do?

The dice_coef function calculates the Dice similarity coefficient, which is a similarity metric rather than a loss in itself; the bce_dice loss in your code is built from it. In the context of image segmentation, the Dice similarity coefficient measures how well two contours overlap.

[Image. Source: Coursera]

Its value ranges from 0 to 1.

  • 0 means complete mismatch, i.e. they don't overlap at all
  • 1 means perfect match, i.e. they completely overlap

In general, for two sets A and B, the Dice similarity coefficient is defined as:

DSC(A,B)=( 2 × |A ∩ B| ) / ( |A|+|B| ).

To avoid division by zero, a small number ϵ is added:

DSC(A,B)= ( 2 × |A ∩ B| + ϵ ) / ( |A| + |B|+ ϵ ).

In your case:

  • A is y_true

  • B is y_pred

  • ϵ is smooth
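
The link between the set formula and the code can be sketched as follows. This is an illustration with made-up 3 x 3 masks and a hypothetical helper, assuming binary masks: multiplying two binary masks marks the pixels that are 1 in both, so the sum of the product is |A ∩ B|, and the sum of each mask is |A| or |B|.

```python
import numpy as np

# Made-up binary masks for illustration.
y_true = np.array([[1, 1, 0],
                   [0, 1, 0],
                   [0, 0, 0]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [0, 0, 0]])

def pixel_set(mask):
    # Hypothetical helper: the set of (row, col) positions where mask == 1.
    return {(int(r), int(c)) for r, c in zip(*np.nonzero(mask))}

A, B = pixel_set(y_true), pixel_set(y_pred)

# Set form: 2 * |A ∩ B| / (|A| + |B|)
dsc_sets = 2 * len(A & B) / (len(A) + len(B))

# Mask form, as in dice_coef but without smooth.
dsc_masks = 2 * np.sum(y_true * y_pred) / (np.sum(y_true) + np.sum(y_pred))

print(dsc_sets, dsc_masks)
```

Both forms agree only for binary masks; with soft predictions in [0, 1] the mask form becomes a continuous relaxation of the set formula.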

Iasis answered 14/12, 2021 at 7:36 Comment(0)
