tensorflow - softmax ignore negative labels (just like caffe) [duplicate]

In Caffe, the SoftmaxWithLoss layer has an option to ignore all negative labels (-1) when computing probabilities, so that only the probabilities for 0 or positive labels add up to 1.

Is there a similar feature with TensorFlow's softmax loss?

Precambrian answered 23/8, 2016 at 2:32 Comment(0)

Just came up with a work-around: I created a one-hot tensor from the label indices using tf.one_hot (with depth set to the number of labels). tf.one_hot automatically zeros out all -1 indices in the resulting one-hot tensor (of shape [batch, number of labels]).

This enables softmax loss (i.e. tf.nn.softmax_cross_entropy_with_logits) to "ignore" all -1 labels.
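For concreteness, here is a minimal sketch of that work-around (the batch values, num_classes, and variable names are illustrative, not from the original answer):

```python
import tensorflow as tf

num_classes = 5
# Toy batch: the second sample is unlabeled (-1) and should be "ignored".
logits = tf.constant([[2.0, 0.5, -1.0, 0.3, 0.1],
                      [0.1, 0.2,  0.3, 0.4, 0.5]])
labels = tf.constant([0, -1])

# tf.one_hot maps index -1 to an all-zero row, so that sample's
# cross-entropy term is zero.
one_hot_labels = tf.one_hot(labels, depth=num_classes)
per_sample_loss = tf.nn.softmax_cross_entropy_with_logits(
    labels=one_hot_labels, logits=logits)

# Note: reduce_mean still divides by the full batch size, including the
# "ignored" rows, so the scaling differs from dropping those rows outright.
loss = tf.reduce_mean(per_sample_loss)
```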

Precambrian answered 24/8, 2016 at 2:46 Comment(1)
I don't think this solution is useful. I initially thought it worked, as it showed a zero loss contribution from the unlabeled samples. But I observed that it actually made training unstable (the loss shot through the roof on the second epoch). When I manually removed the unlabeled samples from the training batch, things stabilized again. – Namesake

I am not quite sure that your workaround is actually working.

Caffe's ignore_label semantically means "the label of a sample which has to be ignored", so its effect is that the gradient for that sample is not backpropagated at all, which is in no way guaranteed by the use of a one-hot vector.

On one hand, I expect any reasonable model to quickly learn to predict a zero (or small enough) value for that specific entry, because all samples have a zero in that entry, so the backpropagated signal from errors in that prediction will vanish relatively quickly.

On the other hand, you need to be aware that, from a mathematical point of view, Caffe's ignore_label and what you are doing are entirely different.

That said, I am new to TF and need exactly the same feature as Caffe's ignore_label.
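One quick way to probe that difference (a hypothetical check, not from the original answers; TF 2.x eager execution assumed) is to inspect the gradient produced by an all-zero one-hot row. The tf.nn.softmax_cross_entropy_with_logits documentation requires each label row to be a valid probability distribution and warns that the gradient may be incorrect otherwise, so an "ignored" all-zero row is not guaranteed to contribute a zero gradient the way Caffe's ignore_label does:

```python
import tensorflow as tf

logits = tf.Variable([[2.0, 0.5, -1.0]])  # a single "ignored" sample
zero_row = tf.zeros_like(logits)          # the all-zero one-hot row produced by label -1

with tf.GradientTape() as tape:
    loss = tf.nn.softmax_cross_entropy_with_logits(labels=zero_row,
                                                   logits=logits)

# The loss itself is 0 (there is no label term), but the gradient w.r.t. the
# logits need not be: if it comes out non-zero, the "ignored" sample still
# pushes the weights, unlike Caffe's ignore_label, which excludes the sample
# from backpropagation entirely.
print(loss.numpy())
print(tape.gradient(loss, logits).numpy())
```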

Caloric answered 16/11, 2016 at 8:35 Comment(1)
I found another "work-around", and this one might actually work since it does not artificially zero out -1 indices but gathers only the entries whose label is not -1: cls_score_x = tf.reshape(tf.gather(cls_soft, tf.where(tf.not_equal(labels_ind, -1))), [-1, 2]); label_x = tf.reshape(tf.gather(labels_ind, tf.where(tf.not_equal(labels_ind, -1))), [-1]); loss_cls = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(cls_score_x, label_x)), where cls_score_x and label_x are the filtered scores (cls_soft) and labels (labels_ind) with the -1 entries removed. – Precambrian
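A cleaned-up sketch of that filtering idea (using tf.boolean_mask instead of gather/where, the keyword-argument form of the loss, and illustrative tensor names; this is an assumption-laden rewrite rather than the exact code from the comment):

```python
import tensorflow as tf

logits = tf.constant([[1.0, 0.2], [0.3, 0.9], [0.5, 0.5]])
labels = tf.constant([0, -1, 1])  # -1 marks "ignore"

# Keep only the rows whose label is not -1, mirroring Caffe's ignore_label.
keep = tf.not_equal(labels, -1)
kept_logits = tf.boolean_mask(logits, keep)
kept_labels = tf.boolean_mask(labels, keep)

# Loss is computed only over the retained samples. Note that if every label
# in the batch is -1, the mean over an empty tensor is NaN, so guard for that
# case in real training code.
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=kept_labels,
                                                    logits=kept_logits))
```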
