PyTorch equivalence for softmax_cross_entropy_with_logits

Asked 14/9, 2017 at 12:0 Answered 22/9, 2020 at 17:9

I was wondering whether there is an equivalent PyTorch loss function for TensorFlow's softmax_cross_entropy_with_logits.

Poultry answered 14/9, 2017 at 12:0 Comment(5)

maybe torch.nn.CrossEntropyLoss ? – Pau 14/9, 2017 at 14:12

@YaroslavBulatov thanks for your reply! tf.nn.softmax_cross_entropy_with_logits requires that logits and labels must have the same shape, whereas torch.nn.CrossEntropyLoss has Input: (N,C) where C = number of classes; Target: (N), where each value is 0 <= targets[i] <= C-1. In addition, the latter does not use Softmax in the calculation. I'm looking for an exact replica of the TensorFlow function. – Poultry 14/9, 2017 at 14:28

This has been discussed in the pytorch forum. Hope it helps – Scorpius 15/9, 2017 at 5:34

@Scorpius : thank you for your suggestion! That page didn't solve my problem, but it led to https://mcmap.net/q/49597/-what-are-logits-what-is-the-difference-between-softmax-and-softmax_cross_entropy_with_logits which did. Many thanks to stackoverflowuser2010. – Poultry 17/9, 2017 at 22:5

@Poultry So how did you solve the problem? Mind to share? – Trinatte 4/12, 2019 at 14:25

is there an equivalent PyTorch loss function for TensorFlow's softmax_cross_entropy_with_logits?

`torch.nn.functional.cross_entropy`

This takes logits as input and applies log_softmax internally. Here "logits" are raw, unnormalized scores rather than probabilities, so they do not have to lie in the interval [0,1] or sum to 1.

Logits are also, by definition, the values that will be converted to probabilities. If you consider the name of the TensorFlow function, you will see that it is a pleonasm (the with_logits part already assumes softmax will be called).

In PyTorch it looks like this:

loss = F.cross_entropy(x, target)

Which is equivalent to :

lp = F.log_softmax(x, dim=-1)
loss = F.nll_loss(lp, target)
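
As a quick sanity check, the equivalence can be verified numerically. This is only a minimal sketch: the tensor sizes, the seed, and the variable names are arbitrary choices for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 5)                # logits: 4 samples, 5 classes (illustrative sizes)
target = torch.tensor([0, 2, 1, 4])  # class indices, one per sample

# One call on raw logits ...
loss_direct = F.cross_entropy(x, target)
# ... versus log_softmax followed by the negative log-likelihood loss.
loss_two_step = F.nll_loss(F.log_softmax(x, dim=-1), target)

same = torch.allclose(loss_direct, loss_two_step)
print(same)
```

Both values are scalars because the default reduction averages over the batch.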

It is not F.binary_cross_entropy_with_logits, because that function assumes multi-label classification (an independent sigmoid per class rather than a softmax over classes):

torch.sigmoid + F.binary_cross_entropy = F.binary_cross_entropy_with_logits
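
The identity above can be checked the same way. A minimal sketch, assuming float targets of the same shape as the logits (the sizes and names are illustrative):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
z = torch.randn(3, 4)                    # one logit per (sample, label) pair
y = torch.randint(0, 2, (3, 4)).float()  # independent 0/1 labels, same shape as z

fused = F.binary_cross_entropy_with_logits(z, y)
two_step = F.binary_cross_entropy(torch.sigmoid(z), y)

# The fused version is more stable numerically, but on moderate logits both agree.
bce_match = torch.allclose(fused, two_step, atol=1e-6)
print(bce_match)
```

Note that each label gets its own sigmoid here, so the per-sample outputs are not a distribution over classes.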

It is not torch.nn.functional.nll_loss either, because that function takes log-probabilities (the output of log_softmax()), not logits.
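
For the case raised in the question, where the labels have the same shape as the logits (one-hot or soft labels) and one loss value per example is wanted, something along these lines should behave like the TensorFlow op. This is a hedged sketch, not a library function: the helper below is hypothetical and merely named after the op it imitates.

```python
import torch
import torch.nn.functional as F

def softmax_cross_entropy_with_logits(logits, labels, dim=-1):
    """Hypothetical helper: per-example cross entropy for labels given as distributions."""
    return -(labels * F.log_softmax(logits, dim=dim)).sum(dim=dim)

torch.manual_seed(0)
logits = torch.randn(4, 5)
hard = torch.tensor([0, 2, 1, 4])
labels = F.one_hot(hard, num_classes=5).float()  # same shape as logits

per_example = softmax_cross_entropy_with_logits(logits, labels)  # shape (4,), no reduction

# With one-hot labels, the mean matches the built-in loss on class indices.
matches_builtin = torch.allclose(per_example.mean(), F.cross_entropy(logits, hard))
print(matches_builtin)
```

Newer PyTorch releases (1.10 and later, as far as I know) also let F.cross_entropy take class probabilities as the target directly, in which case the helper is unnecessary.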

Clematis answered 22/9, 2020 at 17:9 Comment(1)

This is the right answer. Documentation pytorch.org/docs/stable/generated/… shows that "input (Tensor) – Predicted unnormalized scores (often referred to as logits)." – Nixon 31/5, 2022 at 22:59

A solution

from thexp.calculate.tensor import onehot
from torch.nn import functional as F
import torch

logits = torch.rand([3,10])
ys = torch.tensor([1,2,3])
targets = onehot(ys,10)
# compare with allclose: an exact == on floats can fail due to rounding
assert torch.allclose(F.cross_entropy(logits, ys), -torch.mean(torch.sum(F.log_softmax(logits, dim=1) * targets, dim=1)))

onehot function:

def onehot(labels: torch.Tensor, label_num):
    return torch.zeros(labels.shape[0], label_num, device=labels.device).scatter_(1, labels.view(-1, 1), 1)
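
If the extra dependency is not wanted, the built-in F.one_hot should give the same encoding. A small sketch; the helper is repeated here only so that the snippet runs on its own.

```python
import torch
import torch.nn.functional as F

def onehot(labels, label_num):
    # Same scatter-based helper as above, repeated for self-containment.
    return torch.zeros(labels.shape[0], label_num, device=labels.device).scatter_(1, labels.view(-1, 1), 1)

ys = torch.tensor([1, 2, 3])
builtin = F.one_hot(ys, num_classes=10).float()  # F.one_hot returns int64, so cast to float
same_encoding = torch.equal(builtin, onehot(ys, 10))
print(same_encoding)
```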

Bedwarmer answered 30/8, 2020 at 13:44 Comment(0)

Following the pointers in several threads, I ended up with the following conversion. I will post my solution here in case anyone else lands on this thread. It is modified from here, and behaves as expected within this context.

# pred is the prediction (logits) with shape [H*W, C]
# gt is the target with shape [H*W]
# idx is the boolean array on H*W for masking

# Tensorflow version (returns one loss value per kept pixel)
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits=tf.boolean_mask(pred, idx),
          labels=tf.boolean_mask(gt, idx))

# Pytorch version (averages the per-pixel losses)
logp = torch.nn.functional.log_softmax(pred[idx], dim=1)
logpy = torch.gather(logp, 1, gt[idx].view(-1, 1))
loss = -logpy.mean()
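
For what it's worth, the gather-based PyTorch version should give the same number as calling F.cross_entropy on the masked tensors. A minimal sketch, assuming pred holds one row of logits per pixel and using made-up sizes:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_pixels, num_classes = 6, 3                     # illustrative sizes
pred = torch.randn(num_pixels, num_classes)        # logits, one row per pixel
gt = torch.randint(0, num_classes, (num_pixels,))  # class index per pixel
idx = torch.tensor([True, False, True, True, False, True])  # pixels to keep

# Manual version: log-softmax, then pick the log-probability of the true class.
logp = F.log_softmax(pred[idx], dim=1)
logpy = torch.gather(logp, 1, gt[idx].view(-1, 1))
manual = -logpy.mean()

# Built-in version on the same masked rows.
builtin = F.cross_entropy(pred[idx], gt[idx])

masked_match = torch.allclose(manual, builtin)
print(masked_match)
```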

Ahmednagar answered 12/4, 2019 at 1:51 Comment(1)

What's the boolean mask for? – Trinatte 4/12, 2019 at 2:58

@Blade Here's the solution I came up with!

import torch
import torch.nn as nn
import torch.nn.functional as F


class masked_softmax_cross_entropy_loss(nn.Module):
    r"""my version of masked tf.nn.softmax_cross_entropy_with_logits"""
    def __init__(self, weight=None):
        super(masked_softmax_cross_entropy_loss, self).__init__()
        self.register_buffer('weight', weight)

    def forward(self, input, target, mask):
        if not target.is_same_size(input):
            raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))

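        log_probs = F.log_softmax(input, dim=1)  # numerically stable log(softmax(input))
        loss = -torch.sum(target * log_probs, 1)
        loss = torch.unsqueeze(loss, 1)
        mask = mask / torch.mean(mask)  # out-of-place, so the caller's mask is not modified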
        mask = torch.unsqueeze(mask, 1)
        loss = torch.mul(loss, mask)
        return torch.mean(loss)
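
The mask normalization in forward can look odd at first: dividing the mask by its mean and then averaging over all rows works out to a plain average over the masked-in rows. A short sketch of that identity, assuming a binary float mask (the sizes and names are illustrative):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 3)
labels = F.one_hot(torch.tensor([0, 1, 2, 1, 0]), num_classes=3).float()
mask = torch.tensor([1.0, 1.0, 0.0, 0.0, 1.0])  # 1 = labelled node, 0 = ignored

per_node = -(labels * F.log_softmax(logits, dim=1)).sum(dim=1)

# Normalization as in the module: rescale the mask, then average over all nodes.
rescaled = (per_node * (mask / mask.mean())).mean()

# Plain average over the labelled nodes only.
labelled_mean = per_node[mask.bool()].mean()

mask_match = torch.allclose(rescaled, labelled_mean)
print(mask_match)
```

So, for a 0/1 mask, indexing with the mask and taking the mean is a simpler way to get the same loss.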

Btw: I needed this loss function at the time (Sept 2017) because I was attempting to translate Thomas Kipf's GCN (see https://arxiv.org/abs/1609.02907) code from TensorFlow to PyTorch. However, I now notice that Kipf has done this himself (see https://github.com/tkipf/pygcn), and in his code, he simply uses the built-in PyTorch negative log likelihood loss (his model's output is already passed through log_softmax, so nll_loss receives log-probabilities), i.e.

loss_train = F.nll_loss(output[idx_train], labels[idx_train])

Hope this helps.

~DV

Poultry answered 5/12, 2019 at 15:6 Comment(0)
