I understand that PyTorch's LogSoftmax function is basically just a more numerically stable way to compute log(softmax(x)). Softmax converts the output of a Linear layer into a categorical probability distribution.
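For example, if I'm right about this, the two calls below should print the same values (the logits are made up, just for illustration):

```python
import torch
import torch.nn as nn

# Made-up logits, as if they came out of a Linear layer
logits = torch.tensor([[2.0, 1.0, 0.1]])

# LogSoftmax(x) should match log(Softmax(x)), just computed more stably
print(nn.LogSoftmax(dim=1)(logits))
print(torch.log(nn.Softmax(dim=1)(logits)))
```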
The PyTorch documentation says that CrossEntropyLoss combines nn.LogSoftmax() and nn.NLLLoss() in a single class.
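As a sanity check on that claim, I'd expect something like this to print the same loss twice (random logits and arbitrary target labels, just for the check):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 3)            # fake outputs of a Linear layer
targets = torch.tensor([0, 2, 1, 2])  # arbitrary class labels

# CrossEntropyLoss applied to raw logits...
print(nn.CrossEntropyLoss()(logits, targets))
# ...should equal NLLLoss applied to LogSoftmax of the same logits
print(nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets))
```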
Looking at NLLLoss, I'm still confused... are there two logs being used? I think of the negative log as the information content of an event (as in entropy).
After a bit more digging, I think NLLLoss assumes you're actually passing in log probabilities rather than plain probabilities. Is this correct? It seems kind of weird if so...
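My reading so far is that, given log probabilities, NLLLoss just picks out the entry for the target class, negates it, and averages, with no extra log of its own. If that's right, this should match (hypothetical inputs again):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
log_probs = nn.LogSoftmax(dim=1)(torch.randn(4, 3))
targets = torch.tensor([0, 2, 1, 2])

# Built-in NLLLoss (default reduction='mean')
print(nn.NLLLoss()(log_probs, targets))
# Manual version: negate the log probability of each target class, then average
print(-log_probs[torch.arange(4), targets].mean())
```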