Multivariate Regression Neural Network Loss Function
I am doing multivariate regression with a fully connected multilayer neural network in TensorFlow. The network predicts 2 continuous float variables (y1, y2) given an input vector (x1, x2, ..., xN), i.e. the network has 2 output nodes. With 2 outputs the network does not seem to converge. My loss function is essentially the L2 distance between the prediction and truth vectors (each contains 2 scalars):

loss = tf.nn.l2_loss(tf.subtract(prediction, truthValues_placeholder)) + L2regularizationLoss  # half the squared L2 norm of the residual, plus the weight-regularization term

I am using L2 regularization and dropout, and my activation functions are tanh.

My questions: Is L2 distance the proper way to calculate loss for a multivariate network output? Are there tricks needed to get multivariate regression networks to converge (as opposed to single-output networks and classifiers)?

Aphanite answered 17/7, 2016 at 22:41
Yes, you can use L2 distance for multivariate regression. But I would recommend experimenting with an absolute L1 distance as well.
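As a minimal sketch of both options, assuming the TensorFlow 2.x API (the tensors below are illustrative stand-ins for your batch of predictions and targets, not values from the question):

import tensorflow as tf

# Stand-in tensors with shape (batch_size, 2): one row per example,
# one column per output variable (y1, y2). Values are made up.
prediction = tf.constant([[0.5, 1.2], [2.0, -0.3]])
truth = tf.constant([[0.4, 1.0], [1.5, 0.0]])

# L2 loss: mean over the batch of the squared Euclidean distance
# between each predicted (y1, y2) pair and its target.
l2_loss = tf.reduce_mean(tf.reduce_sum(tf.square(prediction - truth), axis=1))

# L1 loss: same reduction, but with absolute differences.
l1_loss = tf.reduce_mean(tf.reduce_sum(tf.abs(prediction - truth), axis=1))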

One problem with L2 is its susceptibility to outliers; one problem with L1 is its non-smoothness at the origin.

The Huber loss addresses both issues: it behaves like L2 near the origin and like absolute L1 as you move away from the origin.
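A sketch using TensorFlow's built-in implementation (tf.keras.losses.Huber); the delta of 1.0, the point where the loss switches from quadratic to linear, is an arbitrary choice for illustration:

import tensorflow as tf

# delta is the crossover threshold: quadratic (L2-like) inside it,
# linear (L1-like) outside it. 1.0 is just an example value.
huber = tf.keras.losses.Huber(delta=1.0)

prediction = tf.constant([[0.5, 1.2], [2.0, -0.3]])
truth = tf.constant([[0.4, 1.0], [1.5, 0.0]])

loss = huber(truth, prediction)  # scalar, averaged over batch and outputs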

Dudley answered 14/4, 2018 at 12:51
I'd also ask myself whether L2 distance in (y1, y2) space is meaningful for your particular problem. You could, for example, use mean squared error on each output individually; L1 loss already treats y1 and y2 independently. I have generally found L1 loss to produce better models in the multivariate regression problems I deal with. – Aniconic
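For illustration, a sketch of the per-output idea from the comment above, assuming TensorFlow 2.x (the tensors are hypothetical stand-ins):

import tensorflow as tf

prediction = tf.constant([[0.5, 1.2], [2.0, -0.3]])
truth = tf.constant([[0.4, 1.0], [1.5, 0.0]])

# Mean squared error for each output separately (axis 0 is the batch),
# so the two components can be inspected or weighted independently.
per_output_mse = tf.reduce_mean(tf.square(prediction - truth), axis=0)
loss = tf.reduce_sum(per_output_mse)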
