I am doing multivariate regression with a fully connected multilayer neural network in TensorFlow. The network predicts 2 continuous float variables (y1, y2) given an input vector (x1, x2, ..., xN), i.e. the network has 2 output nodes. With 2 outputs the network does not seem to converge. My loss function is essentially the L2 distance between the prediction and truth vectors (each contains 2 scalars):
    loss = tf.nn.l2_loss(tf.subtract(prediction, truthValues_placeholder)) + L2regularizationLoss
I am using L2 weight regularization and dropout, and all activation functions are tanh.
My questions:

- Is L2 distance a proper way to calculate the loss for a multivariate network output?
- Are there tricks needed to get multivariate regression networks to converge, as opposed to single-output networks and classifiers?