Multi-dimensional regression with Keras

I want to use Keras to train a neural network for 2-dimensional regression.

My input is a single number, and my output has two numbers:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import initializers
from keras.optimizers import Adam

model = Sequential()
model.add(Dense(16, input_shape=(1,), kernel_initializer=initializers.constant(0.0), bias_initializer=initializers.constant(0.0)))
model.add(Activation('relu'))
model.add(Dense(16, kernel_initializer=initializers.constant(0.0), bias_initializer=initializers.constant(0.0)))
model.add(Activation('relu'))
model.add(Dense(2, kernel_initializer=initializers.constant(0.0), bias_initializer=initializers.constant(0.0)))
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='mean_squared_error', optimizer=adam)

I then created some dummy data for training:

inputs = np.zeros((10, 1), dtype=np.float32)
targets = np.zeros((10, 2), dtype=np.float32)

for i in range(10):
    inputs[i] = i / 10.0
    targets[i, 0] = 0.1
    targets[i, 1] = 0.01 * i

And finally, I trained in a loop, passing the whole dataset as a single batch, whilst testing on the training data:

while True:

    loss = model.train_on_batch(inputs, targets)

    test_outputs = model.predict(inputs)

    print(test_outputs)

The problem is, the outputs printed out are as follows:

[0.1, 0.045]
[0.1, 0.045]
[0.1, 0.045]
.....
.....
.....

So, whilst the first dimension is correct (0.1), the second dimension is not. It should run through [0.00, 0.01, 0.02, ..., 0.09], but the network outputs 0.045 for every sample, which is simply the average of all the values that the second dimension should take.
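
That makes sense in hindsight: under mean squared error, a model that can only emit one constant per output minimises its loss at the mean of the targets. A quick check with NumPy:

import numpy as np

# Second-dimension targets from the dummy data: 0.00, 0.01, ..., 0.09
second_dim = 0.01 * np.arange(10)
print(second_dim.mean())  # 0.045 -- the constant the network keeps printing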

What am I doing wrong?

Exocentric answered 15/5, 2017 at 17:0 Comment(5)
You're probably training too little. Often you need thousands of samples to train a network properly. – Government
It would be a very good idea to normalize your output or use some activation function, because you're expecting very small results while your layers use "relu" and the last one uses no activation at all. – Government
You trained for only one epoch/batch; that is not going to work. You should train until the loss converges to a low value. – Prohibition
Thanks for these comments. I did train until the loss converged, over multiple epochs, so I don't think these issues are the cause. It is a very simple function to learn, and there is no noise in the training data, so it should not need much training data. The problem seems more fundamental to how I am setting up a multi-dimensional regression loss. Any ideas? – Exocentric
@Exocentric There is no evidence of anything you just mentioned in your question; provide that information, such as what loss value it converged to, what the validation loss is, etc. Your assumption about how much data you need is plain wrong: in this case there are more parameters in your model than samples in your data. – Prohibition

The problem is that you are initializing all the weights to zero. If all the weights are the same, then all the gradients are the same: every neuron in a layer receives the same update, so the layers behave as if they each had a single neuron. Remove those initializers so that the default random initialization is used, and it works:

from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(16, input_shape=(1,)))
model.add(Activation('relu'))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(2))
model.compile(loss='mean_squared_error', optimizer='Adam')

The result after 1000 epochs:

Epoch 1000/1000
10/10 [==============================] - 0s - loss: 5.2522e-08

In [59]: test_outputs
Out[59]:
array([[ 0.09983768,  0.00040025],
       [ 0.09986718,  0.010469  ],
       [ 0.09985521,  0.02051429],
       [ 0.09984323,  0.03055958],
       [ 0.09983127,  0.04060487],
       [ 0.09995781,  0.05083206],
       [ 0.09995599,  0.06089856],
       [ 0.09995417,  0.07096504],
       [ 0.09995237,  0.08103154],
       [ 0.09995055,  0.09109804]], dtype=float32)
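
For completeness, here is a minimal sketch of why the zero-initialised network can only learn the target means (same setup as the question): with all weights at zero, the hidden activations are all zero, and the zero output kernel blocks any gradient from flowing back, so the only parameter that ever updates is the output layer's bias. Its MSE optimum is the per-column mean of the targets, which is exactly the [0.1, 0.045] the question observed.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import initializers

# Same dummy data as in the question
inputs = (np.arange(10, dtype=np.float32) / 10.0).reshape(10, 1)
targets = np.stack([np.full(10, 0.1, dtype=np.float32),
                    0.01 * np.arange(10, dtype=np.float32)], axis=1)

# The zero-initialised model from the question
zero = initializers.constant(0.0)
model = Sequential()
model.add(Dense(16, input_shape=(1,), kernel_initializer=zero, bias_initializer=zero))
model.add(Activation('relu'))
model.add(Dense(16, kernel_initializer=zero, bias_initializer=zero))
model.add(Activation('relu'))
model.add(Dense(2, kernel_initializer=zero, bias_initializer=zero))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(inputs, targets, epochs=2000, verbose=0)

# The output kernel never moves; only the output bias learns,
# and it settles near the per-column mean of the targets.
out_kernel, out_bias = model.layers[-1].get_weights()
print(np.allclose(out_kernel, 0.0))  # True
print(out_bias)                      # close to [0.1, 0.045]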
Everick answered 16/5, 2017 at 9:53 Comment(0)
