I started writing neural networks with TensorFlow, and there is one problem I seem to face in each of my example projects.
My loss always starts at something like 50 or higher and either does not decrease or decreases so slowly that after all my epochs I do not even get near an acceptable loss.
Things I have already tried (none of which affected the result very much):
- checked for overfitting, but in the following example you can see that I have 15000 training and 15000 test samples and something like 900 weights
- tested different optimizers and optimizer settings
- tried increasing the training data by using the test data as training data as well
- tried increasing and decreasing the batch size
I built the network based on what I learned from https://youtu.be/vq2nnJ4g6N0
But let us have a look at one of my test projects:
I have a list of names and want to predict the gender, so my raw data looks like this:
names=["Maria","Paul","Emilia",...]
genders=["f","m","f",...]
To feed it into the network, I transform the names into arrays of character codes (zero-padded to a maximum length of 30) and the genders into one-hot vectors:
names=[[77.,97. ,114.,105.,97. ,0. ,0.,...]
[80.,97. ,117.,108.,0. ,0. ,0.,...]
[69.,109.,105.,108.,105.,97.,0.,...]]
genders=[[1.,0.]
[0.,1.]
[1.,0.]]
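A minimal sketch of how this encoding can be done in plain Python. The helper names `encode_name` and `encode_gender` are hypothetical and only for illustration; the mapping of "f" to [1, 0] and "m" to [0, 1] is taken from the example data above, and truncating names longer than 30 characters is an assumption.

```python
MAX_LEN = 30  # maximum name length, as stated above

def encode_name(name, max_len=MAX_LEN):
    """Return the character codes of `name`, zero-padded to max_len.

    Names longer than max_len are truncated (an assumption).
    """
    codes = [float(ord(ch)) for ch in name[:max_len]]
    return codes + [0.0] * (max_len - len(codes))

def encode_gender(gender):
    """One-hot encode the gender: "f" -> [1, 0], anything else -> [0, 1]."""
    return [1.0, 0.0] if gender == "f" else [0.0, 1.0]

raw_names = ["Maria", "Paul", "Emilia"]
raw_genders = ["f", "m", "f"]

# inputData / outputData are the names used in the training loop further down.
inputData = [encode_name(n) for n in raw_names]
outputData = [encode_gender(g) for g in raw_genders]

print(inputData[0][:7])  # [77.0, 97.0, 114.0, 105.0, 97.0, 0.0, 0.0]
print(outputData)        # [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
```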
I built the network with 3 hidden layers, whose weight matrices have the shapes [30,20], [20,10] and [10,10], and an output layer with a weight matrix of shape [10,2]. All hidden layers use ReLU as the activation function. The output layer uses softmax.
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

# Input Layer
x = tf.placeholder(tf.float32, shape=[None, 30])
y_ = tf.placeholder(tf.float32, shape=[None, 2])
# Hidden Layers
# H1
W1 = tf.Variable(tf.truncated_normal([30, 20], stddev=0.1))
b1 = tf.Variable(tf.zeros([20]))
y1 = tf.nn.relu(tf.matmul(x, W1) + b1)
# H2
W2 = tf.Variable(tf.truncated_normal([20, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
y2 = tf.nn.relu(tf.matmul(y1, W2) + b2)
# H3
W3 = tf.Variable(tf.truncated_normal([10, 10], stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))
y3 = tf.nn.relu(tf.matmul(y2, W3) + b3)
# Output Layer
W = tf.Variable(tf.truncated_normal([10, 2], stddev=0.1))
b = tf.Variable(tf.zeros([2]))
y = tf.nn.softmax(tf.matmul(y3, W) + b)
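As a quick sanity check on the size of this network, the number of units and trainable parameters can be derived from the weight shapes used above. This is only a sketch in plain Python; the variable names are chosen for illustration.

```python
# (inputs, outputs) of each weight matrix: W1, W2, W3 and the output layer W
layer_shapes = [(30, 20), (20, 10), (10, 10), (10, 2)]

weights = sum(n_in * n_out for n_in, n_out in layer_shapes)
biases = sum(n_out for _, n_out in layer_shapes)  # one bias per unit
units = biases                                    # hidden + output units

print("units:", units)                  # units: 42
print("weights:", weights)              # weights: 920
print("parameters:", weights + biases)  # parameters: 962
```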
Now the calculation of the loss, the accuracy and the training operation:
# Loss
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
# Accuracy
is_correct = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# Training
train_operation = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
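To make the loss formula above concrete, here is a plain-Python sketch of the same expression, `-sum(y_ * log(y))`. Note that it sums over every sample in the batch rather than averaging, so the reported value scales with the batch size. The batch below is hypothetical: 100 one-hot labels and a model that outputs 50/50 for every sample.

```python
import math

def summed_cross_entropy(y_true, y_pred):
    """Plain-Python equivalent of -reduce_sum(y_ * log(y)) over a whole batch."""
    return -sum(
        t * math.log(p)
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )

batch_size = 100
# Hypothetical data: alternating labels, uniform predictions.
labels = [[1.0, 0.0] if k % 2 == 0 else [0.0, 1.0] for k in range(batch_size)]
uniform_predictions = [[0.5, 0.5] for _ in range(batch_size)]

total = summed_cross_entropy(labels, uniform_predictions)
print("summed loss: %.2f" % total)                          # summed loss: 69.31
print("mean loss per sample: %.4f" % (total / batch_size))  # mean loss per sample: 0.6931
```

Dividing the summed value by the batch size gives a per-sample loss that is comparable across different batch sizes.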
I train the network in batches of 100:
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(150):
    bs = 100
    index = i*bs
    inputBatch = inputData[index:index+bs]
    outputBatch = outputData[index:index+bs]
    sess.run(train_operation, feed_dict={x: inputBatch, y_: outputBatch})
    accuracyTrain, lossTrain = sess.run([accuracy, cross_entropy], feed_dict={x: inputBatch, y_: outputBatch})
    if i%(bs/10) == 0:
        print("step %d loss %.2f accuracy %.2f" % (i, lossTrain, accuracyTrain))
And I get the following result:
step 0 loss 68.96 accuracy 0.55
step 10 loss 69.32 accuracy 0.50
step 20 loss 69.31 accuracy 0.50
step 30 loss 69.31 accuracy 0.50
step 40 loss 69.29 accuracy 0.51
step 50 loss 69.90 accuracy 0.53
step 60 loss 68.92 accuracy 0.55
step 70 loss 68.99 accuracy 0.55
step 80 loss 69.49 accuracy 0.49
step 90 loss 69.25 accuracy 0.52
step 100 loss 69.39 accuracy 0.49
step 110 loss 69.32 accuracy 0.47
step 120 loss 67.17 accuracy 0.61
step 130 loss 69.34 accuracy 0.50
step 140 loss 69.33 accuracy 0.47
What am I doing wrong?
Why does the loss start at ~69 in my project and not lower?
Thank you very much guys!