Oscillating accuracy of CNN training with TensorFlow for MNIST handwritten digits

I'm following the tutorial "Deep MNIST for Experts", https://www.tensorflow.org/versions/r0.11/tutorials/mnist/pros/index.html#deep-mnist-for-experts

Using convolutional neural networks, I get an accuracy of 93.49%, which is quite low, and I'm trying to improve it, but I have a question. According to the tutorial:

for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        # evaluate accuracy on the current batch only, with dropout disabled
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

The training accuracy is logged every 100 iterations, and it keeps oscillating, increasing and then decreasing:

step 100, training accuracy 0.1
step 200, training accuracy 0.13
step 300, training accuracy 0.12
step 400, training accuracy 0.08
step 500, training accuracy 0.12
step 600, training accuracy 0.05
step 700, training accuracy 0.09
step 800, training accuracy 0.1
step 900, training accuracy 0.12
step 1000, training accuracy 0.09
step 1100, training accuracy 0.11
step 1200, training accuracy 0.09
step 1300, training accuracy 0.11
step 1400, training accuracy 0.06
step 1500, training accuracy 0.09
step 1600, training accuracy 0.14
step 1700, training accuracy 0.07
step 1800, training accuracy 0.08
......
step 19800, training accuracy 0.14
step 19900, training accuracy 0.07

Is there any reason for that, or is it normal? If it is normal, why? Also, what kind of variables can I change to improve the final accuracy? I've already tried changing the learning rate.

Ardyce asked 20/10/2016 at 13:59

Oscillating accuracy is typically caused by a learning_rate that is too high. My first tip would indeed be to lower the learning_rate. Did you test multiple learning rates on a logarithmic scale, e.g. 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, ...?

Using drastically smaller learning rates should remove the oscillating accuracy. Also check this answer on Kaggle and the linked document to get a better understanding.
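
As a minimal sketch of what such a sweep could look like (assuming the tutorial's model-construction code is wrapped in a hypothetical build_graph(learning_rate) helper that returns the train_step, accuracy, x, y_ and keep_prob tensors; that helper is not part of the tutorial):

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

for lr in [0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001]:
    tf.reset_default_graph()
    # build_graph is a hypothetical wrapper around the tutorial's model code
    train_step, accuracy, x, y_, keep_prob = build_graph(learning_rate=lr)
    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        for i in range(2000):  # a short run is enough to compare learning rates
            batch = mnist.train.next_batch(50)
            sess.run(train_step, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
        val_acc = sess.run(accuracy, feed_dict={x: mnist.validation.images,
                                                y_: mnist.validation.labels,
                                                keep_prob: 1.0})
        print("learning rate %g -> validation accuracy %g" % (lr, val_acc))

The rate with the best validation accuracy after the short run is a good starting point for the full 20,000-step training.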

EDIT:

Based on the remark in the comment: this accuracy is measured per batch. Since you are comparing the accuracies of different batches each time (an easy batch vs. a harder one), it's normal that you don't get a monotonic increase in accuracy. You can further reduce the oscillations:

  • By increasing the batch size: the fluctuations should decrease, because the impact of the difficulty of individual examples gets averaged out.

  • You could also calculate the training accuracy over a constant set of examples:

    • Using a validation set (see the sketch after this list)

    • Averaging the batch accuracies over all batches in one epoch

    • Actually calculating the accuracy over all examples in the training set every fixed number of training steps. This of course has a big impact on the training time if you have a large training set.
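
A minimal sketch of the validation-set variant, assuming the same x, y_, keep_prob, accuracy and train_step tensors and the mnist object from the tutorial, running inside the tutorial's InteractiveSession:

for i in range(20000):
    batch = mnist.train.next_batch(100)  # larger batch than 50, so per-batch noise is smaller
    if i % 100 == 0:
        # evaluate on the fixed validation split instead of the current training batch
        val_accuracy = accuracy.eval(feed_dict={x: mnist.validation.images,
                                                y_: mnist.validation.labels,
                                                keep_prob: 1.0})
        print("step %d, validation accuracy %g" % (i, val_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

Because the validation examples never change, any remaining ups and downs reflect the optimisation itself rather than the luck of the batch draw.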

Crowe answered 20/10/2016 at 15:49. Comments (3):
Actually, I am using the AdamOptimizer with rate 1e-5; any lower or higher and the final accuracy seems to decrease. The links you suggested say it may be due to the batch size, so I'll change the batch size too. – Ardyce
Also, I would like to know whether the accuracy logged at each step is generally supposed to increase steadily, or whether such oscillations are normal? – Ardyce
Yes, it's normal to have these oscillations in batch accuracy. You are measuring the accuracy on a different set of examples each time. If you average this out you get a good idea, but as it stands it's hard to compare accuracies calculated over different batches. It would be better to also have a separate, constant validation set on which you calculate the accuracy. What you could also do is average all the batch accuracies over one epoch and only show this average once per epoch... – Crowe
