How to increase validation accuracy with a deep neural net?

I am trying to build an 11-class image classifier with 13,000 training images and 3,000 validation images. I am using a deep neural network trained with mxnet. Training accuracy is increasing and has reached above 80%, but validation accuracy is stuck in the range of 54-57% and is not increasing. What could be the issue here? Should I increase the number of images?

Phlox answered 4/5, 2016 at 7:7 Comment(3)
Did the validation accuracy increase step by step till it got stuck at 54-57%, or was it almost the same from the very beginning?Bartlet
No, validation accuracy was increasing step by step and then it got stuck at 54-57%.Phlox
How did you compute the training accuracy? Did you compute it for each batch you trained with, or for the entire training set?Markova

The issue here is that at some point your network stops learning useful general features and starts adapting to peculiarities of your training set (overfitting it as a result). You want to 'force' your network to keep learning useful features, and you have a few options here:

  1. Use weight regularization. It tries to keep weights low, which very often leads to better generalization. Experiment with different regularization coefficients: try 0.1, 0.01, 0.001 and see what impact they have on accuracy (points 1 and 2 are sketched in the code after this list).
  2. Corrupt your input (e.g., randomly substitute some pixels with black or white). This removes information from your input and 'forces' the network to pick up on important general features. Experiment with the noising coefficient, which determines how much of your input should be corrupted; research shows that anything in the range of 15% - 45% works well.
  3. Expand your training set. Since you're dealing with images, you can expand your set by rotating / scaling etc. your existing images (as suggested). You could also experiment with pre-processing your images (e.g., mapping them to black and white, grayscale, etc.), but the effectiveness of this technique will depend on your exact images and classes.
  4. Pre-train your layers with a denoising criterion. Here you pre-train each layer of your network individually before fine-tuning the entire network. Pre-training 'forces' layers to pick up on important general features that are useful for reconstructing the input signal. Look into auto-encoders, for example (they've been applied to image classification in the past).
  5. Experiment with the network architecture. Your network might not have sufficient learning capacity. Experiment with different neuron types, numbers of layers, and numbers of hidden neurons. Make sure to try compressing architectures (fewer neurons than inputs) and sparse architectures (more neurons than inputs).
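
For points 1 and 2, a minimal sketch of what this could look like with mxnet's Module API. Everything here is a placeholder (layer sizes, batch size, synthetic arrays); the only important knobs are the 'wd' weight-decay coefficient and the corruption fraction:

    import mxnet as mx
    import numpy as np

    # Synthetic stand-in data so the sketch runs end to end;
    # in practice these would be your 13k training / 3k validation images.
    X_train = np.random.rand(1000, 3 * 32 * 32).astype(np.float32)
    y_train = np.random.randint(0, 11, size=1000)
    X_val   = np.random.rand(200, 3 * 32 * 32).astype(np.float32)
    y_val   = np.random.randint(0, 11, size=200)

    def corrupt(batch, frac=0.25):
        # Point 2: randomly replace a fraction of pixels with black (0) or white (1).
        noisy = batch.copy()
        mask = np.random.rand(*batch.shape) < frac
        noisy[mask] = np.random.choice([0.0, 1.0], size=int(mask.sum()))
        return noisy

    # Corrupted copy of the training set (done once here for brevity;
    # ideally you would re-corrupt at every epoch).
    train_iter = mx.io.NDArrayIter(corrupt(X_train), y_train, batch_size=100, shuffle=True)
    val_iter   = mx.io.NDArrayIter(X_val, y_val, batch_size=100)

    # Toy network; layer sizes are placeholders, not a recommendation.
    data = mx.sym.Variable('data')
    fc1  = mx.sym.FullyConnected(data=data, num_hidden=512)
    act1 = mx.sym.Activation(data=fc1, act_type='relu')
    fc2  = mx.sym.FullyConnected(data=act1, num_hidden=11)   # 11 classes
    net  = mx.sym.SoftmaxOutput(data=fc2, name='softmax')

    # Point 1: 'wd' is the L2 weight-decay (regularization) coefficient.
    # Re-run with wd = 0.1, 0.01, 0.001 and compare validation accuracy.
    mod = mx.mod.Module(symbol=net, context=mx.cpu())
    mod.fit(train_iter, eval_data=val_iter,
            optimizer='sgd',
            optimizer_params={'learning_rate': 0.01, 'momentum': 0.9, 'wd': 0.001},
            eval_metric='acc',
            num_epoch=30)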

Unfortunately, the process of training a network that generalizes well involves a lot of experimentation and an almost brute-force exploration of the parameter space with a bit of human supervision (you'll see many research works employing this approach). It's good to try 3-5 values for each parameter and see if it leads you somewhere.
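
As a rough illustration of that kind of brute-force exploration, a grid over learning rate and regularization strength could look like the loop below (it reuses the hypothetical net, train_iter and val_iter from the sketch above):

    import mxnet as mx

    # Crude grid over a handful of values per hyper-parameter.
    best_setting, best_acc = None, 0.0
    for lr in (0.1, 0.01, 0.001):
        for wd in (0.1, 0.01, 0.001):
            train_iter.reset()
            mod = mx.mod.Module(symbol=net, context=mx.cpu())
            mod.fit(train_iter, eval_data=val_iter,
                    optimizer='sgd',
                    optimizer_params={'learning_rate': lr, 'wd': wd},
                    eval_metric='acc', num_epoch=10)
            acc = mod.score(val_iter, 'acc')[0][1]   # [('accuracy', value)]
            if acc > best_acc:
                best_setting, best_acc = (lr, wd), acc
    print('best (lr, wd):', best_setting, 'validation accuracy:', best_acc)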

When you experiment, plot accuracy / cost / F1 as a function of the number of iterations and see how it behaves. Often you'll notice a peak in accuracy for your test set, and after that a continuous drop. So apart from good architecture, regularization, corruption, etc., you're also looking for the number of iterations that yields the best results.
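
One way to get that curve with mxnet, assuming you saved one checkpoint per epoch (e.g., with epoch_end_callback=mx.callback.do_checkpoint('mymodel')), is to score every checkpoint on the validation set and plot the result; the prefix, epoch count and val_iter below are placeholders for your own setup:

    import mxnet as mx
    import matplotlib.pyplot as plt

    num_epochs = 30
    val_acc = []
    for epoch in range(1, num_epochs + 1):
        # Reload the parameters saved at the end of this epoch.
        sym, arg_params, aux_params = mx.model.load_checkpoint('mymodel', epoch)
        mod = mx.mod.Module(symbol=sym, context=mx.cpu())
        mod.bind(data_shapes=val_iter.provide_data,
                 label_shapes=val_iter.provide_label, for_training=False)
        mod.set_params(arg_params, aux_params)
        val_acc.append(mod.score(val_iter, 'acc')[0][1])

    best_epoch = val_acc.index(max(val_acc)) + 1
    print('best epoch:', best_epoch, 'validation accuracy:', max(val_acc))

    plt.plot(range(1, num_epochs + 1), val_acc)
    plt.xlabel('epoch')
    plt.ylabel('validation accuracy')
    plt.show()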

One more hint: make sure each training epoch randomizes the order of images.
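
With mxnet's in-memory iterator this can be as simple as a flag, or a manual re-permutation if you drive the epochs yourself; the arrays below are random stand-ins:

    import mxnet as mx
    import numpy as np

    # Random stand-in arrays; in practice X and y are your images and labels.
    X = np.random.rand(1000, 3 * 32 * 32).astype(np.float32)
    y = np.random.randint(0, 11, size=1000)

    # Easiest route: let the iterator shuffle for you.
    train_iter = mx.io.NDArrayIter(X, y, batch_size=100, shuffle=True)

    # Or, if you drive the epochs yourself, re-permute the arrays every epoch:
    for epoch in range(10):
        perm = np.random.permutation(len(X))
        X, y = X[perm], y[perm]
        train_iter = mx.io.NDArrayIter(X, y, batch_size=100)
        # ... run one epoch of training on train_iter here ...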

Deceleron answered 4/5, 2016 at 9:34 Comment(4)
Thanks for the answer. I am using weight regularization with 0.0001. Now should I retrain the model with different values from the start, or resume training from a model saved at some epoch with the changed regularization value? I am going to try a few things and play with some parameter values; I am also going to increase my training images.Phlox
Try different values from the start; don't use the saved model. And also try bigger values for the regularization coefficient: 0.001, 0.01, 0.1.Deceleron
I have tried with 0.001 but now the model is not converging. For the last 10 epochs, training and validation accuracy are coming in between 9-10%. Also I am using dropout in my neural net, which is a kind of regularization.Phlox
Please answer this if you want https://mcmap.net/q/821810/-training-accuracy-on-sgdMarkova

This clearly looks like a case where the model is overfitting the training set, as the validation accuracy was improving step by step till it got stuck at a particular value. If the learning rate were a bit higher, you would have ended up seeing validation accuracy decreasing while accuracy on the training set kept increasing.

Increasing the size of the training set is the best solution to this problem. You could also try applying different transformations (flipping, cropping random portions from a slightly bigger image) to the existing image set and see if the model learns better.
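
If you feed the data through mxnet, one hypothetical way to get exactly those transformations on the fly is the augmentation flags of ImageRecordIter; the record file and shapes below are placeholders:

    import mxnet as mx

    # Each epoch sees a random 224x224 crop of the resized image, randomly
    # mirrored, so the network rarely sees exactly the same input twice.
    train_iter = mx.io.ImageRecordIter(
        path_imgrec='train.rec',      # placeholder record file
        data_shape=(3, 224, 224),     # placeholder input shape
        batch_size=128,
        resize=256,                   # resize the shorter edge to 256 first
        rand_crop=True,               # then take a random 224x224 crop
        rand_mirror=True,             # and flip horizontally with probability 0.5
        shuffle=True)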

Bartlet answered 4/5, 2016 at 8:10 Comment(2)
I think with a high learning rate the training accuracy will decrease too; I have confirmed it. But yes, it's a case of overfitting, and I am just wondering why it's happening, as I selected each image myself, and if the model can recognize a training image accurately it should also recognize a validation image with roughly the same accuracy. Training and validation images are very similar.Phlox
Assuming training and validation images to be "very similar" is a vague way of interpreting things. How about trying to keep the exact same training images for validation? Does that give the same accuracy as on training?Bartlet
