I'm trying to classify handwritten digits, written by myself and a few friends, using an NN and a CNN. The MNIST dataset is used to train the NN. The problem is that the NN trained on MNIST does not give satisfactory test results on my dataset. I've used several libraries in Python and MATLAB with different settings, as listed below.
In Python I've used this code with the following settings:
- 3-layer NN with # of inputs = 784, # of hidden neurons = 30, # of outputs = 10
- Cost function = cross entropy
- Number of Epochs = 30
- Batch size = 10
- Learning rate = 0.5
It is trained on the MNIST training set, and the test results are as follows:
- Test result on MNIST = 96%
- Test result on my own dataset = 80%
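For reference, the settings listed above can be sketched as a forward pass plus cost in plain Python. This is only an illustration, not the code that was actually run: the sigmoid activation, the weight initialization, and the names `SIZES`, `init_params`, `forward` and `cross_entropy` are all assumptions, and no training loop is shown.

```python
import math
import random

SIZES = [784, 30, 10]  # inputs, hidden neurons, outputs (from the list above)


def sigmoid(z):
    # Assumed activation; the settings above do not name one.
    return 1.0 / (1.0 + math.exp(-z))


def init_params(sizes, seed=0):
    # Assumed initialization: Gaussian weights scaled by 1/sqrt(fan-in).
    rng = random.Random(seed)
    weights = [
        [[rng.gauss(0, 1) / math.sqrt(n_in) for _ in range(n_in)]
         for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]
    biases = [[rng.gauss(0, 1) for _ in range(n_out)] for n_out in sizes[1:]]
    return weights, biases


def forward(x, weights, biases):
    # Propagate a flattened 28 x 28 image (784 values in [0, 1]) through
    # every layer and return the 10 output activations.
    a = x
    for W, b in zip(weights, biases):
        a = [sigmoid(sum(w * ai for w, ai in zip(row, a)) + bj)
             for row, bj in zip(W, b)]
    return a


def cross_entropy(a, y, eps=1e-12):
    # Cross-entropy cost for one sample; y is a one-hot target vector.
    return -sum(yj * math.log(aj + eps) + (1 - yj) * math.log(1 - aj + eps)
                for aj, yj in zip(a, y))
```

The predicted digit would be the index of the largest output, and the accuracy figures above are the fraction of test images for which that index matches the label.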
In MATLAB I've used the Deep Learning Toolbox with various settings similar to the above, normalization included, and the best accuracy of the NN is around 75%. Both an NN and a CNN were used in MATLAB.
I've tried to make my own dataset resemble MNIST. The results above were collected from the pre-processed dataset. Here are the pre-processing steps applied to my dataset:
- Each digit is cropped separately and resized to 28 x 28 using bicubic interpolation
- Patches are centered with the mean values in MNIST using a bounding box in MATLAB
- Background is 0 and the highest pixel value is 1, as in MNIST
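For comparison with the bounding-box centering above: the MNIST description says the digits were size-normalized to fit a 20 x 20 box and then centered in the 28 x 28 image by their center of mass. Below is a minimal, dependency-free sketch of that centering step only. The function names are hypothetical, the image is assumed to be a list of rows with values in [0, 1], and the shift is rounded to whole pixels, so it may differ from what MATLAB produces.

```python
def center_of_mass(img):
    # Intensity-weighted mean row and column of the image.
    total = sum(sum(row) for row in img)
    if total == 0:
        # Empty image: report the geometric center.
        return (len(img) - 1) / 2.0, (len(img[0]) - 1) / 2.0
    r = sum(i * v for i, row in enumerate(img) for v in row) / total
    c = sum(j * v for row in img for j, v in enumerate(row)) / total
    return r, c


def shift(img, dr, dc):
    # Translate by whole pixels; pixels pushed past the border are dropped
    # and the uncovered area is filled with 0 (background).
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            ni, nj = i + dr, j + dc
            if 0 <= ni < h and 0 <= nj < w:
                out[ni][nj] = img[i][j]
    return out


def center_by_mass(img):
    # Move the center of mass to the geometric center of the image.
    h, w = len(img), len(img[0])
    r, c = center_of_mass(img)
    dr = int(round((h - 1) / 2.0 - r))
    dc = int(round((w - 1) / 2.0 - c))
    return shift(img, dr, dc)
```

Running `center_of_mass` on samples from both datasets before and after centering is a quick way to check whether the two sets are actually aligned the same way.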
I don't know what else to do. There are still some differences, such as contrast, but my contrast enhancement attempts didn't increase the accuracy.
Here are some digits from MNIST and my own dataset to compare them visually.
As you can see, there is a clear contrast difference. I think the accuracy problem is caused by the lack of similarity between MNIST and my own dataset. How can I handle this issue?
There is a similar question here, but its dataset is a collection of printed digits, unlike mine.
Edit: I've also tested a binarized version of my own dataset on NNs trained with binarized MNIST and with default MNIST. The binarization threshold is 0.05.
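The binarization step can be sketched as follows. The function name `binarize` is hypothetical, and treating values strictly above the threshold as foreground is an assumption, since the edit only states the threshold value.

```python
def binarize(img, threshold=0.05):
    # Map every pixel to 1.0 (ink) or 0.0 (background).
    # Assumption: values strictly greater than the threshold count as ink.
    return [[1.0 if v > threshold else 0.0 for v in row] for row in img]
```

With a threshold this low, nearly every anti-aliased edge pixel becomes ink, so strokes in the binarized images come out slightly thicker than they look in grayscale.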
Here is an example image in matrix form from the MNIST dataset and from my own dataset, respectively. Both of them are a 5.
MNIST:
Columns 1 through 10
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.1176 0.1412
0 0 0 0 0 0 0 0.1922 0.9333 0.9922
0 0 0 0 0 0 0 0.0706 0.8588 0.9922
0 0 0 0 0 0 0 0 0.3137 0.6118
0 0 0 0 0 0 0 0 0 0.0549
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.0902 0.2588
0 0 0 0 0 0 0.0706 0.6706 0.8588 0.9922
0 0 0 0 0.2157 0.6745 0.8863 0.9922 0.9922 0.9922
0 0 0 0 0.5333 0.9922 0.9922 0.9922 0.8314 0.5294
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
Columns 11 through 20
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0.0118 0.0706 0.0706 0.0706 0.4941 0.5333 0.6863 0.1020
0.3686 0.6039 0.6667 0.9922 0.9922 0.9922 0.9922 0.9922 0.8824 0.6745
0.9922 0.9922 0.9922 0.9922 0.9922 0.9922 0.9922 0.9843 0.3647 0.3216
0.9922 0.9922 0.9922 0.9922 0.7765 0.7137 0.9686 0.9451 0 0
0.4196 0.9922 0.9922 0.8039 0.0431 0 0.1686 0.6039 0 0
0.0039 0.6039 0.9922 0.3529 0 0 0 0 0 0
0 0.5451 0.9922 0.7451 0.0078 0 0 0 0 0
0 0.0431 0.7451 0.9922 0.2745 0 0 0 0 0
0 0 0.1373 0.9451 0.8824 0.6275 0.4235 0.0039 0 0
0 0 0 0.3176 0.9412 0.9922 0.9922 0.4667 0.0980 0
0 0 0 0 0.1765 0.7294 0.9922 0.9922 0.5882 0.1059
0 0 0 0 0 0.0627 0.3647 0.9882 0.9922 0.7333
0 0 0 0 0 0 0 0.9765 0.9922 0.9765
0 0 0 0 0.1804 0.5098 0.7176 0.9922 0.9922 0.8118
0 0 0.1529 0.5804 0.8980 0.9922 0.9922 0.9922 0.9804 0.7137
0.0941 0.4471 0.8667 0.9922 0.9922 0.9922 0.9922 0.7882 0.3059 0
0.8353 0.9922 0.9922 0.9922 0.9922 0.7765 0.3176 0.0078 0 0
0.9922 0.9922 0.9922 0.7647 0.3137 0.0353 0 0 0 0
0.9922 0.9569 0.5216 0.0431 0 0 0 0 0 0
0.5176 0.0627 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
Columns 21 through 28
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.6510 1.0000 0.9686 0.4980 0 0 0 0
0.9922 0.9490 0.7647 0.2510 0 0 0 0
0.3216 0.2196 0.1529 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.2510 0 0 0 0 0 0 0
0.0078 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
My own dataset:
Columns 1 through 10
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.4000 0.5569
0 0 0 0 0 0 0 0 0.9961 0.9922
0 0 0 0 0 0 0 0 0.6745 0.9882
0 0 0 0 0 0 0 0 0.0824 0.8745
0 0 0 0 0 0 0 0 0 0.4784
0 0 0 0 0 0 0 0 0 0.4824
0 0 0 0 0 0 0 0 0.0824 0.8745
0 0 0 0 0 0 0 0.0824 0.8392 0.9922
0 0 0 0 0 0 0 0.2392 0.9922 0.6706
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.4431 0.3608
0 0 0 0 0 0 0 0.3216 0.9922 0.5922
0 0 0 0 0 0 0 0.3216 1.0000 0.9922
0 0 0 0 0 0 0 0 0.2784 0.5922
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
Columns 11 through 20
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0.2000 0.5176 0.8392 0.9922 0.9961 0.9922 0.7961 0.6353
0.7961 0.7961 0.9922 0.9882 0.9922 0.9882 0.5922 0.2745 0 0
0.9569 0.7961 0.5569 0.4000 0.3216 0 0 0 0 0
0.7961 0 0 0 0 0 0 0 0 0
0.9176 0.1176 0 0 0 0 0 0 0 0
0.9922 0.1961 0 0 0 0 0 0 0 0
0.9961 0.3569 0.2000 0.2000 0.2000 0.0392 0 0 0 0
0.9922 0.9882 0.9922 0.9882 0.9922 0.6745 0.3216 0 0 0
0.7961 0.6353 0.4000 0.4000 0.7961 0.8745 0.9961 0.9922 0.2000 0.0392
0 0 0 0 0 0.0784 0.4392 0.7529 0.9922 0.8314
0 0 0 0 0 0 0 0 0.4000 0.7961
0 0 0 0 0 0 0 0 0 0.0784
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0.0824 0.4000 0.4000 0.7176
0.9176 0.5961 0.6000 0.7569 0.6784 0.9922 0.9961 0.9922 0.9961 0.8353
0.5922 0.9098 0.9922 0.8314 0.7529 0.5922 0.5137 0.1961 0.1961 0.0392
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
Columns 21 through 28
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.1608 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.1608 0 0 0 0 0 0 0
0.9176 0.2000 0 0 0 0 0 0
0.8353 0.9098 0.3216 0 0 0 0 0
0.2431 0.7961 0.9176 0.4392 0 0 0 0
0 0.0784 0.8353 0.9882 0 0 0 0
0 0 0.6000 0.9922 0 0 0 0
0 0.1608 0.9137 0.8314 0 0 0 0
0.1216 0.6784 0.9569 0.1569 0 0 0 0
0.9137 0.8314 0.3176 0 0 0 0 0
0.5569 0.0784 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
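To compare two matrices like the ones above with numbers rather than by eye, a few summary statistics can be computed per image. This is a rough sketch with a hypothetical helper `ink_stats`; the choice of statistics (ink fraction, mean ink intensity, bounding box) is an assumption about what is useful here, not an established recipe.

```python
def ink_stats(img, eps=0.0):
    # img is a list of rows with values in [0, 1]; pixels above eps count as ink.
    ink = [v for row in img for v in row if v > eps]
    rows = [i for i, row in enumerate(img) if any(v > eps for v in row)]
    cols = [j for j in range(len(img[0])) if any(row[j] > eps for row in img)]
    n = len(img) * len(img[0])
    return {
        # Share of pixels that carry any ink (relates to stroke thickness).
        "ink_fraction": len(ink) / n,
        # Average intensity of the ink pixels (relates to contrast).
        "mean_ink": sum(ink) / len(ink) if ink else 0.0,
        # (first row, last row, first column, last column) containing ink.
        "bbox": (rows[0], rows[-1], cols[0], cols[-1]) if ink else None,
    }
```

Averaging these values over many samples from each dataset would show whether the two sets differ systematically in stroke thickness, contrast, or digit size.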