Pre-processing before digit recognition for NN & CNN trained with MNIST dataset
I'm trying to classify handwritten digits, written by myself and a few friends, using an NN and a CNN. The networks are trained on the MNIST dataset. The problem is that a network trained on MNIST does not give satisfying test results on my own dataset. I've used some libraries in Python and MATLAB with different settings, as listed below.

In Python I've used this code with the following settings (a sketch follows the list):

  • 3-layer NN with # of inputs = 784, # of hidden neurons = 30, # of outputs = 10
  • Cost function = cross entropy
  • Number of Epochs = 30
  • Batch size = 10
  • Learning rate = 0.5
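
For reference, here is a minimal sketch of this setup in Keras. The original used a different implementation; only the hyperparameters in the list above are taken from it, while the softmax output layer and plain SGD optimizer are assumptions:

    import numpy as np
    from tensorflow import keras

    # Load MNIST and flatten each 28 x 28 image to a 784-vector in [0, 1]
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
    x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

    model = keras.Sequential([
        keras.layers.Input(shape=(784,)),
        keras.layers.Dense(30, activation="sigmoid"),  # 30 hidden neurons
        keras.layers.Dense(10, activation="softmax"),  # one output per digit
    ])
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.5),
                  loss="sparse_categorical_crossentropy",  # cross-entropy cost
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=30, batch_size=10)
    print(model.evaluate(x_test, y_test))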

It is trained on the MNIST training set, and the test results are as follows:

test result on MNIST = 96%
test result on my own dataset = 80%

In MATLAB I've used the Deep Learning Toolbox with various settings similar to the above, normalization included, and the best NN accuracy is around 75%. Both an NN and a CNN were used in MATLAB.

I've tried to make my own dataset resemble MNIST. The results above were collected from the pre-processed dataset. Here are the pre-processing steps applied to my dataset (a Python sketch follows the list):

  • Each digit is cropped separately and resized to 28 x 28 using bicubic interpolation
  • Patches are centered to match the mean values in MNIST, using a bounding box in MATLAB
  • Background is 0 and the highest pixel value is 1, as in MNIST
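
In Python, the equivalent of these steps would look roughly like this (the actual processing was done in MATLAB; centre-of-mass shifting is used here as a stand-in for the bounding-box centering):

    import numpy as np
    from PIL import Image
    from scipy import ndimage

    def preprocess(patch):
        """patch: 2-D uint8 array containing one cropped digit."""
        # resize to 28 x 28 with bicubic interpolation
        img = Image.fromarray(patch).resize((28, 28), Image.BICUBIC)
        x = np.asarray(img, dtype=np.float32)
        # rescale so the background is 0 and the peak is 1, as in MNIST
        x = (x - x.min()) / (x.max() - x.min() + 1e-8)
        # shift the digit's centre of mass to the image centre
        cy, cx = ndimage.center_of_mass(x)
        return ndimage.shift(x, (13.5 - cy, 13.5 - cx), order=1)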

I don't know what more to do. There are still some differences, like contrast, but contrast-enhancement trials couldn't increase the accuracy.

Here are some digits from MNIST and my own dataset to compare visually.

[image: MNIST digits]

[image: my own dataset]

As you can see, there is a clear contrast difference. I think the accuracy problem is caused by the lack of similarity between MNIST and my own dataset. How can I handle this issue?

There is a similar question here, but that dataset is a collection of printed digits, not handwritten like mine.

Edit: I've also tested a binarized version of my own dataset on an NN trained with binarized MNIST and with default MNIST. The binarization threshold is 0.05.
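
The binarization itself is just a hard threshold (the array name is illustrative):

    import numpy as np

    binary = (x > 0.05).astype(np.float32)  # x: 28 x 28 image in [0, 1]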

Here are example images in matrix form from the MNIST dataset and my own dataset, respectively. Both of them are 5s.

MNIST:

  Columns 1 through 10

         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0    0.1176    0.1412
         0         0         0         0         0         0         0    0.1922    0.9333    0.9922
         0         0         0         0         0         0         0    0.0706    0.8588    0.9922
         0         0         0         0         0         0         0         0    0.3137    0.6118
         0         0         0         0         0         0         0         0         0    0.0549
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0    0.0902    0.2588
         0         0         0         0         0         0    0.0706    0.6706    0.8588    0.9922
         0         0         0         0    0.2157    0.6745    0.8863    0.9922    0.9922    0.9922
         0         0         0         0    0.5333    0.9922    0.9922    0.9922    0.8314    0.5294
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0

  Columns 11 through 20

         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0    0.0118    0.0706    0.0706    0.0706    0.4941    0.5333    0.6863    0.1020
    0.3686    0.6039    0.6667    0.9922    0.9922    0.9922    0.9922    0.9922    0.8824    0.6745
    0.9922    0.9922    0.9922    0.9922    0.9922    0.9922    0.9922    0.9843    0.3647    0.3216
    0.9922    0.9922    0.9922    0.9922    0.7765    0.7137    0.9686    0.9451         0         0
    0.4196    0.9922    0.9922    0.8039    0.0431         0    0.1686    0.6039         0         0
    0.0039    0.6039    0.9922    0.3529         0         0         0         0         0         0
         0    0.5451    0.9922    0.7451    0.0078         0         0         0         0         0
         0    0.0431    0.7451    0.9922    0.2745         0         0         0         0         0
         0         0    0.1373    0.9451    0.8824    0.6275    0.4235    0.0039         0         0
         0         0         0    0.3176    0.9412    0.9922    0.9922    0.4667    0.0980         0
         0         0         0         0    0.1765    0.7294    0.9922    0.9922    0.5882    0.1059
         0         0         0         0         0    0.0627    0.3647    0.9882    0.9922    0.7333
         0         0         0         0         0         0         0    0.9765    0.9922    0.9765
         0         0         0         0    0.1804    0.5098    0.7176    0.9922    0.9922    0.8118
         0         0    0.1529    0.5804    0.8980    0.9922    0.9922    0.9922    0.9804    0.7137
    0.0941    0.4471    0.8667    0.9922    0.9922    0.9922    0.9922    0.7882    0.3059         0
    0.8353    0.9922    0.9922    0.9922    0.9922    0.7765    0.3176    0.0078         0         0
    0.9922    0.9922    0.9922    0.7647    0.3137    0.0353         0         0         0         0
    0.9922    0.9569    0.5216    0.0431         0         0         0         0         0         0
    0.5176    0.0627         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0

  Columns 21 through 28

         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
    0.6510    1.0000    0.9686    0.4980         0         0         0         0
    0.9922    0.9490    0.7647    0.2510         0         0         0         0
    0.3216    0.2196    0.1529         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
    0.2510         0         0         0         0         0         0         0
    0.0078         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0

My own dataset:

  Columns 1 through 10

         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0    0.4000    0.5569
         0         0         0         0         0         0         0         0    0.9961    0.9922
         0         0         0         0         0         0         0         0    0.6745    0.9882
         0         0         0         0         0         0         0         0    0.0824    0.8745
         0         0         0         0         0         0         0         0         0    0.4784
         0         0         0         0         0         0         0         0         0    0.4824
         0         0         0         0         0         0         0         0    0.0824    0.8745
         0         0         0         0         0         0         0    0.0824    0.8392    0.9922
         0         0         0         0         0         0         0    0.2392    0.9922    0.6706
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0    0.4431    0.3608
         0         0         0         0         0         0         0    0.3216    0.9922    0.5922
         0         0         0         0         0         0         0    0.3216    1.0000    0.9922
         0         0         0         0         0         0         0         0    0.2784    0.5922
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0

  Columns 11 through 20

         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0    0.2000    0.5176    0.8392    0.9922    0.9961    0.9922    0.7961    0.6353
    0.7961    0.7961    0.9922    0.9882    0.9922    0.9882    0.5922    0.2745         0         0
    0.9569    0.7961    0.5569    0.4000    0.3216         0         0         0         0         0
    0.7961         0         0         0         0         0         0         0         0         0
    0.9176    0.1176         0         0         0         0         0         0         0         0
    0.9922    0.1961         0         0         0         0         0         0         0         0
    0.9961    0.3569    0.2000    0.2000    0.2000    0.0392         0         0         0         0
    0.9922    0.9882    0.9922    0.9882    0.9922    0.6745    0.3216         0         0         0
    0.7961    0.6353    0.4000    0.4000    0.7961    0.8745    0.9961    0.9922    0.2000    0.0392
         0         0         0         0         0    0.0784    0.4392    0.7529    0.9922    0.8314
         0         0         0         0         0         0         0         0    0.4000    0.7961
         0         0         0         0         0         0         0         0         0    0.0784
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0    0.0824    0.4000    0.4000    0.7176
    0.9176    0.5961    0.6000    0.7569    0.6784    0.9922    0.9961    0.9922    0.9961    0.8353
    0.5922    0.9098    0.9922    0.8314    0.7529    0.5922    0.5137    0.1961    0.1961    0.0392
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0         0         0

  Columns 21 through 28

         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
    0.1608         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
    0.1608         0         0         0         0         0         0         0
    0.9176    0.2000         0         0         0         0         0         0
    0.8353    0.9098    0.3216         0         0         0         0         0
    0.2431    0.7961    0.9176    0.4392         0         0         0         0
         0    0.0784    0.8353    0.9882         0         0         0         0
         0         0    0.6000    0.9922         0         0         0         0
         0    0.1608    0.9137    0.8314         0         0         0         0
    0.1216    0.6784    0.9569    0.1569         0         0         0         0
    0.9137    0.8314    0.3176         0         0         0         0         0
    0.5569    0.0784         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
         0         0         0         0         0         0         0         0
Houstonhoustonia answered 13/1, 2015 at 15:14 Comment(20)
This topic is too specific for SO; maybe someone can answer, and I wish you luck, but I would say you have better chances posting on reddit.com/r/MachineLearning or stats.stackexchange.com/questions/tagged/machine-learning – Vogue
It looks like your digits have a lot more variation in brightness. Have you tried quantising the image, i.e. x(x>0)=255? You could also try a median filter to get rid of any salt & pepper noise introduced by compression. – Drusi
@Drusi I've applied x(x<0.05) = 0, because bicubic interpolation introduced some offset in the background which does not exist in MNIST. I've also tried a binarized version of my dataset, tested on NNs trained with binarized MNIST and with normal MNIST. The accuracy dropped in both cases. Besides, contrast enhancement didn't increase accuracy, it even decreased it! I've checked the processed patches, and there was no salt & pepper noise. – Houstonhoustonia
So the data in the MNIST set are normalised to [0, 1]... have you tried the same thing with your data? Probably won't make a difference, but worth a shot. – Drusi
@Drusi The data in MNIST are doubles varying in [0, 1], which is the same as my own dataset. – Houstonhoustonia
Can you specify the settings for your supervised DNN? How many neurons per layer? I suggest an input layer the same size as the number of pixels, an output layer of 10 (as there are 10 digits), and a total of 3 layers. If you can only achieve 75% accuracy, it is probably your DNN. – Many
@qmeeeeeee I've already shared the Python configuration, and the MATLAB one is similar. I've also tried different settings like the number of hidden neurons, cost function, batch size, etc. Every time the MNIST test set gives results above 95%, because the test set is similar to the MNIST training set even though their subjects are different. That shows the accuracy problem is not caused by the architecture of the NN but by the peculiarity of my own dataset with respect to MNIST. Visually, my dataset is already different from MNIST. I've applied some transformations to make them similar; it increased accuracy, but not sufficiently. – Houstonhoustonia
Can you give matrices (perhaps rounded to 2 or 3 dp to save space, or perhaps a subset including some small interesting area) for two digits of the same class from each dataset? The images don't look like they are normalised in the same way, and the numbers might help show how. – Pileous
You are making sure with the MNIST set that you haven't contaminated your test data, right? – Drusi
@NeilSlater I've added images from each dataset in matrix format. – Houstonhoustonia
@Drusi I didn't make any augmentation to the MNIST set, and training is done only with the MNIST training set. The MNIST test set and my own dataset are tested separately. – Houstonhoustonia
I cannot see anything obviously different in the matrix data, except perhaps line thickness. The values display differently in your renderings (different grey and white values). Is that an artefact of how you generated the graphics? – Pileous
@NeilSlater Yes, it displays digits in a washed-out way; the black background turned into gray. I've used 'display_network' to display many digits together. It is part of the Stanford UFLDL course here: ufldl.stanford.edu/wiki/index.php/Exercise:Sparse_Autoencoder. I'm aware of the line thickness. Should I apply a transform for it? I didn't check the general thickness of MNIST, but my digits are thin in general. – Houstonhoustonia
If I read your comment correctly, you are training only on the MNIST set... and then testing both your test data and the MNIST test data against the MNIST training data? If that's the case, then it's not surprising you get better accuracy testing the MNIST data. – Drusi
@Drusi Yes, it is not surprising. My question is how I can make any kind of writing resemble MNIST so that it can be classified with a classifier trained on MNIST. Otherwise, I need to include the other types of writing in the training set. As you can see from the images above, my dataset has thinner lines because they were written with a smartphone pen. That is not common in MNIST as far as I can see. – Houstonhoustonia
Must your training data only be from MNIST? It seems like a tall order for the NN to recognise text when it hasn't seen any examples. As an analogy, speech recognisers are typically trained on thousands of hours from as many different speakers as possible... they simply wouldn't work otherwise. – Drusi
@Drusi Yes, the classifier is adapted to the samples in MNIST. That is why it is not good on my dataset, but I think there is a way to make test data resemble training data, like normalizing the test data with the mean and std of the training data. I will also try to train the NN with only thin digits from MNIST. That can show me whether the drop in accuracy is related to the thickness of the digits. – Houstonhoustonia
@Houstonhoustonia This is an old one, but I'm facing the exact same scenario (digits captured with a pen, differences in thickness, differences in anti-aliasing, etc.). Did you manage to come up with a general way of making your captured data work? I published this a couple of days ago, and it doesn't seem to be getting much attention: stats.stackexchange.com/questions/293989/… – Springwood
@Springwood I didn't work on it much, but the exact problem is the distribution discrepancy between the two datasets. You can check 'domain adaptation' to learn more about it. As a simple solution, if you have enough data you could include some of it in the MNIST training set so that the network gets used to the different domain. – Houstonhoustonia
@Houstonhoustonia I don't know if this is still of interest to you, but I've found this article about MNIST pre-processing, and the author claims to have achieved good results with it. It is not so different from what you already tried, but anyway, here it goes: medium.com/@o.kroeger/… – Springwood

So what you are looking for is a generalised way of normalising your test data so that it can be compared against the MNIST training data. Perhaps you could first use a technique to normalise the MNIST training data into a standard format, then train your CNN, then normalise your test data using the same process, and then apply the CNN for recognition.

Have you seen this paper? It uses moment-based image normalisation. It works at the word level, so it's not quite what you are doing, but it should be easy enough to implement.

Moment-based Image Normalization for Handwritten Text Recognition (Kozielski et al.)
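
A rough sketch of the idea in Python, assuming a scipy environment (the paper's exact formulation differs; this simply recenters by the first moments and rescales by the second moments):

    import numpy as np
    from scipy import ndimage

    def moment_normalize(x, out=28, spread=2.5):
        """Map the region within `spread` std-devs of the centre of mass
        onto an out x out grid."""
        cy, cx = ndimage.center_of_mass(x)
        ys, xs = np.indices(x.shape)
        m = x.sum()
        sy = np.sqrt(((ys - cy) ** 2 * x).sum() / m)  # second moments
        sx = np.sqrt(((xs - cx) ** 2 * x).sum() / m)
        sc_y = 2 * spread * sy / out  # input pixels per output pixel
        sc_x = 2 * spread * sx / out
        return ndimage.affine_transform(
            x, np.array([sc_y, sc_x]),
            offset=(cy - sc_y * out / 2, cx - sc_x * out / 2),
            output_shape=(out, out))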

Drusi answered 14/1, 2015 at 12:16 Comment(1)
That is the closest answer to my question. I will check that out. – Houstonhoustonia

You could take the MNIST-trained CNN and try retraining it on a subset of your samples. Apply blurs and small roto-translations to increase the dataset size.
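
For instance, an illustrative augmentation pass along those lines (the parameter ranges are guesses):

    import numpy as np
    from scipy import ndimage

    def augment(x, rng=None):
        """Apply a small random rotation, translation and blur to one digit."""
        if rng is None:
            rng = np.random.default_rng()
        y = ndimage.rotate(x, rng.uniform(-10, 10), reshape=False, order=1)
        y = ndimage.shift(y, rng.uniform(-2, 2, size=2), order=1)
        return ndimage.gaussian_filter(y, sigma=rng.uniform(0.0, 0.8))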

Angelicaangelico answered 13/1, 2015 at 22:36 Comment(1)
I've collected 120 digits from 4 subjects using a Samsung Note 3 with its pen, so pen thickness didn't differ in general. Your suggestion will probably increase accuracy, because MNIST data has various types of digits and is not similar to my own dataset in general. However, my ultimate aim is to detect handwritten letters and digits accurately in any image. It could be an image of a blackboard in a classroom. In that case, a CNN or NN trained with MNIST or with a dataset collected via a smartphone pen will yield lower accuracy again. I need a general pre-process that can increase resemblance to the training set. – Houstonhoustonia

I wonder if you have only used a train/test split or partitioned your data into train/dev/test sets. In the second case, make sure the dev and test sets come from the same distribution. In either case, the model trains on the training set and tries to generalize the results to the test set.

It seems to be a high-variance problem. However, since the dataset you created is from a different distribution, I believe you have a case of data mismatch. The dataset you prepared may be somewhat more difficult (being from a different distribution) than the training set you obtained from the MNIST database, and the model has never seen a dataset of that difficulty, so it is not able to generalize well. This problem is well addressed in Ng's lecture on model optimization (mismatched training and dev/test sets).

A simple solution could be to mix a portion of your dataset (about 50% or more) into the MNIST training set, put the remaining portion into the dev/test sets, and retrain the model. This lets your model generalize well to the difficult dataset. Besides, using elastic distortion or other augmentation techniques might help, as it brings variation to the dataset and increases your data volume.
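
A sketch of the mixing step (the array names `my_x`, `my_y`, `mnist_x_train`, `mnist_y_train` are placeholders for your own digits and the MNIST training set):

    import numpy as np

    # fold half of your own digits into the MNIST training set,
    # keep the other half for the dev/test sets
    idx = np.random.permutation(len(my_x))
    n = len(my_x) // 2
    x_train = np.concatenate([mnist_x_train, my_x[idx[:n]]])
    y_train = np.concatenate([mnist_y_train, my_y[idx[:n]]])
    dev_x, dev_y = my_x[idx[n:]], my_y[idx[n:]]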

Other methods to better optimize your model could be regularization techniques like dropout.

Sixpack answered 24/3, 2020 at 4:58 Comment(0)
