Pre-processing before digit recognition with KNN classifier

Right now I'm trying to create a digit recognition system using OpenCV. There are many articles and examples on the web (and even on StackOverflow). I decided to use a KNN classifier because it seems to be the most popular solution online. I found a database of handwritten digits with a training set of 60k examples and an error rate of less than 5%.

I used this tutorial as an example of how to work with this database using OpenCV. I'm using exactly the same technique, and on the test data (t10k-images.idx3-ubyte) I get a 4% error rate. But when I try to classify my own digits, the error is much higher. For example:

  • [image] is recognized as 7
  • [image] and [image] are recognized as 5
  • [image] and [image] are recognized as 1
  • [image] is recognized as 8

And so on (I can upload all the images if needed).

As you can see, all the digits are of good quality and are easily recognizable to a human.

So I decided to do some pre-processing before classifying. From the table on the MNIST database site I learned that people use deskewing, noise removal, blurring and pixel shift techniques. Unfortunately, almost all the links to the articles are broken. So I decided to implement the pre-processing myself, because I already know how to do that.

Right now, my algorithm is the following:

  1. Erode image (I think that my original digits are too rough).
  2. Remove small contours.
  3. Threshold and blur image.
  4. Center digit (instead of shifting).
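
A minimal sketch of the centering step (step 4), written without OpenCV so it stands alone. It is only an illustration under assumptions that are not in the post: the image is a plain vector of rows, 0 means background, and the names `Image` and `centerDigit` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Grayscale image as rows of 8-bit pixels; 0 is background (assumption).
using Image = std::vector<std::vector<std::uint8_t>>;

// Returns a copy of img in which the bounding box of all non-zero pixels
// is moved to the middle of the canvas. The canvas size does not change.
Image centerDigit(const Image& img)
{
    assert(!img.empty() && !img[0].empty());
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());

    // Find the bounding box of the foreground pixels.
    int top = rows, bottom = -1, left = cols, right = -1;
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
            if (img[y][x] != 0)
            {
                if (y < top) top = y;
                if (y > bottom) bottom = y;
                if (x < left) left = x;
                if (x > right) right = x;
            }
    if (bottom < 0) return img;  // blank image: nothing to center

    // Offset that places the box in the middle (integer division rounds down).
    const int dy = (rows - (bottom - top + 1)) / 2 - top;
    const int dx = (cols - (right - left + 1)) / 2 - left;

    Image out(rows, std::vector<std::uint8_t>(cols, 0));
    for (int y = top; y <= bottom; ++y)
        for (int x = left; x <= right; ++x)
            out[y + dy][x + dx] = img[y][x];
    return out;
}
```

Centering on the bounding box is one possible choice; centering on the center of mass is another common option and may behave differently for asymmetric digits.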

I think that deskewing is not needed in my situation because all the digits are upright. Also, I have no idea how to find the right rotation angle. After this pre-processing I got these images:
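
On the question of finding the angle: one commonly used approach estimates a horizontal shear from second-order image moments (skew = mu11 / mu02) and then undoes that shear. The sketch below shows that idea in plain C++. It is not taken from the post, it corrects shear rather than a true rotation, and the names `ShearMoments`, `computeMoments` and `deskew` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

using Image = std::vector<std::vector<std::uint8_t>>;

// Intensity-weighted centroid and the two central moments needed for the skew.
struct ShearMoments { double m00, cx, cy, mu11, mu02; };

ShearMoments computeMoments(const Image& img)
{
    ShearMoments m = {0, 0, 0, 0, 0};
    double sx = 0, sy = 0;
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
        {
            const double w = img[y][x];
            m.m00 += w; sx += w * x; sy += w * y;
        }
    if (m.m00 == 0) return m;  // blank image
    m.cx = sx / m.m00;
    m.cy = sy / m.m00;
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
        {
            const double w = img[y][x];
            const double dx = x - m.cx, dy = y - m.cy;
            m.mu11 += w * dx * dy;
            m.mu02 += w * dy * dy;
        }
    return m;
}

// Undoes the estimated horizontal shear with nearest-neighbour sampling.
Image deskew(const Image& img)
{
    assert(!img.empty() && !img[0].empty());
    const ShearMoments m = computeMoments(img);
    if (m.mu02 < 1e-2) return img;  // no vertical extent: skew is undefined
    const double skew = m.mu11 / m.mu02;

    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    Image out(rows, std::vector<std::uint8_t>(cols, 0));
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
        {
            // Inverse mapping: where in the source row does this pixel come from?
            const int srcX = static_cast<int>(std::lround(x + skew * (y - m.cy)));
            if (srcX >= 0 && srcX < cols) out[y][x] = img[y][srcX];
        }
    return out;
}
```

For printed digits that are already upright the estimated skew should be close to zero, so this step would change little, which is consistent with the guess that deskewing is not needed here.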

  • [image] is also 1
  • [image] is 3 (not 5 as it used to be)
  • [image] is 5 (not 8)
  • [image] is 7 (profit!)

So this pre-processing helped a bit, but I need better results, because in my opinion such digits should be recognized without problems.

Can anyone give me any advice on pre-processing? Thanks for any help.

P.S. I can upload my source code (C++).

Burnett answered 6/5, 2013 at 15:25 Comment(6)
Well, your training data is handwritten digits, but these are printed digits. Perhaps train with printed digits instead? - Compulsion
@DavidBrown I thought about it, but where can I find such a big (60k) database? Create it myself? - Burnett
@ArtemStorozhuk, use the fonts installed on your computer as the training sets. - Futtock
@Futtock wow, I didn't know that I could find them on my PC! I always thought about Google. Do you think I can build such a powerful db by myself? Are handwritten digits so different from normal printed digits? - Burnett
@ArtemStorozhuk, your results are strong evidence that they are quite different. - Futtock
@DavidBrown thanks for your tip, it helped me solve my problem (see my answer if you want). - Burnett

I realized my mistake: it wasn't connected with pre-processing at all (thanks to @DavidBrown and @John). I used a dataset of handwritten digits instead of printed (capitalized) ones. I couldn't find such a database on the web, so I decided to create it myself. I have uploaded my database to Google Drive.

And here's how you can use it (train and classify):

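#include <dirent.h>            // opendir, readdir, closedir (POSIX)
#include <cstdio>              // sprintf, snprintf
#include <cstring>             // strcmp
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>  // OpenCV 2.x; knn is presumably a CvKNearest member of DigitClassifier (declaration not shown)

using namespace cv;
using namespace std;

// Side length, in pixels, of the square every digit image is resized to.
const int digitSize = 16;

// Returns the names of all entries in dirPath, skipping "." and "..".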
static vector<string> getListFiles(const string& dirPath)
{
    vector<string> result;
    DIR *dir;
    struct dirent *ent;
    if ((dir = opendir(dirPath.c_str())) != NULL)
    {
        while ((ent = readdir (dir)) != NULL)
        {
            if (strcmp(ent->d_name, ".") != 0 && strcmp(ent->d_name, "..") != 0 )
            {
                result.push_back(ent->d_name);
            }
        }
        closedir(dir);
    }
    return result;
}

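void DigitClassifier::train(const string& imagesPath)
{
    const int num = 510;  // number of training images the matrices are sized for
    const int size = digitSize * digitSize;
    Mat trainData = Mat(Size(size, num), CV_32FC1);  // one row per image
    Mat responses = Mat(Size(1, num), CV_32FC1);     // one label per image

    int counter = 0;
    for (int i=1; i<=9; i++)
    {
        // "%d/" needs 3 bytes ("1/" plus the terminating NUL), not 2
        char digit[3];
        snprintf(digit, sizeof(digit), "%d/", i);
        string digitPath = imagesPath + digit;
        vector<string> images = getListFiles(digitPath);
        // stop at num images so we never write past the end of trainData
        for (size_t j=0; j<images.size() && counter<num; j++)
        {
            Mat mat = imread(digitPath+images[j], 0);  // 0 = load as grayscale
            if (mat.empty())
            {
                continue;  // skip files that could not be read as images
            }
            resize(mat, mat, Size(digitSize, digitSize));
            mat.convertTo(mat, CV_32FC1);
            mat = mat.reshape(1,1);  // flatten to a single row
            for (int k=0; k<size; k++)
            {
                trainData.at<float>(counter, k) = mat.at<float>(k);
            }
            responses.at<float>(counter) = i;
            counter++;
        }
    }
    // train only on the rows that were actually filled
    knn.train(trainData.rowRange(0, counter), responses.rowRange(0, counter));
}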

int DigitClassifier::classify(const Mat& img) const
{
    Mat tmp = img.clone();

    resize(tmp, tmp, Size(digitSize, digitSize));

    tmp.convertTo(tmp, CV_32FC1);

    return knn.find_nearest(tmp.reshape(1, 1), 5);
}
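
For readers unfamiliar with what a k-nearest-neighbours lookup such as `find_nearest(..., 5)` does conceptually, here is a small standalone sketch: compute the distance from the query to every training sample, take the k closest, and return the majority label. This is not OpenCV's implementation; `Sample` and `knnClassify` are hypothetical names, and the tie-breaking rule is an arbitrary choice made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One training example: a flattened image and its digit label.
struct Sample { std::vector<float> features; int label; };

// Returns the majority label among the k training samples closest to query
// (squared Euclidean distance). Ties go to the smallest label.
int knnClassify(const std::vector<Sample>& train,
                const std::vector<float>& query, std::size_t k)
{
    assert(!train.empty() && k > 0);
    std::vector<std::pair<float, int> > dist;  // (distance, label)
    dist.reserve(train.size());
    for (std::size_t s = 0; s < train.size(); ++s)
    {
        assert(train[s].features.size() == query.size());
        float d = 0;
        for (std::size_t i = 0; i < query.size(); ++i)
        {
            const float diff = train[s].features[i] - query[i];
            d += diff * diff;
        }
        dist.push_back(std::make_pair(d, train[s].label));
    }

    // Only the k smallest distances need to be sorted.
    k = std::min(k, dist.size());
    std::partial_sort(dist.begin(),
                      dist.begin() + static_cast<std::ptrdiff_t>(k),
                      dist.end());

    std::map<int, int> votes;  // label -> number of votes
    for (std::size_t i = 0; i < k; ++i) ++votes[dist[i].second];

    int best = dist[0].second, bestCount = 0;
    for (std::map<int, int>::const_iterator v = votes.begin(); v != votes.end(); ++v)
        if (v->second > bestCount) { best = v->first; bestCount = v->second; }
    return best;
}
```

Because the vote depends only on raw pixel distances, training on handwritten digits and querying with printed ones can easily put the wrong class closest, which matches what the question observed.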
Burnett answered 12/5, 2013 at 16:43 Comment(0)

5 & 6, 1 & 7, and 9 & 8 are recognized as the same because the central points of their classes are too similar. What about this?

  • Apply connected component labeling to the digits to get their real boundaries, and crop the images to these boundaries. That way you work on a more accurate area and the central points are normalized.
  • Then divide each digit horizontally into two parts. (For example, you will have two circles after dividing "8".)

As a result, "9" and "8" become easier to tell apart, as do "5" and "6": the upper parts are the same but the lower parts differ.
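
The two steps above could look roughly like this in plain C++. It is a sketch under assumptions of my own: 0 is background, components are 4-connected, the largest component is taken to be the digit, and `Box`, `largestComponentBox` and `splitHalves` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

using Image = std::vector<std::vector<std::uint8_t>>;

// Inclusive bounding box of a component plus its pixel count.
struct Box { int top, left, bottom, right, area; };

// Labels 4-connected components by flood fill and returns the box of the
// largest one. area == 0 means the image has no foreground.
Box largestComponentBox(const Image& img)
{
    assert(!img.empty() && !img[0].empty());
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    std::vector<std::vector<bool> > seen(rows, std::vector<bool>(cols, false));
    const int dy[4] = {-1, 1, 0, 0};
    const int dx[4] = {0, 0, -1, 1};
    Box best = {0, 0, -1, -1, 0};
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
        {
            if (img[y][x] == 0 || seen[y][x]) continue;
            Box cur = {y, x, y, x, 0};
            std::queue<std::pair<int, int> > todo;
            todo.push(std::make_pair(y, x));
            seen[y][x] = true;
            while (!todo.empty())
            {
                const std::pair<int, int> p = todo.front();
                todo.pop();
                ++cur.area;
                cur.top = std::min(cur.top, p.first);
                cur.bottom = std::max(cur.bottom, p.first);
                cur.left = std::min(cur.left, p.second);
                cur.right = std::max(cur.right, p.second);
                for (int n = 0; n < 4; ++n)
                {
                    const int ny = p.first + dy[n], nx = p.second + dx[n];
                    if (ny < 0 || ny >= rows || nx < 0 || nx >= cols) continue;
                    if (img[ny][nx] == 0 || seen[ny][nx]) continue;
                    seen[ny][nx] = true;
                    todo.push(std::make_pair(ny, nx));
                }
            }
            if (cur.area > best.area) best = cur;
        }
    return best;
}

// Crops img to box b and splits the crop into an upper and a lower half.
std::pair<Image, Image> splitHalves(const Image& img, const Box& b)
{
    assert(b.area > 0);
    const int mid = b.top + (b.bottom - b.top + 1) / 2;  // first row of lower half
    Image upper, lower;
    for (int y = b.top; y <= b.bottom; ++y)
    {
        const std::vector<std::uint8_t> row(img[y].begin() + b.left,
                                            img[y].begin() + b.right + 1);
        (y < mid ? upper : lower).push_back(row);
    }
    return std::make_pair(upper, lower);
}
```

Each half could then be resized and classified or compared separately; how to combine the two results is left open here, since the answer does not specify it.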

Ailsun answered 9/5, 2013 at 4:46 Comment(0)

I cannot give you a better answer than your own, but I would like to contribute a piece of advice. You could improve your digit recognition system in the following way:

  • Apply a skeletonization process to the black-and-white patch.

  • After that, apply a distance transform.

This way you can improve the classifier's results when digits are not exactly centered or are not exactly the same, morphologically speaking.
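
To make the second step concrete, here is a minimal sketch of a distance transform: a two-pass city-block (L1) version that gives, for every pixel, the distance to the nearest foreground pixel. It assumes the input is already a skeleton (skeletonization itself is not shown), that non-zero means foreground, and the name `distanceToForeground` is hypothetical. A library routine would normally be used instead and may follow a different convention for which pixels count as the target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using Image = std::vector<std::vector<std::uint8_t>>;

// For every pixel, the city-block distance to the nearest non-zero pixel.
// If the image has no foreground, every entry is rows + cols or larger.
std::vector<std::vector<int> > distanceToForeground(const Image& img)
{
    assert(!img.empty() && !img[0].empty());
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    const int INF = rows + cols;  // larger than any real L1 distance in the image

    std::vector<std::vector<int> > dist(rows, std::vector<int>(cols, INF));
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
            if (img[y][x] != 0) dist[y][x] = 0;

    // Forward pass: propagate distances from the top and left neighbours.
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
        {
            if (y > 0) dist[y][x] = std::min(dist[y][x], dist[y - 1][x] + 1);
            if (x > 0) dist[y][x] = std::min(dist[y][x], dist[y][x - 1] + 1);
        }

    // Backward pass: propagate distances from the bottom and right neighbours.
    for (int y = rows - 1; y >= 0; --y)
        for (int x = cols - 1; x >= 0; --x)
        {
            if (y < rows - 1) dist[y][x] = std::min(dist[y][x], dist[y + 1][x] + 1);
            if (x < cols - 1) dist[y][x] = std::min(dist[y][x], dist[y][x + 1] + 1);
        }
    return dist;
}
```

The resulting map changes smoothly around each stroke, so two digits whose strokes are a pixel or two apart produce similar feature vectors, which is the tolerance to small misalignment the answer is after.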

Blum answered 13/2, 2017 at 16:1 Comment(0)
