Scikit-learn SVM digit recognition

Asked 22/7, 2016 at 6:35 Answered 15/4, 2017 at 19:6

I want to make a program to recognize the digit in an image. I follow the tutorial in scikit learn .

I can train and fit the svm classifier like the following.

First, I import the libraries and dataset

from sklearn import datasets, svm, metrics

digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

Second, I create the SVM model and train it with the dataset.

classifier = svm.SVC(gamma = 0.001)
classifier.fit(data[:n_samples], digits.target[:n_samples])

And then, I try to read my own image and use the function predict() to recognize the digit.

Here is my image:

I reshape the image into (8, 8) and then convert it to a 1D array.

img = misc.imread("w1.jpg")
img = misc.imresize(img, (8, 8))
img = img[:, :, 0]

Finally, when I print out the prediction, it returns [1]

predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))
print predicted

Whatever I user others images, it still returns [1]

When I print out the "default" dataset of number "9", it looks like:

My image number "9" :

You can see the non-zero number is quite large for my image.

I dont know why. I am looking for help to solve my problem.

Hierarch answered 22/7, 2016 at 6:35 Comment(0)

My best bet would be that there is a problem with your data types and array shapes.

It looks like you are training on numpy arrays that are of the type np.float64 (or possibly np.float32 on 32 bit systems, I don't remember) and where each image has the shape (64,).

Meanwhile your input image for prediction, after the resizing operation in your code, is of type uint8 and shape (1, 64).

I would first try changing the shape of your input image since dtype conversions often just work as you would expect. So change this line:

predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))

to this:

predicted = classifier.predict(img.reshape(img.shape[0]*img.shape[1]))

If that doesn't fix it, you can always try recasting the data type as well with

img = img.astype(digits.images.dtype).

I hope that helps. Debugging by proxy is a lot harder than actually sitting in front of your computer :)

Edit: According to the SciPy documentation, the training data contains integer values from 0 to 16. The values in your input image should be scaled to fit the same interval. (http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits)

Typewriting answered 25/7, 2016 at 20:32 Comment(2)

Thank you your reply. I tried your method and it doesnt work. I found out that the array elements of my image are much greater than the training dataset. I have updated the question – Hierarch 26/7, 2016 at 2:40

Ah right, it looks like you have to scale them to the 0-16 range: scikit-learn.org/stable/modules/generated/… – Typewriting 26/7, 2016 at 4:9

1) You need to create your own training set - based on data similar to what you will be making predictions. The call to datasets.load_digits() in scikit-learn is loading a preprocessed version of the MNIST Digits dataset, which, for all we know, could have very different images to the ones that you are trying to recognise.

2) You need to set the parameters of your classifier properly. The call to svm.SVC(gamma = 0.001) is just choosing an arbitrary value of the gamma parameter in SVC, which may not be the best option. In addition, you are not configuring the C parameter - which is pretty important for SVMs. I'd bet that this is one of the reasons why your output is 'always 1'.

3) Whatever final settings you choose for your model, you'll need to use a cross-validation scheme to ensure that the algorithm is effectively learning

There's a lot of Machine Learning theory behind this, but, as a good start, I would really recommend to have a look at SVM - scikit-learn for a more in-depth description of how the SVC implementation in sickit-learn works, and GridSearchCV for a simple technique for parameter setting.

Branle answered 27/7, 2016 at 10:49 Comment(0)

It's just a guess but... The Training set from Sk-Learn are black numbers on a white background. And you are trying to predict numbers which are white on a black background...

I think you should either train on your training set, or train on the negative version of your pictures.

Gantline answered 22/7, 2016 at 7:59 Comment(1)

Thanks for your reply. I have tried black background, white number, it still returns 1. – Hierarch 25/7, 2016 at 2:3

If you look at: http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits

you can see that each point in the matrix as a value between 0-16.

You can try to transform the values of the image to between 0-16. I did it and now the prediction works well for the digit 9 but not for 8 and 6. It doesn't give 1 any more.

from sklearn import datasets, svm, metrics
import cv2
import numpy as np

# Load digit database
digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# Train SVM classifier
classifier = svm.SVC(gamma = 0.001)
classifier.fit(data[:n_samples], digits.target[:n_samples])

# Read image "9"
img = cv2.imread("w1.jpg")
img = img[:,:,0];
img = cv2.resize(img, (8, 8))

# Normalize the values in the image to 0-16
minValueInImage = np.min(img)
maxValueInImage = np.max(img)
normaliizeImg = np.floor(np.divide((img - minValueInImage).astype(np.float),(maxValueInImage-minValueInImage).astype(np.float))*16)

# Predict
predicted = classifier.predict(normaliizeImg.reshape((1,normaliizeImg.shape[0]*normaliizeImg.shape[1] )))
print predicted

Uncaused answered 27/7, 2016 at 11:36 Comment(0)

I have solved this problem using below methods:

check the number of attributes, too large or too small.
check the scale of your gray value, I change to [0,16].
check data type, I change it to uint8.
check the number of training data, too small or not.

Genny answered 15/4, 2017 at 19:6 Comment(0)

Hi in addition to @carrdelling respond, i will add that you may use the same training set, if you normalize your images to have the same range of value. For example you could binaries your data ( 1 if > 0, 0 else ) or you could divide by the maximum intensity in your image to have an arbitrary interval [0;1].

Tonitonia answered 27/7, 2016 at 12:1 Comment(0)

You probably want to extract features relevant to to your data set from the images and train your model on them. One example I copied from here.

surf = cv2.SURF(400) kp, des = surf.detectAndCompute(img,None)

But the SURF features may not be the most useful or relevant to your dataset and training task. You should try others too like HOG or others.

Remember this more high level the features you extract the more general/error-tolerant your model will be to unseen images. However, you may be sacrificing accuracy in your known samples and test cases.

Salyers answered 30/7, 2016 at 15:7 Comment(0)

Recommended topics

Hot tags