Can a perceptron be used to detect hand-written digits?

3

9

Let's say I have a small bitmap that contains a single handwritten digit (0..9).

Is it possible to detect the digit using a (two-layered) perceptron?

Are there other possibilities to detect single digits from bitmaps besides using neural nets?

Equally answered 16/2, 2009 at 10:55 Comment(0)
C
8

Feeding each pixel of a bitmap directly into a neural network will require a lot of training, and the network will not cope well with scaling or rotation of the image.

To help the neural network classify well, you should perform some preprocessing steps first.

  • Normalize the image:
    • Adjust the contrast and brightness so that the histogram of the image matches a reference image.
    • Blur the image, to remove noise.
    • Convert it to black & white, using some threshold.
    • Find the bounding box of the shape, scale to a predefined size.
  • Calculate various features of the image that can be used to differentiate one digit from another:
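    • The Euler number of the image, which encodes how many "holes" there are in the shape: for a single connected shape it equals one minus the number of holes (e.g. the digit 8 has two holes, so its Euler number is -1).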
    • The number of white pixels (the area of the digit)
    • The principal components of the set of coordinates of the white pixels — tells you how "elongated" the shape is.
    • ... other features that you can think of that tend to have similar values for similar digits.
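
As a rough illustration of the steps above, here is a minimal NumPy sketch that thresholds an image, crops it to the bounding box, rescales it, and computes two of the listed features. All function names, the 0.5 threshold, and the 16x16 target size are my own assumptions for the example; the contrast matching and blurring steps are left out, and the hole count uses a plain flood fill rather than any particular library routine.

```python
import numpy as np

def binarize(gray, threshold=0.5):
    """True where a pixel is brighter than the threshold (white digit on black)."""
    return np.asarray(gray, dtype=float) > threshold

def crop_to_bounding_box(mask):
    """Cut away the empty rows and columns around the shape."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("image contains no white pixels")
    return mask[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def resize_nearest(mask, size=(16, 16)):
    """Nearest-neighbour rescale to a predefined size (assumed 16x16 here)."""
    h, w = mask.shape
    r = np.arange(size[0]) * h // size[0]
    c = np.arange(size[1]) * w // size[1]
    return mask[np.ix_(r, c)]

def count_holes(mask):
    """Count background regions that are not connected to the image border."""
    # Pad with background so that the outside always forms exactly one region.
    background = np.pad(~mask, 1, constant_values=True)
    seen = np.zeros(background.shape, dtype=bool)
    regions = 0
    for start in zip(*np.nonzero(background)):
        if seen[start]:
            continue
        regions += 1
        seen[start] = True
        stack = [start]
        while stack:  # iterative 4-connected flood fill
            r, c = stack.pop()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < background.shape[0] and 0 <= nc < background.shape[1]
                if inside and background[nr, nc] and not seen[nr, nc]:
                    seen[nr, nc] = True
                    stack.append((nr, nc))
    return regions - 1  # everything except the outside region is a hole

def simple_features(mask):
    """Area and hole count; the Euler number assumes one connected shape."""
    holes = count_holes(mask)
    return {"area": int(mask.sum()), "holes": holes, "euler_number": 1 - holes}

# A crude hand-drawn "8" for demonstration purposes only.
EIGHT = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
], dtype=float)

cropped = crop_to_bounding_box(binarize(EIGHT))
features = simple_features(cropped)
scaled = resize_nearest(cropped)
```

The feature dictionary (or the flattened `scaled` array) is what would then go into the classifier.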

The principal components can also be used to normalize rotation of the shape, so that the longest axis is vertical.
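
To make the principal-component idea concrete, here is a small sketch under my own assumptions: the function names are hypothetical, image coordinates are taken as x to the right and y downward, and "normalizing rotation" is shown on the pixel coordinates only (resampling the rotated bitmap is left out).

```python
import numpy as np

def centered_coordinates(mask):
    """(x, y) coordinates of the white pixels, shifted so their mean is zero."""
    ys, xs = np.nonzero(mask)
    coords = np.stack([xs, ys], axis=1).astype(float)
    return coords - coords.mean(axis=0)

def principal_axes(mask):
    """Eigenvalues (largest first) and the unit vector of the longest axis."""
    coords = centered_coordinates(mask)
    cov = coords.T @ coords / len(coords)
    values, vectors = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    major = vectors[:, 1]
    if major[1] < 0:  # resolve the sign ambiguity: keep the axis pointing "down"
        major = -major
    return values[::-1], major

def elongation(mask):
    """Ratio of the spread along the longest axis to the spread across it."""
    values, _ = principal_axes(mask)
    return float(np.sqrt(values[0] / max(values[1], 1e-12)))

def tilt_from_vertical(mask):
    """Angle in degrees between the longest axis and the vertical axis."""
    _, (vx, vy) = principal_axes(mask)
    return float(np.degrees(np.arctan2(vx, vy)))

def upright_coordinates(mask):
    """Rotate the pixel coordinates so that the longest axis becomes vertical."""
    coords = centered_coordinates(mask)
    _, (vx, vy) = principal_axes(mask)
    across = np.array([vy, -vx])  # perpendicular chosen so this is a pure rotation
    along = np.array([vx, vy])
    return np.stack([coords @ across, coords @ along], axis=1)

block = np.ones((9, 3), dtype=bool)   # a tall, upright bar
diagonal = np.eye(9, dtype=bool)      # a stroke tilted by 45 degrees
```

For the diagonal stroke, `tilt_from_vertical` reports a 45 degree tilt, and after `upright_coordinates` all points lie on the vertical axis. The elongation value can be used directly as one of the input features.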

The features are what you feed into the neural network for classification, not the pixels.

Cenac answered 16/2, 2009 at 11:27 Comment(1)
I have actually tried to use neural networks to perform a similar task, and I found (so far) that it works better if I give the actual pixels to the network rather than computing features and giving the network those. Granted, I may not have chosen good features as input. But so far it works decently without them, assuming the image is normalized well enough. - Colet
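
For comparison with the raw-pixel approach mentioned in the comment above, here is a toy sketch of a multi-class perceptron trained directly on pixel values. Everything in it is an assumption made for illustration: the 5x3 glyphs are a made-up clean "font", not handwriting, the pixels are encoded as +1/-1, and a single layer is enough only because the ten clean templates are linearly separable. Real handwritten digits would need far more data and most likely a hidden layer.

```python
import numpy as np

# Hypothetical 5x3 templates, one per digit, written row by row.
GLYPHS = {
    0: "111 101 101 101 111",
    1: "010 110 010 010 111",
    2: "111 001 111 100 111",
    3: "111 001 111 001 111",
    4: "101 101 111 001 001",
    5: "111 100 111 001 111",
    6: "111 100 111 101 111",
    7: "111 001 001 001 001",
    8: "111 101 111 101 111",
    9: "111 101 111 001 111",
}

def to_vector(glyph):
    """Raw pixels as a +1/-1 vector, with no hand-made features."""
    bits = glyph.replace(" ", "")
    return np.array([1.0 if b == "1" else -1.0 for b in bits])

X = np.stack([to_vector(GLYPHS[d]) for d in range(10)])
y = np.arange(10)

def train_perceptron(X, y, n_classes=10, max_epochs=2000):
    """Multi-class perceptron: one weight vector per digit, updated on mistakes."""
    W = np.zeros((n_classes, X.shape[1]))
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in zip(X, y):
            guess = int(np.argmax(W @ x))
            if guess != target:
                W[target] += x  # pull the correct class towards this input
                W[guess] -= x   # push the wrongly chosen class away from it
                mistakes += 1
        if mistakes == 0:  # every training image is classified correctly
            break
    return W

def predict(W, X):
    """Pick the class whose weight vector gives the highest score."""
    return np.argmax(X @ W.T, axis=1)

W = train_perceptron(X, y)
predictions = predict(W, X)
```

Note that several templates differ by a single pixel (8 versus 0, 6 or 9), so this toy model says nothing about robustness to noise; it only shows the mechanics of feeding pixels straight into the network.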
F
8

Here is a link to a huge database of handwritten digits. The front page also lists comparative performance figures for many different methods, including two-layer neural networks. This ought to give you a good start: MNIST digits database and performance
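
If you want to experiment with that database, the files use a simple binary layout (IDX): a big-endian header followed by unsigned bytes. Below is a minimal reader sketch, assuming the usual magic numbers 2051 for image files and 2049 for label files; the function names are my own, and the file name in the usage note is the conventional one rather than something guaranteed by this answer.

```python
import struct
import numpy as np

IMAGE_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions
LABEL_MAGIC = 2049  # 0x00000801: unsigned bytes, 1 dimension

def parse_idx_images(raw):
    """Decode an uncompressed IDX image file into an array (count, rows, cols)."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic} for an image file")
    pixels = np.frombuffer(raw, dtype=np.uint8, count=count * rows * cols, offset=16)
    return pixels.reshape(count, rows, cols)

def parse_idx_labels(raw):
    """Decode an uncompressed IDX label file into a 1-D array of digits."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected magic number {magic} for a label file")
    return np.frombuffer(raw, dtype=np.uint8, count=count, offset=8)

# Tiny in-memory stand-ins for real files: two 2x3 "images" and three labels.
fake_images = struct.pack(">IIII", IMAGE_MAGIC, 2, 2, 3) + bytes(range(12))
fake_labels = struct.pack(">II", LABEL_MAGIC, 3) + bytes([5, 0, 4])

images = parse_idx_images(fake_images)
labels = parse_idx_labels(fake_labels)
```

The distributed files are typically gzip-compressed, so with a real download you would read the bytes with something like `gzip.open("train-images-idx3-ubyte.gz", "rb").read()` before passing them to the parser.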

You might also want to check out Geoff Hinton's work on Restricted Boltzmann Machines, which he says perform fairly well; there is a good explanatory lecture on his site (very watchable).

Fitzpatrick answered 16/2, 2009 at 11:2 Comment(0)
H
1

Here is a MATLAB example program that uses a trained neural network to recognize single digits (image size fixed at 28×28).

Haslam answered 16/2, 2009 at 11:3 Comment(0)
