Feeding each pixel of a bitmap directly into a neural network requires a lot of training, and it copes poorly with scaling or rotation of the image.
To help the neural network classify well, preprocess the image first:
- Normalize the image:
  - Adjust the contrast and brightness so that the histogram of the image matches a reference image.
  - Blur the image to remove noise.
  - Convert it to black and white using a threshold.
  - Find the bounding box of the shape and scale it to a predefined size.
- Calculate various features of the image that can be used to differentiate one digit from another:
  - The Euler number of the image, i.e. the number of connected objects minus the number of holes. For a single connected digit this tells you how many "holes" there are in the shape (e.g. two holes for the digit 8, giving an Euler number of -1).
  - The number of white pixels (the area of the digit).
  - The principal components of the set of coordinates of the white pixels, which tell you how "elongated" the shape is.
  - Any other features you can think of that tend to have similar values for instances of the same digit.
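As a rough illustration of the thresholding, bounding-box and scaling steps, plus the area and hole-count features, here is a minimal NumPy sketch. It assumes a grayscale array with a white digit on a black background; the function names, the threshold of 128 and the 10x6 target size are made up for the example, and histogram matching and blurring are left out.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Return a boolean mask: True where the pixel is at least `threshold`."""
    return np.asarray(gray) >= threshold

def crop_to_bounding_box(mask):
    """Crop a boolean mask to the smallest rectangle containing all True pixels."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("mask contains no foreground pixels")
    return mask[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def resize_nearest(mask, size=(16, 16)):
    """Scale a mask to `size` (height, width) with nearest-neighbour sampling."""
    h, w = mask.shape
    row_idx = np.arange(size[0]) * h // size[0]
    col_idx = np.arange(size[1]) * w // size[1]
    return mask[np.ix_(row_idx, col_idx)]

def count_holes(mask):
    """Count background regions that are fully enclosed by foreground pixels."""
    # Pad with background so that everything outside the shape is one region.
    padded = np.pad(mask, 1, constant_values=False)
    visited = padded.copy()  # foreground pixels are treated as already visited
    h, w = padded.shape
    regions = 0
    for r0 in range(h):
        for c0 in range(w):
            if visited[r0, c0]:
                continue
            # New background region: flood-fill it with 4-connectivity.
            regions += 1
            visited[r0, c0] = True
            stack = [(r0, c0)]
            while stack:
                r, c = stack.pop()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < h and 0 <= nc < w and not visited[nr, nc]:
                        visited[nr, nc] = True
                        stack.append((nr, nc))
    return regions - 1  # do not count the outer background

# A tiny hand-drawn "8" (white = 255) with an empty column on each side.
eight = np.array([
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
]) * 255

mask = binarize(eight)
cropped = crop_to_bounding_box(mask)
scaled = resize_nearest(cropped, size=(10, 6))

area = int(cropped.sum())     # number of white pixels
holes = count_holes(cropped)  # Euler number is 1 - holes for one connected shape
print(area, holes)            # prints: 13 2
```

In practice an image library would probably do the blurring, resizing and connected-component labelling for you; the point here is only to show what each step computes.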
The principal components can also be used to normalize rotation of the shape, so that the longest axis is vertical.
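The principal components can be sketched with NumPy as below. This is only an illustration under assumptions: the function names are invented, coordinates are taken as (x, y) = (column, row), and "elongation" is defined here as the square root of the ratio of the two eigenvalues, which is one of several reasonable choices.

```python
import numpy as np

def principal_axes(mask):
    """Eigenvalues and eigenvectors of the covariance of the white-pixel
    coordinates, sorted so that the axis with the largest variance comes first."""
    rows, cols = np.nonzero(mask)
    coords = np.stack([cols, rows]).astype(float)  # shape (2, N), as (x, y)
    cov = np.cov(coords)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

def elongation(mask):
    """Ratio of spread along the longest axis to spread along the shortest."""
    eigvals, _ = principal_axes(mask)
    if eigvals[1] <= 1e-12:  # all pixels lie on a single line
        return float("inf")
    return float(np.sqrt(eigvals[0] / eigvals[1]))

def angle_from_vertical(mask):
    """Angle in degrees between the longest axis and the vertical direction,
    in the range [-90, 90]."""
    _, eigvecs = principal_axes(mask)
    vx, vy = eigvecs[:, 0]
    if vy < 0:  # an eigenvector's sign is arbitrary, so pick the one with vy >= 0
        vx, vy = -vx, -vy
    return float(np.degrees(np.arctan2(vx, vy)))

bar = np.ones((7, 3), dtype=bool)  # tall, thin shape
print(elongation(bar) > 2)         # the bar is clearly elongated
print(abs(angle_from_vertical(bar)) < 1e-6)  # and already upright
```

To normalize rotation, you would rotate the image by this angle using whatever image-rotation routine you have (for example `scipy.ndimage.rotate`); whether the angle needs to be negated depends on that routine's sign convention, so check it on a test image.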
The features are what you feed into the neural network for classification, not the pixels.