OCR: segmentation of small text

The problem

I've been building a (very) simple OCR engine. Since I'm trying to classify very small (pixel-sized) characters, I'm having some difficulty with segmentation. Here's an example, after best-effort image-wide thresholding:

[Image: problematic segmentation of "63"]

What I've tried

Error detection:

  • Flag segments with an unusually large horizontal size. This mostly works, but it gives false positives for a few larger characters.
  • Classify, and reject on a low score. This seems a bit wasteful.

Error correction:

  • Sum the pixels in each column (vertical histogram) and cut at the minimum. In many of the samples, this cuts segments in the wrong place.
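
For reference, the column-sum cut described above might look roughly like the sketch below. It assumes a binary segment stored as a list of rows with 1 for ink; the function names, the `margin` parameter, and the sample blob are all made up for illustration.

```python
def column_profile(segment):
    """Count foreground pixels (value 1) in each column of a binary segment."""
    return [sum(col) for col in zip(*segment)]


def split_at_minimum(segment, margin=1):
    """Return the column with the fewest ink pixels, skipping `margin` columns at each edge."""
    profile = column_profile(segment)
    candidates = range(margin, len(profile) - margin)
    return min(candidates, key=lambda x: profile[x])


# Hypothetical 5x7 blob: two glyphs joined by a one-pixel bridge in column 3.
segment = [
    [1, 1, 1, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 1, 1, 1],
]

profile = column_profile(segment)
cut = split_at_minimum(segment)
print(profile, cut)
```

The weakness is visible even here: the cut depends only on ink counts per column, so a glyph with a thin interior (or a thick join between glyphs) can pull the minimum to the wrong column.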

What I haven't tried yet

  • Trying to classify at every possible segmentation point (pixel column). This would be very wasteful, and it would be difficult to extend to a segment of three merged characters.
  • I've been reading up on morphology approaches that turn the characters into mathematical curves, but I don't really know where to start, or whether it's worth the effort.

Where to go from here?

I have no idea. Hence this question :)

Rattish answered 22/12, 2012 at 4:36

Lean back and half close your eyes.

63 :-)

Now, if only it was so easy for a computer!

It's tantalisingly close to what double-patterning does (or un-does?) in silicon masks.

I would suggest oversampling (doubling or quadrupling the pixel count along each axis), filtering (probably low-pass, or possibly band-pass with the passband at the spatial frequency of a line), and then re-thresholding until the characters separate. This is expensive, so apply it only in problem areas.
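
A rough sketch of that loop, under several assumptions of mine: the input is a grayscale crop of one problem segment with ink intensity in 0..1 (1 = ink), upsampling is plain nearest-neighbour, a 3x3 box blur stands in for the low-pass filter, and "separated" means the binarised crop has at least `wanted` runs of non-empty columns. The function names and the threshold ladder are illustrative only.

```python
def upsample(img, factor=2):
    """Nearest-neighbour upsampling: repeat each pixel `factor` times in both axes."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def box_blur(img):
    """3x3 mean filter (a crude low-pass); border pixels average the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out


def count_column_runs(binary):
    """Count runs of consecutive non-empty columns (a rough count of separated glyphs)."""
    runs, prev = 0, False
    for col in zip(*binary):
        occupied = any(col)
        if occupied and not prev:
            runs += 1
        prev = occupied
    return runs


def separate(gray, wanted=2, factor=2, thresholds=(0.3, 0.4, 0.5, 0.6, 0.7, 0.8)):
    """Oversample, smooth, then raise the threshold until `wanted` glyphs appear."""
    smooth = box_blur(upsample(gray, factor))
    for t in thresholds:
        binary = [[1 if v >= t else 0 for v in row] for row in smooth]
        if count_column_runs(binary) >= wanted:
            return t, binary
    return None, None


# Hypothetical crop: two strokes joined by a fainter bridge column.
gray = [
    [0.9, 0.9, 0.4, 0.9, 0.9],
    [0.9, 0.9, 0.4, 0.9, 0.9],
    [0.9, 0.9, 0.4, 0.9, 0.9],
]

t, binary = separate(gray)
print(t, count_column_runs(binary))
```

Because the threshold search runs per crop, the cost stays local: only segments already flagged as suspicious need to go through it.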

Money answered 22/12, 2012 at 11:29

Reinvent your problem so you do not need segmentation.

Really, at this scale I think you would do better to invest in other approaches. For example, if you are running OCR on text (are you?), you can use line information such as character height. Not many fonts are usable for small (yet readable) characters. My approach would be an algorithm that scans each text line in vertical scanlines (moving left to right, taking the pixels of each scanline from top to bottom) and tries to find correlations between trained text and the most recent scanlines (n, n-1, ... n-x).

You will probably need the information in the grayscale levels as well, so it is better not to threshold the images.
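
One possible reading of this idea, as a minimal sketch: slide a window of consecutive vertical scanlines across the grayscale text line and score it against trained glyph templates with normalised cross-correlation, so no explicit segmentation step is needed. The template format, the names `ncc`, `window`, and `best_matches`, and the toy glyphs are my assumptions, not something specified above.

```python
import math


def ncc(a, b):
    """Normalised cross-correlation of two equal-length sequences, in [-1, 1]."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def window(line, x, width):
    """Flatten columns x .. x+width-1, scanline by scanline, each read top to bottom."""
    return [row[i] for i in range(x, x + width) for row in line]


def best_matches(line, templates):
    """For each x offset, return (x, label, score) of the best-correlating template."""
    w = len(line[0])
    results = []
    for x in range(w):
        best = None
        for label, tmpl in templates.items():
            tw = len(tmpl[0])
            if x + tw > w:
                continue  # template would run past the end of the line
            score = ncc(window(line, x, tw), window(tmpl, 0, tw))
            if best is None or score > best[0]:
                best = (score, label)
        if best is not None:
            results.append((x, best[1], best[0]))
    return results


# Hypothetical 3x2 grayscale templates (1.0 = full ink) and a line where they touch.
templates = {
    "L": [[0.9, 0.1], [0.9, 0.1], [0.9, 0.9]],
    "T": [[0.9, 0.9], [0.1, 0.9], [0.1, 0.9]],
}
line = [
    [0.9, 0.1, 0.9, 0.9],
    [0.9, 0.1, 0.1, 0.9],
    [0.9, 0.9, 0.1, 0.9],
]

results = best_matches(line, templates)
for x, label, score in results:
    print(x, label, round(score, 3))
```

Peaks in the score along x mark likely character positions; a real system would still need a rule for choosing between overlapping peaks, which this sketch leaves out.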

Repel answered 22/12, 2012 at 16:25