Connected Character segmentation in OpenCV

What is a good method to segment characters that are connected to each other, as in the following figure, given that:

  • characters have this font, but the font size varies based on the image size
  • only isolated groups of characters from the image are connected

[figure not included]

Also, how can I detect whether a given bounding box contains two or more connected letters?

I tried checking for width > height to detect connected characters, but that doesn't work for the blue groups in the image.

I also tried a segmentation method based on: Article section 3.4 for separating the characters, but got poor results.

Anhydrous answered 25/11, 2013 at 13:17 Comment(2)
Yes, it doesn't work since it splits letters too much, especially in the case of "u" and "n". - Anhydrous
Is this solved? - Battat

IDEA: If you already have a good OCR engine, try running it on each of these connected components (or contours). If the OCR can't recognize a letter, then the component is not a single letter; it contains two or more.

IDEA: Check the convexity defects of these connected components. The defect points that lie closest to each other are where the bridges are.

IDEA: Apply erosion followed by dilation (a morphological opening) with a kernel that has a small width and a large height.

IDEA: Take the y-derivative of the image. The smallest contours (or lines) left will be your bridges. Mark them and erase those pixels from the original image.

IDEA: Treat it as a search problem. Take two letters of the alphabet (in this font), join them horizontally with some tool, and use OpenCV's matchShapes method (moment matching) to check whether that shape matches your connected component. Alternatively, try implementing autocorrelation.

Good luck.

Burgin answered 25/11, 2013 at 19:6 Comment(0)
