Object Detection for Android with Tesseract or OpenCV
I have successfully integrated Tesseract into my Android app, and it reads whatever image I capture, but with very low accuracy. Most of the time I do not get the correct text, because some text around the region of interest also gets captured.

All I want is to read all the text from a rectangular area, accurately, without capturing the edges of the rectangle. I have done some research and posted about this on Stack Overflow twice, but still have not gotten a satisfactory result.

Here are the two posts that I made:

https://stackoverflow.com/questions/16663504/extract-text-from-a-captured-image?noredirect=1#comment23973954_16663504

Extracting information from captured image in android

I am not sure whether to go ahead with Tesseract or use OpenCV.

Rumpf answered 21/6, 2013 at 14:23 Comment(11)
If the answers were unsatisfactory, try putting up a bounty. If you go the OpenCV route, make sure you configure it for the camera you'll be using. – Yeah
With Tesseract, I have a rectangular area, and the user places the text to be captured within that rectangle. But when capturing the image, if you move slightly, the result you get is complete garbage. I don't think Tesseract is helping me. Could you please provide me with some sample code? – Rumpf
Haven't played with OpenCV since my student days, so no, not really... but looking at your other question, lottery tickets might not be the best thing to try out with. Try blank white paper with a big, black, bold typeface and work from there... Lighting, camera internals, focus - they all get in the way of OCR. – Yeah
Well, I tried that as well; if the text is on a white background, it reads fine. But when I applied it to a lottery ticket, it gave me garbage values most of the time. I also tried various lighting conditions; even with good lighting, Tesseract gives me poor results when the lottery ticket is processed. What should I do? – Rumpf
Curse the gods, how dare the lottery people try to make forging/OCRing tickets hard! So, before OCRing the lottery ticket, you need to clean it using a ... RasterizerFilter? In any case, try to filter out the holograms/funny background, use high contrast, etc., and pass filtered input to the OCR, rather than trying to make a read-anything OCR. – Yeah
Oh, this sounds like a great solution. Could you please show me a technique for this on Android, so that I can take the captured bitmap, remove the noise from it, and then pass it to Tesseract? – Rumpf
Yes, programming in Java with a native lib to do the heavy lifting. – Yeah
You mean Java's built-in functions can be used for this purpose, without going for a third-party tool? – Rumpf
manuscripttranscription.blogspot.com/2013/02/… Alternatively, get more training material so the underlying neural net can better distinguish junk from relevant data; or, before OCRing, pass the image through a "noise reductor" neural net that is trained to clean images and leave only clear characters. – Yeah
No, you will almost certainly have to use a third-party lib or roll some code of your own. Not sure if Java facilitates any of this. google.com/… – Yeah
Let us continue this discussion in chat. – Rumpf
Including the many links and answers from others, I think it's good to take a step back and note that there are actually two fundamental steps to optical character recognition (OCR):

  • Text Detection: This is the title and focus of your question, and it is concerned with localizing regions in an image that contain text.
  • Text Recognition: This is where the actual recognition happens, where the localized image regions from detection get segmented character-by-character and classified. This is also where tools like Tesseract come into play.

Now, there are also two general settings in which OCR is applied:

  • Controlled: These are images taken from a scanner or a similar device, where the target is a document and things like perspective, scale, font, orientation, and background consistency are fairly docile.
  • Uncontrolled/Scene: These are the more natural and in-the-wild photos, e.g. those taken from a camera, where you are trying to recognize a street sign, shop name, etc.

Tesseract as-is is most applicable to the "controlled" setting. And in general, but for scene OCR especially, "re-training" Tesseract will not directly improve detection, but may improve recognition.

If you are looking to improve scene text detection, see this work; and if you are looking at improving scene text recognition, see this work. Since you asked about detection, the detection reference uses maximally stable extremal regions (MSER), which has a plethora of implementation resources, e.g. see here.

There's also a text detection project here specifically for Android too:
https://github.com/dreamdragon/text-detection
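To make the detection step concrete, here is a toy sketch of the underlying idea in Python (not MSER itself, just a simplified connected-component pass; on Android you would use the OpenCV SDK instead, and the function name and 0/1 pixel representation here are illustrative assumptions): group neighboring "ink" pixels into blobs and return a bounding box per blob as a candidate text region.

```python
from collections import deque

def detect_regions(image):
    """Toy text-region detector: flood-fill 4-connected dark pixels
    (value 1) into components and return one bounding box per
    component as (top, left, bottom, right).
    `image` is a list of rows of 0/1 ints."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1 and not seen[r][c]:
                # Breadth-first flood fill collecting one component
                queue = deque([(r, c)])
                seen[r][c] = True
                top, left, bottom, right = r, c, r, c
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes
```

Each returned box would then be cropped out and handed to the recognition step (e.g. Tesseract) on its own, which is exactly the separation of detection from recognition described above. Real detectors like MSER differ in how they score and filter candidate regions, but the output shape is the same: a set of boxes likely to contain text.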

As many have noted, keep in mind that recognition is still an open research challenge.

Slang answered 28/6, 2013 at 16:12 Comment(1)
Thanks a lot for the time you spent answering this question. You have provided so much important information. I think I will be able to figure out a way forward. Thanks again! – Rumpf

There are two ways to improve the OCR output:

  • either use more training data to train it better,

  • or filter its input using linear filters (grayscaling, contrast enhancement, blurring)
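A minimal sketch of the filtering idea, in plain Python on nested pixel lists just to show the steps (on Android you would do this with the Bitmap API or OpenCV, and the function names and 0-255 pixel representation here are illustrative assumptions): convert to grayscale, then binarize with a hard threshold so Tesseract sees clean black-on-white glyphs.

```python
def to_grayscale(rgb_image):
    """Luminosity grayscale: 0.299*R + 0.587*G + 0.114*B per pixel.
    `rgb_image` is rows of (r, g, b) tuples; returns rows of 0-255 ints."""
    return [[int(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]

def binarize(gray_image, threshold=128):
    """High-contrast step: pixels darker than the threshold become
    0 (ink), everything else becomes 255 (background)."""
    return [[0 if px < threshold else 255 for px in row]
            for row in gray_image]
```

The binarization step is what crushes mid-tone background patterns (holograms, security print on a lottery ticket) toward white, which is why filtered input tends to OCR far better than the raw camera bitmap.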

In the chat we posted a number of links describing filtering techniques used in OCRing, but sample code wasn't posted.

Some of the links posted were

Improving input for OCR

How to train Tesseract

Text enhancement using asymmetric filters <-- this paper is easily found on Google, and should be read in full, as it quite clearly illustrates and demonstrates the necessary steps before OCR-processing the image.

OCR Classification

Yeah answered 21/6, 2013 at 15:23 Comment(2)
Following are the links posted by Shark: manuscripttranscription.blogspot.com/2013/02/… cedricve.me/2013/04/12/how-to-train-tesseract scholr.ly/paper/1046523/… – Rumpf
Linear classifier / OCR classification. That's the one I was trying to remember. – Yeah

© 2022 - 2024 — McMap. All rights reserved.