Object Detection for Android with Tesseract or OpenCV

Asked 21/6, 2013 at 14:23 Answered 28/6, 2013 at 16:12

Solved android opencv computer-vision tesseract text-recognition

I have successfully integrated Tesseract into my Android app, and it reads whatever image I capture, but with very low accuracy. Most of the time I do not get the correct text, because some text around the region of interest is also captured.

All I want is to read all the text inside a rectangular area, accurately, without capturing the edges of the rectangle. I have done some research and posted on Stack Overflow about this twice, but I still have not found a satisfactory result.

These are the two posts I made:

https://stackoverflow.com/questions/16663504/extract-text-from-a-captured-image?noredirect=1#comment23973954_16663504

Extracting information from captured image in android

I am not sure whether to go ahead with Tesseract or use OpenCV.

Rumpf answered 21/6, 2013 at 14:23 Comment(11)

If the answers were unsatisfactory, try putting up a bounty. If you go the openCV route, make sure you configure it for the camera you'll be using. – Yeah 21/6, 2013 at 14:28

With Tesseract, I show a rectangular area on screen, and the user places the region to be captured within that rectangle. But if you move slightly while capturing the image, the result is complete garbage. I think Tesseract is not helping me. Could you please provide some sample code? – Rumpf 21/6, 2013 at 14:33

Haven't played with OpenCV since my student days, so no, not really... but looking at your other question, lottery tickets might not be the best thing to start with. Try blank white paper with a big, bold, black typeface and work from there... Lighting, camera internals, focus - they all get in the way of OCR. – Yeah 21/6, 2013 at 14:42

Well, I tried that as well; if the text is on a white background, it reads fine. But when I apply it to a lottery ticket, it gives me garbage values most of the time. I also tried various lighting conditions; even in good lighting, Tesseract gives poor results on the lottery ticket. What should I do? – Rumpf 21/6, 2013 at 14:50

Curse the gods, how dare the lottery people try to make forging/OCRing tickets hard! So, before OCRing the lottery ticket, you need to clean it using a ... RasterizerFilter? In any case, try to filter out the holograms/funny background, increase the contrast, etc., and pass the filtered input to the OCR, rather than trying to build a read-anything OCR. – Yeah 21/6, 2013 at 14:52

Oh, this sounds like a great solution. Could you please suggest a technique for this on Android, so that I can take the captured bitmap, remove the noise from it, and then pass it to Tesseract? – Rumpf 21/6, 2013 at 14:57

Yes, programming in Java with a native lib to do the heavy-lifting. – Yeah 21/6, 2013 at 15:1

You mean Java's built-in functions can be used for this purpose, without resorting to a third-party tool? – Rumpf 21/6, 2013 at 15:3

manuscripttranscription.blogspot.com/2013/02/… Alternatively, get more training material so the underlying neural net can better distinguish junk from relevant data; or, before OCRing, pass the image through a "noise reduction" neural net that is trained to clean images and leave only the cleaned-up characters. – Yeah 21/6, 2013 at 15:5

No, you will most certainly have to use a third-party lib or roll some code of your own. Not sure if Java facilitates any of this. google.com/… – Yeah 21/6, 2013 at 15:5

let us continue this discussion in chat – Rumpf 21/6, 2013 at 15:8

Including the many links and answers from others, I think it's good to take a step back and note that there are actually two fundamental steps to optical character recognition (OCR):

Text Detection: This is the title and focus of your question, and it is concerned with localizing regions in an image that contain text.
Text Recognition: This is where the actual recognition happens, where the localized image regions from detection get segmented character-by-character and classified. This is also where tools like Tesseract come into play.
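To make the split concrete for the rectangle overlay in the question: recognition only sees the pixels it is handed, so cropping slightly inside the rectangle keeps its border out of the Tesseract input. The following is a minimal sketch in plain Java, not code from this thread. The class `RoiCrop`, the method `cropInside`, and the `inset` parameter are hypothetical names, and the image is assumed to be a row-major array of packed ARGB pixels such as the one `Bitmap.getPixels` fills on Android.

```java
// Illustrative sketch only: RoiCrop, Region and cropInside are hypothetical names.
final class RoiCrop {

    /** Cropped pixels together with their dimensions. */
    static final class Region {
        final int[] pixels;
        final int width;
        final int height;

        Region(int[] pixels, int width, int height) {
            this.pixels = pixels;
            this.width = width;
            this.height = height;
        }
    }

    /**
     * Copies the part of a width x height image that lies inside the rectangle
     * [left, right) x [top, bottom), shrunk by `inset` pixels on every side so
     * that the rectangle's own border is left out.
     */
    static Region cropInside(int[] argb, int width, int height,
                             int left, int top, int right, int bottom, int inset) {
        if (argb.length != width * height) {
            throw new IllegalArgumentException("pixel array does not match width * height");
        }
        // Shrink the rectangle and clamp it to the image bounds.
        int x0 = Math.max(0, left + inset);
        int y0 = Math.max(0, top + inset);
        int x1 = Math.min(width, right - inset);
        int y1 = Math.min(height, bottom - inset);
        if (x1 <= x0 || y1 <= y0) {
            throw new IllegalArgumentException("inset leaves no pixels to crop");
        }
        int outW = x1 - x0;
        int outH = y1 - y0;
        int[] out = new int[outW * outH];
        // Copy one row of the cropped region at a time.
        for (int y = 0; y < outH; y++) {
            System.arraycopy(argb, (y0 + y) * width + x0, out, y * outW, outW);
        }
        return new Region(out, outW, outH);
    }
}
```

On Android, `Bitmap.createBitmap(source, x, y, width, height)` performs the same crop directly on a bitmap. The inset has to be tuned to how thick the rectangle's border appears in the captured image.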

Now, there are also two general settings in which OCR is applied:

Controlled: These are images taken from a scanner or something similar in nature, where the target is a document and things like perspective, scale, font, orientation, background consistency, etc. are pretty docile.
Uncontrolled/Scene: These are the more natural and in-the-wild photos, e.g. those taken from a camera, where you are trying to recognize a street sign, shop name, etc.

Tesseract as-is is most applicable to the "controlled" setting. And in general, but for scene OCR especially, "re-training" Tesseract will not directly improve detection, but may improve recognition.

If you are looking to improve scene text detection, see this work; and if you are looking at improving scene text recognition, see this work. Since you asked about detection, the detection reference uses maximally stable extremal regions (MSER), which has a plethora of implementation resources, e.g. see here.

There's also a text detection project here specifically for Android too:
https://github.com/dreamdragon/text-detection

As many have noted, keep in mind that recognition is still an open research challenge.

Slang answered 28/6, 2013 at 16:12 Comment(1)

Thanks a lot for the time you spent answering this question. You have provided so much important information. I think I will be able to figure out a way. Thanks again! – Rumpf 29/6, 2013 at 16:9

The solution to improving the OCR output is to

either use more training data to train it better, or
filter its input using some linear filter (grayscaling, high-contrasting, blurring)
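As a rough illustration of the filtering option, here is a minimal sketch in plain Java; it is not the code discussed in chat. It converts packed ARGB pixels to grayscale, stretches the contrast, and then binarizes with Otsu's threshold. The thresholding step is an addition of this sketch and is not a linear filter, and blurring is left out. The class name `OcrPreprocess` and its method names are hypothetical, and on Android the `int[]` is assumed to come from `Bitmap.getPixels`.

```java
// Illustrative sketch only: OcrPreprocess and its method names are hypothetical.
final class OcrPreprocess {

    /** Converts packed ARGB pixels to 0..255 luminance values (Rec. 601 weights). */
    static int[] toGray(int[] argb) {
        int[] gray = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int r = (argb[i] >> 16) & 0xFF;
            int g = (argb[i] >> 8) & 0xFF;
            int b = argb[i] & 0xFF;
            gray[i] = (299 * r + 587 * g + 114 * b) / 1000;
        }
        return gray;
    }

    /** Linearly stretches values so the darkest becomes 0 and the brightest 255. */
    static int[] stretchContrast(int[] gray) {
        int min = 255;
        int max = 0;
        for (int v : gray) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if (max <= min) {
            return gray.clone(); // flat or empty image: nothing to stretch
        }
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = (gray[i] - min) * 255 / (max - min);
        }
        return out;
    }

    /** Otsu's method: the threshold that maximizes the between-class variance. */
    static int otsuThreshold(int[] gray) {
        int[] hist = new int[256];
        for (int v : gray) {
            hist[v]++;
        }
        long total = gray.length;
        long sumAll = 0;
        for (int t = 0; t < 256; t++) {
            sumAll += (long) t * hist[t];
        }
        long sumBg = 0;
        long weightBg = 0;
        double bestVar = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightBg += hist[t];
            if (weightBg == 0) {
                continue; // no pixels at or below t yet
            }
            long weightFg = total - weightBg;
            if (weightFg == 0) {
                break; // every pixel is at or below t
            }
            sumBg += (long) t * hist[t];
            double meanBg = (double) sumBg / weightBg;
            double meanFg = (double) (sumAll - sumBg) / weightFg;
            double diff = meanBg - meanFg;
            double var = (double) weightBg * weightFg * diff * diff;
            if (var > bestVar) {
                bestVar = var;
                best = t;
            }
        }
        return best;
    }

    /** Pixels brighter than the threshold become opaque white, the rest opaque black. */
    static int[] binarize(int[] gray, int threshold) {
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = gray[i] > threshold ? 0xFFFFFFFF : 0xFF000000;
        }
        return out;
    }

    /** Full pipeline: grayscale, contrast stretch, Otsu binarization. */
    static int[] prepareForOcr(int[] argb) {
        int[] gray = stretchContrast(toGray(argb));
        return binarize(gray, otsuThreshold(gray));
    }
}
```

The result of `prepareForOcr` can be written back with `Bitmap.setPixels` and handed to the OCR engine. A single global threshold is the simplest choice; patterned or unevenly lit backgrounds such as lottery tickets may need something more local.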

In the chat we posted a number of links describing filtering techniques used in OCRing, but sample code wasn't posted.

Some of the links posted were

Improving input for OCR

How to train Tesseract

Text enhancement using asymmetric filters <-- this paper is easily found on Google and should be read in full, as it clearly illustrates the steps needed before OCR-processing the image.

OCR Classification

Yeah answered 21/6, 2013 at 15:23 Comment(2)

following are links posted by Shark: manuscripttranscription.blogspot.com/2013/02/… cedricve.me/2013/04/12/how-to-train-tesseract scholr.ly/paper/1046523/… – Rumpf 21/6, 2013 at 15:32

Linear Classificator / OCR classification. That's the one I was trying to remember. – Yeah 21/6, 2013 at 17:10
