Google Vision API does not recognize single digits

Asked 20/3, 2018 at 14:12 Answered 28/5, 2019 at 17:23

google-cloud-platform ocr google-cloud-vision text-recognition

I have a project that uses the Google Vision API's DOCUMENT_TEXT_DETECTION to extract text from document images.

Often the API has trouble recognizing single digits, as you can see in this image:

I suppose the problem could be related to a noise-removal algorithm that treats isolated single digits as noise. Is there a way to improve the Vision response in these situations (for example, by adjusting a noise threshold or other parameters)?

At other times Vision confuses digits with letters:

But if I specify the parameter languageHints = 'en' or 'mt', these digits are ignored by the OCR. Is there a way to force the recognition of digits or Latin characters?
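For reference, this is roughly how the hint is passed. It is a minimal sketch of the JSON body for the REST `images:annotate` method, where `languageHints` is a list inside `imageContext`. The helper name `build_annotate_body` is hypothetical and only builds the body; it does not send anything.

```python
import base64


def build_annotate_body(image_bytes, language_hints=None,
                        feature="DOCUMENT_TEXT_DETECTION"):
    """Build the JSON body for a single images:annotate request.

    build_annotate_body is a hypothetical helper for illustration;
    the field names follow the Vision REST request format.
    """
    request = {
        # Image bytes are sent base64-encoded in the "content" field.
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [{"type": feature}],
    }
    if language_hints:
        # languageHints is a list of language codes, e.g. ["en"] or ["mt"].
        request["imageContext"] = {"languageHints": list(language_hints)}
    return {"requests": [request]}


# Example: the same request with and without a hint.
with_hint = build_annotate_body(b"raw image bytes", ["en"])
without_hint = build_annotate_body(b"raw image bytes")
```

With the hint set, the isolated digits disappear from the response; without it, they are sometimes read as letters.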

Gull answered 20/3, 2018 at 14:12 Comment(5)

I don't know the exact reasons, but there also seems to be a problem with block sizes: they are too big, so some numbers can be missed or misinterpreted. Look for an option for controlling segment sizes, if there is one. – Limulus 20/3, 2018 at 14:38

You can try TEXT_DETECTION. As explained in the documentation, DOCUMENT_TEXT_DETECTION is optimized for dense text, and the images you used don't seem to fit that case. – Brister 27/3, 2018 at 15:22

Thanks @enlelin. Unfortunately I need to extract text from written documents, which often have zones with different text density. In my case DOCUMENT_TEXT_DETECTION works significantly better, but it has trouble recognizing isolated characters. – Gull 29/3, 2018 at 16:54

Did you find a way to fix this? – Divebomb 10/7, 2018 at 10:19

I am experiencing this problem too. Has anyone fixed this already? Thanks – Hideout 18/7, 2019 at 6:16

Unfortunately I think the Vision API is optimized for the two ends of the spectrum: dense text (DOCUMENT_TEXT_DETECTION) on one end, and arbitrary bits of text (TEXT_DETECTION) on the other. As noted in the comments, the regular TEXT_DETECTION works better for these stray single digits, while DOCUMENT_TEXT_DETECTION works better overall.

As far as I've heard, there are no current plans to try to cover both of these in a single way, but it's possible that this could improve in the future.

I think there have been other requests to do more fine-tuning and hinting on what you're looking to detect (e.g., here and here), but this doesn't seem to be available yet. Perhaps in the future you'll be able to provide more hints on the format of the text that you're looking to find in images (e.g., phone numbers, single digits, etc).
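Until then, one possible workaround is to run both modes on the same image and use the TEXT_DETECTION pass only to fill in what the DOCUMENT_TEXT_DETECTION pass missed. Below is a minimal sketch, not an official recipe. It assumes each pass returns word entries in `textAnnotations` with a `description` and a `boundingPoly.vertices` list (after dropping the first entry, which holds the full text). The helper names and the simple box-overlap rule are my own assumptions. The two features go in separate requests because, as far as I know, DOCUMENT_TEXT_DETECTION takes precedence when both appear in one request.

```python
import base64


def build_two_pass_body(image_bytes):
    """Build one images:annotate batch body with two requests for one image."""
    content = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {"image": {"content": content}, "features": [{"type": feature}]}
            for feature in ("DOCUMENT_TEXT_DETECTION", "TEXT_DETECTION")
        ]
    }


def _box(annotation):
    """Axis-aligned (min_x, min_y, max_x, max_y) box of a word annotation."""
    vertices = annotation["boundingPoly"]["vertices"]
    # Zero-valued coordinates may be omitted from the JSON, so default to 0.
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def _overlaps(a, b):
    """True if two axis-aligned boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def extra_words(document_words, sparse_words):
    """Words from the TEXT_DETECTION pass that overlap no word found by
    the DOCUMENT_TEXT_DETECTION pass (hypothetical merge rule)."""
    doc_boxes = [_box(w) for w in document_words]
    return [
        w["description"]
        for w in sparse_words
        if not any(_overlaps(_box(w), d) for d in doc_boxes)
    ]
```

You would pass `responses[0]["textAnnotations"][1:]` and `responses[1]["textAnnotations"][1:]` to `extra_words` and add the returned strings, with their boxes if you need positions, to the document result. The overlap test is deliberately crude, so expect to tune it for your layouts.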

Stubbed answered 28/5, 2019 at 17:23 Comment(0)
