Text detection on Seven Segment Display via Tesseract OCR

The problem I am running into is extracting text from an image, for which I am using Tesseract v3.02. The sample images I have to extract text from are meter readings. Some have a solid sheet background and some have an LED display. I have trained a dataset for the solid sheet background, and the results are reasonably effective.

The major problem I have now is with text images on an LED/LCD background, which Tesseract does not recognize at all, so no training set can be generated for them.

Can anyone point me in the right direction on how to use Tesseract with a seven-segment display (LCD/LED background), or is there an alternative I can use instead of Tesseract?

[Images: LED background image 1, LED background image 2, Meter 1 with solid sheet background, and two further meter images, referred to as the fourth and fifth images in the comments]

Jaclynjaco answered 16/7, 2013 at 9:24 Comment(2)
"I have trained the dataset for solid sheet background " .Would you please mind telling, how you achieved this ?Leicester
@Jaclynjaco have you made any progress on this? I am running into the same problem.Stutzman

https://github.com/upupnaway/digital-display-character-rec/blob/master/digital_display_ocr.py

I did this using OpenCV and Tesseract with the "letsgodigital" trained data.

The steps include edge detection and extracting the display using the largest contour, then thresholding the image with Otsu binarization and passing it through pytesseract's image_to_string function.

Ridgeway answered 14/8, 2016 at 3:49 Comment(0)

This seems like an image preprocessing task. Tesseract would really prefer its images to all be white-on-black text in bitmap format. If you give it something that isn't that, it will do its best to convert it to that format. It is not very smart about how to do this. Using some image-manipulation tool (I happen to like ImageMagick), you need to make the images more to Tesseract's satisfaction. An easy first pass might be to do a small-radius Gaussian blur, threshold at a pretty low value (you're trying to keep only black, so 15% seems right), and then invert the image.
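That first pass can be written in a few lines of Python with Pillow instead of ImageMagick. The radius and the 15% threshold are the guesses from the text, not tuned values:

```python
from PIL import Image, ImageFilter, ImageOps

def preprocess(img):
    """Small-radius Gaussian blur, low threshold, then invert,
    yielding the white-on-black bitmap Tesseract prefers."""
    gray = img.convert("L")
    blurred = gray.filter(ImageFilter.GaussianBlur(radius=1))
    # keep only near-black pixels: 15% of 255 is roughly 38
    binary = blurred.point(lambda p: 0 if p < 38 else 255)
    return ImageOps.invert(binary)
```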

The hard part then becomes knowing which preprocessing task to do. If you have metadata telling you what sort of display you're dealing with, great. If not, I suspect you could look at image color histograms to at least figure out whether your text is white-on-black or black-on-color. If these are the only scenarios, white-on-black is always solid background, and black-on-color is always seven-segment-display, then you're done. If not, you'll have to be clever. Good luck, and please let us know what you come up with.
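One way to make that histogram check concrete is a rough polarity heuristic (my own sketch, not part of the answer): if most pixels are dark, the background is dark and the sparser text is presumably light.

```python
import numpy as np

def looks_white_on_black(gray):
    """Guess text polarity from the brightness distribution of a
    grayscale array: a mostly-dark image implies a dark background,
    so the (sparser) text is presumably light."""
    dark_fraction = np.mean(np.asarray(gray) < 128)
    return dark_fraction > 0.5
```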

Kwakiutl answered 17/7, 2013 at 16:29 Comment(13)
https://mcmap.net/q/661042/-7-segment-display-ocr?rq=1 this stackoverflow question has a link to a c script for reading seven-segment independent of OCR. Probably also worth a look.Kwakiutl
I am using GPUImageLibrary github.com/BradLarson/GPUImage. I did much the same as you: I applied a Gaussian blur, but instead of inverting I sharpened the blurred image. That worked to some extent, but it fails for the fourth image I added to the question. What sort of filters should be applied?Jaclynjaco
Is it possible to remove the LED background?Jaclynjaco
The difficult thing about the fourth image is that the background brightness decreases from left to right. I was able to solve this using local adaptive thresholding, called in imagemagick by the function -lat. The idea is to average the pixels in the surrounding area and construct a local threshold value that will separate the foreground from the background. If GPUImageLibrary doesn't have that, it shouldn't be too hard to write yourself. It has the added benefit of still working on flat-background images. On that image, a local adaptive threshold of radius 60-80 pixels worked well.Kwakiutl
Yes, you are right. I applied a Gaussian blur to the image and then an adaptive threshold, and the grain and background are removed.Jaclynjaco
But now I am facing a strange problem: wrong recognition of the characters. For instance, the fifth image I attached is recognized wrongly by Tesseract; it returns "n n g g g q", and I can't map that to "0 0 0 3 8 9" because of the repeated "g". Any idea how I can fix this?Jaclynjaco
How are you calling tesseract? From the command line, there are two things you can do. First, make sure it knows there is only one line of text by setting pagesegmode to 7. Second, tell it that every character is a digit by including the config file "digits." The command should look like this: tesseract img.png out -psm 7 digitsKwakiutl
Is this working out ok? I'm worried that Tesseract might still not recognize the split zeroes, but I have a few ideas for how to deal with that, if you need them.Kwakiutl
I am sorry I didn't reply for a while; I was facing a problem with the Leptonica installation... Unfortunately, all of the above SS images return wrong results...Jaclynjaco
By the way, tesseract-ocr-3.02.eng.tar.gz is what I am using... is there any other trained data for SS images?Jaclynjaco
There's not another tesseract install that'll work better, no. Though there's always the c script for reading SSD I mentioned earlier. What are you doing that is giving wrong results for everything?Kwakiutl
I didn't do anything special... this is my current installation tny.cz/a23ea9ff and I run the command tesseract 000389.png out -psm 7 digits; the output is 333339, which is strange, and for 004200.png the output is 55 333Jaclynjaco
chat.stackoverflow.com/rooms/34599/tesseract please join so we can chat there...Jaclynjaco

Take a look at this project:

https://github.com/arturaugusto/display_ocr

There you can download trained data for a seven-segment font and a Python script with some pre-processing capabilities.

Cling answered 21/11, 2014 at 16:31 Comment(0)
