OCR engines designed for screen-reading
B

4

10

Are there any OCR engines designed for identifying text in screen-captured images rather than scanned text? I have a project where I need to retrieve and identify text in an application, and none of the OCR engines I've tried so far have fared well with screenshots.

Ideally the engine should work well with color and with background noise, although I can make some allowances if nothing like that is available.

It will need to be .NET compatible; either written in .NET or having a .NET-callable API.

Bolyard answered 27/7, 2010 at 15:8 Comment(2)
What's the difference between scanned text and a screenshot? - Sena
The text of the screenshot is guaranteed to be on straight lines, but also in color, with colored background noise. I'm looking to see if there's an OCR engine specifically designed to read from screenshots. - Bolyard
C
5

I've found Tesseract OCR to be pretty solid for an open source project. I've found that it can even read and decode simple captchas, like Megaupload's. I'd think with a little tweaking this could work pretty well.

The only pain is that it only accepts uncompressed TIFF images, which can be annoying.

EDIT: Philip Daubmeier already found a .NET integration, but below is code to convert a Bitmap to uncompressed TIFF.

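// Requires: using System; using System.Drawing; using System.Drawing.Imaging;
// Saves the bitmap to output.tif as an uncompressed, 8-bit TIFF.
private void ConvertBitmapToTIF(Bitmap convert)
{
    ImageCodecInfo codecInfo = GetEncoderInfo("image/tiff");
    System.Drawing.Imaging.Encoder encodeCom = System.Drawing.Imaging.Encoder.Compression;
    System.Drawing.Imaging.Encoder encodeBPP = System.Drawing.Imaging.Encoder.ColorDepth;

    // EncoderParameters is disposable, so release it once the file is written.
    using (EncoderParameters parms = new EncoderParameters(2))
    {
        parms.Param[0] = new EncoderParameter(encodeCom, (long)EncoderValue.CompressionNone);
        parms.Param[1] = new EncoderParameter(encodeBPP, 8L);

        convert.Save("output.tif", codecInfo, parms);
    }
}

// Returns the installed encoder for the given MIME type, or null if none matches.
private static ImageCodecInfo GetEncoderInfo(string mimeType)
{
    return Array.Find(ImageCodecInfo.GetImageEncoders(), codec => codec.MimeType == mimeType);
}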

This saves to a file, but the Bitmap.Save method can write to a stream also.
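As a rough sketch of the stream variant, assuming the same System.Drawing encoder setup, the bitmap can be encoded into a MemoryStream and handed on as a byte array. The class and method names here (TiffStreamSketch, ToUncompressedTiff) are made up for illustration, and the color-depth parameter is left out to keep it short.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

static class TiffStreamSketch
{
    // Encodes the bitmap as an uncompressed TIFF in memory and returns the bytes.
    public static byte[] ToUncompressedTiff(Bitmap source)
    {
        // Look up the installed TIFF encoder by MIME type.
        ImageCodecInfo tiffCodec = Array.Find(
            ImageCodecInfo.GetImageEncoders(),
            codec => codec.MimeType == "image/tiff");

        using (var parms = new EncoderParameters(1))
        using (var stream = new MemoryStream())
        {
            parms.Param[0] = new EncoderParameter(
                System.Drawing.Imaging.Encoder.Compression,
                (long)EncoderValue.CompressionNone);

            // Image.Save has an overload that takes a Stream instead of a file name.
            source.Save(stream, tiffCodec, parms);
            return stream.ToArray();
        }
    }
}
```

This avoids a temporary file when the OCR library can read from memory; whether a given library can is something to check in its own documentation.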

Cosh answered 27/8, 2010 at 2:58 Comment(1)
Just found there is already a .NET integration: pixel-technology.com/freeware/tessnet2 - Cryotherapy
S
4

OCR technology is usually tuned for scanned text at 200 dpi or more, with 300 dpi recommended for reliable results. Screen text is typically around 96 dpi, so expect to put some effort into tweaking settings to make it work.

ABBYY has screenshot OCR software, http://www.abbyy.com/screenshot_reader/, which shows that its technology can work well in these conditions. I use it, and it just works. You may want to contact ABBYY about its OCR SDK: http://www.abbyy.com/ocr_sdk/ (it can be used from .NET).

It is not cheap, but it works. Disclaimer: I work for ABBYY.

Shielashield answered 5/8, 2010 at 11:20 Comment(0)
S
1

You're essentially looking for the CAPTCHA circumvention tools various researchers have tried, some with success.

Another approach would be to use a smoothing (interpolating) resize to scale the 96 DPI captures up to 300 DPI (e.g., in Photoshop), then use standard OCR tools.
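The upscaling step can also be done in code rather than in an image editor. Below is a minimal sketch using System.Drawing with bicubic interpolation; the names UpscaleSketch and Upscale are hypothetical, and the 96 and 300 DPI defaults simply mirror the figures mentioned in this thread. Whether it actually improves recognition depends on the OCR engine and the fonts involved.

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;

static class UpscaleSketch
{
    // Resamples a capture from sourceDpi to targetDpi using bicubic interpolation.
    public static Bitmap Upscale(Bitmap source, float sourceDpi = 96f, float targetDpi = 300f)
    {
        float scale = targetDpi / sourceDpi; // 300 / 96 = 3.125
        int width = (int)Math.Round(source.Width * scale);
        int height = (int)Math.Round(source.Height * scale);

        var result = new Bitmap(width, height);
        // Record the new resolution in the image metadata.
        result.SetResolution(targetDpi, targetDpi);

        using (Graphics g = Graphics.FromImage(result))
        {
            // Smooth interpolation instead of nearest-neighbor pixel doubling.
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.PixelOffsetMode = PixelOffsetMode.HighQuality;
            g.DrawImage(source, 0, 0, width, height);
        }
        return result;
    }
}
```

The caller owns the returned Bitmap and should dispose it once the OCR pass is done.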

Smuggle answered 29/8, 2010 at 4:46 Comment(1)
I'm not looking for CAPTCHA solvers - none of the text is going to be scrambled in that way - but this will help nonetheless. =) - Bolyard
B
0

Use the OCR software from the first answer. For the screen capture, you could probably send a Print Screen (PrtScn) keystroke and then convert the clipboard contents (a bitmap) into a TIFF.

Hope this helps you a little further in your venture.
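As an alternative to going through the clipboard, here is a small sketch that grabs a screen region directly with Graphics.CopyFromScreen. This is a different technique from the Print Screen approach described in this answer, swapped in because it needs no keystroke simulation. ScreenCaptureSketch and CaptureRegion are made-up names, and the code assumes a Windows desktop session with a visible display.

```csharp
using System.Drawing;

static class ScreenCaptureSketch
{
    // Copies the given screen rectangle (in screen coordinates) into a new bitmap.
    public static Bitmap CaptureRegion(Rectangle region)
    {
        var capture = new Bitmap(region.Width, region.Height);
        using (Graphics g = Graphics.FromImage(capture))
        {
            // Source: top-left of the region on screen; destination: (0, 0) in the bitmap.
            g.CopyFromScreen(region.Location, Point.Empty, region.Size);
        }
        return capture;
    }
}
```

The resulting Bitmap can then be converted to TIFF like any other. If the clipboard route is preferred, Clipboard.GetImage() in System.Windows.Forms returns the copied bitmap, but it has to be called from an STA thread.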

Berne answered 31/8, 2010 at 9:33 Comment(0)
