converting cv::Mat for tesseract
I'm using OpenCV to extract a subimage of a scanned document and would like to use tesseract to perform OCR over this subimage.

I found out that I can use two methods for text recognition in tesseract, but so far I wasn't able to find a working solution.

A.) How can I convert a cv::Mat into a PIX*? (PIX is a Leptonica data type.)

Based on vasile's code below, this is essentially my current code:

 cv::Mat image = cv::imread("c:/image.png"); 
 cv::Mat subImage = image(cv::Rect(50, 200, 300, 100)); 

 int depth;
 if(subImage.depth() == CV_8U)
    depth = 8;
 //other cases not considered yet

 PIX* pix = pixCreateHeader(subImage.size().width, subImage.size().height, depth);
 pix->data = (l_uint32*) subImage.data; 

 tesseract::TessBaseAPI tess;
 STRING text; 
 if(tess.ProcessPage(pix, 0, 0, &text))
 {
    std::cout << text.string(); 
 }   

While it doesn't crash, the OCR result is still wrong. It should recognize the one word in my sample image, but instead it returns unreadable characters.

The method PIX_HEADER doesn't exist, so I used pixCreateHeader, but it doesn't take the number of channels as an argument. So how can I set the number of channels?

B.) How can I use cv::Mat for TesseractRect() ?

Tesseract offers another method for text recognition with this signature:

char * TessBaseAPI::TesseractRect   (   
    const UINT8 *   imagedata,
    int     bytes_per_pixel,
    int     bytes_per_line,
    int     left,
    int     top,
    int     width,
    int     height   
)   

Currently I am using the following code, but it also returns unreadable characters (although different ones than the code above):

char* cr = tess.TesseractRect(
           subImage.data, 
           subImage.channels(), 
           subImage.channels() * subImage.size().width, 
           0, 
           0, 
           subImage.size().width, 
           subImage.size().height);   
Eberto answered 13/11, 2011 at 22:44 Comment(0)
E
18
tesseract::TessBaseAPI tess; 
cv::Mat sub = image(cv::Rect(50, 200, 300, 100));
tess.SetImage((uchar*)sub.data, sub.size().width, sub.size().height, sub.channels(), sub.step1());
tess.Recognize(0);
const char* out = tess.GetUTF8Text();
Eberto answered 14/11, 2011 at 19:32 Comment(2)
I've never had success with this on iOS. Maybe this works when the native byte ordering matches what Leptonica wants? - Utimer
Indeed, @KaolinFire. It didn't work for me in C++ or on iOS (cross-platform). With the same image, analysing from a filename was OK, but analysing from a Mat was not. See #27001297 - Seigneury
S
5

For anybody using the JavaCPP presets of OpenCV/Tesseract, here is what works:

Mat img = imread("file.jpg");
Mat gray = new Mat();
cvtColor(img, gray, CV_BGR2GRAY);

// api is a Tesseract client which is initialised

api.SetImage(gray.data().asBuffer(), gray.size().width(), gray.size().height(), gray.channels(), gray.size1());
Scenic answered 12/5, 2017 at 8:18 Comment(0)
W
2
cv::Mat image = cv::imread(argv[1]);

cv::Mat gray;
cv::cvtColor(image, gray, CV_BGR2GRAY);

PIX *pixS = pixCreate(gray.size().width, gray.size().height, 8);

for(int i=0; i<gray.rows; i++) 
    for(int j=0; j<gray.cols; j++) 
        pixSetPixel(pixS, j,i, (l_uint32) gray.at<uchar>(i,j));
Winfrid answered 19/9, 2014 at 8:15 Comment(1)
What if my Mat image is binary? What should I change? - Enrique
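
A note on why the per-pixel loop above can succeed where assigning pix->data directly does not: as far as I understand Leptonica's layout, each image row is stored as 32-bit words, padded to a whole number of words, with the leftmost pixel in the most significant byte. A cv::Mat row is a plain run of bytes, so on a little-endian machine the bytes end up in the wrong order within each word. The library-free sketch below packs 8-bit rows that way. The names `wordsPerLine` and `packGray8Rows` are hypothetical helpers made up for this illustration, not Leptonica or OpenCV functions. A binary Mat holding only 0 and 255 is still 8-bit data, so under that assumption it could go through the same path unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of 32-bit words needed for one row of
// `width` pixels at `depth` bits per pixel (rows are padded to whole words).
inline std::size_t wordsPerLine(std::size_t width, std::size_t depth) {
    return (width * depth + 31) / 32;
}

// Hypothetical helper: packs 8-bit grayscale rows into 32-bit words with the
// leftmost pixel in the most significant byte. `src` points at the first row;
// consecutive rows are `srcStride` bytes apart (for a cv::Mat ROI that would
// be the parent image's step, not the ROI width).
inline std::vector<std::uint32_t> packGray8Rows(const std::uint8_t* src,
                                                std::size_t width,
                                                std::size_t height,
                                                std::size_t srcStride) {
    assert(src != nullptr && srcStride >= width);
    const std::size_t wpl = wordsPerLine(width, 8);
    std::vector<std::uint32_t> words(wpl * height, 0u);  // padding stays zero
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* row = src + y * srcStride;
        for (std::size_t x = 0; x < width; ++x) {
            // Pixel 0 goes to bits 31..24, pixel 1 to bits 23..16, and so on.
            const unsigned shift = 24u - 8u * static_cast<unsigned>(x % 4);
            words[y * wpl + x / 4] |= static_cast<std::uint32_t>(row[x]) << shift;
        }
    }
    return words;
}
```

For the 300-pixel-wide sub-image in the question, `wordsPerLine(300, 8)` gives 75 words per row, so no padding is needed there; widths that are not a multiple of 4 get zero-filled padding bytes.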
F
0

First, make a deep copy of your subImage, so that it is stored in a contiguous memory block:

cv::Mat subImage = image(cv::Rect(50, 200, 300, 100)).clone(); 

Then, initialize a PIX header with the correct parameters (I don't know how).

// ???? Put your own constructor here. 
PIX* pix = new PIX_HEADER(width, height, channels, depth); 

OR, create it manually:

PIX pix;
pix.width = subImage.width;
...

Then set the pix data pointer to the subImage data pointer

pix.data = subImage.data;

Finally, make sure your subImage object does not go out of scope before you finish your work with pix.
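
To illustrate why the deep copy matters: a cv::Mat ROI shares the parent's buffer, so consecutive rows are still one full parent row apart, and channels * width is not the distance between them. The sketch below shows the same idea without OpenCV. `copyRegion` is a hypothetical helper written for this example; it does by hand roughly what clone() does for a ROI.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies a w-by-h region starting at (left, top) out of
// an interleaved image whose rows are `parentStride` bytes apart. The result
// is tightly packed, so its bytes-per-line is exactly w * channels.
inline std::vector<std::uint8_t> copyRegion(const std::uint8_t* parent,
                                            std::size_t parentStride,
                                            std::size_t channels,
                                            std::size_t left, std::size_t top,
                                            std::size_t w, std::size_t h) {
    assert(parent != nullptr);
    assert((left + w) * channels <= parentStride);  // region fits in a row
    const std::size_t rowBytes = w * channels;
    std::vector<std::uint8_t> out(rowBytes * h);
    for (std::size_t y = 0; y < h; ++y) {
        // Each source row starts one full parent stride after the previous one.
        const std::uint8_t* srcRow =
            parent + (top + y) * parentStride + left * channels;
        std::copy(srcRow, srcRow + rowBytes, out.begin() + y * rowBytes);
    }
    return out;
}
```

If the ROI is passed on without copying, the bytes-per-line argument would instead have to be the parent's row stride (subImage.step in OpenCV) rather than channels * width.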

Frenum answered 14/11, 2011 at 6:48 Comment(0)
