How to detect the orientation of a scanned document?

I'd like to detect and, if necessary, correct the orientation of a scanned document image. I am already able to deskew documents, but a document may still be upside down and need to be rotated by 180°.

Using tesseract's layout analysis feature it should be possible to determine a document's orientation using this code:

    tesseract::TessBaseAPI api; 
    api.Init(argv[0], "eng");
    api.SetImage(img); 
    api.SetPageSegMode(tesseract::PSM_AUTO_OSD); 
    tesseract::PageIterator* it = api.AnalyseLayout();

    tesseract::Orientation orient;
    tesseract::WritingDirection dir;
    tesseract::TextlineOrder order; 
    float f;
    it->Orientation(&orient, &dir, &order, &f); 

    if(orient == tesseract::Orientation::ORIENTATION_PAGE_UP)
        std::cout << "Page Up\t"; 
    else if(orient == tesseract::Orientation::ORIENTATION_PAGE_LEFT)
        std::cout << "Page Left\t"; 
    else if(orient == tesseract::Orientation::ORIENTATION_PAGE_DOWN)
        std::cout << "Page Down\t"; 
    else if(orient == tesseract::Orientation::ORIENTATION_PAGE_RIGHT)
        std::cout << "Page Right\t";

However, the code doesn't seem to work correctly: it always returns ORIENTATION_PAGE_UP when a document is in portrait format and ORIENTATION_PAGE_LEFT when it is in landscape format. (ORIENTATION_PAGE_DOWN and ORIENTATION_PAGE_RIGHT are valid values, but are never returned.)

A.) Is there anything wrong with the code above?

B.) How else can I determine a document's orientation?

Vindicate answered 17/11, 2011 at 20:7 Comment(0)

What about just running your detection, evaluating the detection rate, and then doing the same thing with the page flipped? The better rate gives the right orientation.
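
A minimal sketch of that comparison, kept independent of any OCR library. The names `pickBestRotation` and `score` are hypothetical: `score` stands for whatever you use to rotate the page by the given angle, run the OCR, and return a quality measure where higher is better (for example a mean word confidence).

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Returns the candidate rotation (in degrees) with the highest score.
// `score` is a hypothetical callback supplied by the caller: it is expected
// to rotate the page by the given angle, run OCR, and return a quality value.
// Returns 0 if no candidates are given; ties keep the earlier candidate.
int pickBestRotation(const std::vector<int>& candidates,
                     const std::function<double(int)>& score) {
    if (candidates.empty()) {
        return 0;
    }
    int bestAngle = candidates[0];
    double bestScore = score(bestAngle);
    for (std::size_t i = 1; i < candidates.size(); ++i) {
        const double s = score(candidates[i]);
        if (s > bestScore) {
            bestScore = s;
            bestAngle = candidates[i];
        }
    }
    return bestAngle;
}
```

For the upside-down case only, the candidates would be `{0, 180}`; passing `{0, 90, 180, 270}` covers all four orientations at the cost of four OCR runs.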

Cameral answered 17/11, 2011 at 20:34 Comment(2)
Unless a page is turned upside down, tesseract is still able to recognize the text correctly, so it is able to find the correct orientation somehow. Anyway, thanks for the suggestion. I'll try this: if I extract a small area and perform OCR four times, I just have to find the invalid string to know when it is turned upside down. (Vindicate)
Can you explain with some examples? (Gossamer)
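
As a rough example of how to "find the invalid string" from the first comment: one option is to score each OCR result by the share of its tokens that are real words. This is only a sketch under assumptions. `validWordRatio` is a hypothetical helper, and the dictionary is something you would have to supply yourself (a few hundred common words of the document's language may be enough).

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Fraction of whitespace-separated tokens in `text` that are found in
// `dictionary`. Each token is reduced to its letters and lowercased before
// the lookup, so "Fox." matches "fox". Returns 0.0 for empty input.
double validWordRatio(const std::string& text,
                      const std::set<std::string>& dictionary) {
    std::istringstream in(text);
    std::string token;
    int total = 0;
    int valid = 0;
    while (in >> token) {
        std::string word;
        for (unsigned char c : token) {
            if (std::isalpha(c)) {
                word += static_cast<char>(std::tolower(c));
            }
        }
        ++total;
        if (!word.empty() && dictionary.count(word) > 0) {
            ++valid;
        }
    }
    return total == 0 ? 0.0 : static_cast<double>(valid) / total;
}
```

Text recognized from an upside-down crop tends to contain few dictionary words, so the rotation whose OCR output gets the highest ratio is the likely correct one.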
