Pytesseract is very slow for real time OCR, any way to optimise my code?

I'm trying to create a real-time OCR in Python using mss and pytesseract.

So far, I've been able to capture my entire screen at a steady 30 FPS. If I capture a smaller area of around 500x500, I get 100+ FPS.

However, as soon as I include the line text = pytesseract.image_to_string(img), the rate drops to 0.8 FPS. Is there any way I could optimise my code to get a better FPS? The code does detect text; it's just extremely slow.

from mss import mss
import cv2
import numpy as np
from time import time
import pytesseract

# Raw string, so single backslashes are enough
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\Vamsi\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'

with mss() as sct:
    # Part of the screen to capture
    monitor = {"top": 200, "left": 200, "width": 500, "height": 500}

    while "Screen capturing":
        begin_time = time()

        # Get raw pixels from the screen, save it to a Numpy array
        img = np.array(sct.grab(monitor))

        # Finds text from the images
        text = pytesseract.image_to_string(img)

        # Display the picture
        cv2.imshow("Screen Capture", img)

        # Display FPS
        print('FPS {}'.format(1 / (time() - begin_time)))

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break
Babe answered 23/2, 2021 at 14:8 Comment(3)
Recognising text from images is very CPU intensive. As a first step I would look at binarizing the input that is passed into image_to_string; this can speed up text recognition significantly. – Tinytinya
@Tinytinya So I added ret, img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY) before pytesseract takes in the image, but performance is still slow, under 1 FPS. Is there anything that I'm doing wrong? – Babe
I also changed the image to grayscale using img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), and changed the threshold call above to (thresh, bw_img) = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU). It's still very slow, ~1 FPS. – Babe
C
2

After looking at the pytesseract code, I see that it converts the image format and saves the image locally before feeding it to tesseract. By changing from PNG to JPG I got a 3x speedup (from 9.5 to 3 seconds per image). I guess there is more optimization that could be done in the Python part of the code.

Chirm answered 28/12, 2022 at 11:44 Comment(0)
O
1

You can use EasyOCR, a lightweight Python package for OCR applications. It is very fast, reliable, and supports 70+ languages, including English, Chinese, Japanese, Korean, and Hindi, with more being added.

pip install easyocr

Check this out: https://huggingface.co/spaces/tomofi/EasyOCR

Odisodium answered 1/10, 2022 at 17:58 Comment(0)
B
1

I had this same problem. OCRing a document in the native desktop environment took 5 seconds, while the same document took 200+ seconds when running in Docker on the same machine.

The solution turned out to be adding:

ENV OMP_THREAD_LIMIT=1

to my Dockerfile.

This disables multithreading in Tesseract. Why that makes it faster in Docker, I couldn't tell you, but it brings it down close to native performance for me.
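
Outside Docker, the same variable can be set from Python before the first OCR call. The sketch below assumes OCR runs through pytesseract, which starts the tesseract executable as a child process that inherits the parent's environment. The helper child_sees_limit is hypothetical and only demonstrates that inheritance; whether the setting helps on a given machine is something to measure.

```python
import os
import subprocess
import sys

# Set this before the first OCR call: a child process copies the
# parent's environment at the moment it is launched.
os.environ["OMP_THREAD_LIMIT"] = "1"


def child_sees_limit():
    """Start a child process and return the OMP_THREAD_LIMIT it inherited."""
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ.get('OMP_THREAD_LIMIT'))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Any pytesseract.image_to_string call made after the assignment starts tesseract with the limit in place.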

Bratton answered 23/6, 2024 at 3:40 Comment(0)
M
0

pytesseract is not efficient "by default": it wraps the tesseract executable, saves temporary files to disk, and so on. If you are serious about performance, you need to use the Tesseract API directly (e.g. via tesserocr or by creating a custom API wrapper).
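
A minimal sketch of the tesserocr route, assuming the package is installed and frames come from mss as BGRA arrays. The helper names create_api and ocr_frame are illustrative. The point is that the engine is initialised once and each frame is handed over in memory instead of through temporary files.

```python
import numpy as np
from PIL import Image


def create_api():
    """Create one long-lived Tesseract handle (requires the tesserocr package)."""
    # Imported here so ocr_frame below can be read and tested
    # without tesserocr installed.
    from tesserocr import PyTessBaseAPI
    return PyTessBaseAPI()


def ocr_frame(api, frame_bgra):
    """OCR one BGRA frame in memory, reusing an already initialised API object."""
    # Drop the alpha channel and reverse the channel order to get RGB.
    rgb = np.ascontiguousarray(frame_bgra[:, :, 2::-1])
    api.SetImage(Image.fromarray(rgb))
    return api.GetUTF8Text()
```

Typical use would be api = create_api() once before the capture loop, text = ocr_frame(api, img) for each frame, and api.End() on exit.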

Mealymouthed answered 23/2, 2021 at 16:58 Comment(6)
I've been trying to install tesserocr for the past few hours and it's so painful on Windows 10. I'm using PyCharm and the tesserocr package just does not want to install. – Babe
I know. On Linux or Mac it should be easy. IMO there would need to be a bigger Windows user group for recent Python versions to be supported. – Mealymouthed
try this: sk-spell.sk.cx/… – Mealymouthed
I did some comparative tests between pytesseract and tesserocr, but the performance difference is not as large as stated. – Viola
Compared pytesseract to tesserocr on Linux and witnessed almost identical runtimes. – Jacalynjacamar
Yes, on modern hardware (SSD, virtualized environment) the difference is not so big. But for real-time OCR, storing the input image to disk, initializing tesseract, storing the output to disk, and reading the output back from disk on every call is a waste of time, even if we are talking about milliseconds. That does not mean pytesseract is bad. It wraps the tesseract executable (instead of the library), which has pros and cons... – Mealymouthed
