I'm trying to create a real-time OCR in Python using mss and pytesseract.
So far, I've been able to capture my entire screen at a steady 30 FPS. If I capture a smaller area of around 500x500, I get 100+ FPS.
However, as soon as I include the line text = pytesseract.image_to_string(img), boom, 0.8 FPS. Is there any way I could optimise my code to get a better FPS? The code is able to detect text, it's just extremely slow.
from mss import mss
import cv2
import numpy as np
from time import time
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\\Users\\Vamsi\\AppData\\Local\\Programs\\Tesseract-OCR\\tesseract.exe'

with mss() as sct:
    # Part of the screen to capture
    monitor = {"top": 200, "left": 200, "width": 500, "height": 500}

    while "Screen capturing":
        begin_time = time()

        # Get raw pixels from the screen, save it to a Numpy array
        img = np.array(sct.grab(monitor))

        # Finds text from the images
        text = pytesseract.image_to_string(img)

        # Display the picture
        cv2.imshow("Screen Capture", img)

        # Display FPS
        print('FPS {}'.format(1 / (time() - begin_time)))

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break
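Before changing anything, it may help to confirm where each iteration's time actually goes. The helper below is a minimal sketch using only the standard library; `time_stage` is a hypothetical name introduced here, not part of mss or pytesseract. It times any callable, so the capture step and the OCR step from the loop above can be measured separately.

```python
from time import perf_counter


def time_stage(fn, repeats=5):
    """Call fn() `repeats` times; return (last result, mean seconds per call)."""
    start = perf_counter()
    result = None
    for _ in range(repeats):
        result = fn()
    mean_seconds = (perf_counter() - start) / repeats
    return result, mean_seconds


# Hypothetical usage inside the `with mss() as sct:` block from the question:
#   img, t_grab = time_stage(lambda: np.array(sct.grab(monitor)))
#   text, t_ocr = time_stage(lambda: pytesseract.image_to_string(img))
#   print('grab {:.4f}s  ocr {:.4f}s'.format(t_grab, t_ocr))
```

If the OCR stage accounts for nearly all of the time, as the drop from 100+ FPS to 0.8 FPS suggests, then changes to the capture or display code are unlikely to help much.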
I added ret, img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY) before pytesseract takes in the image, but performance is still slow, under 1 FPS. Is there anything that I'm doing wrong? – Babe
I added img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), and above I changed it to (thresh, bw_img) = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU). It's still very slow, ~1 FPS. – Babe
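Since thresholding did not change the frame rate, the cost appears to sit in the image_to_string call itself rather than in the image content. One option, sketched below under that assumption, is to keep the capture loop fast by running OCR in a worker thread that only ever processes the newest frame. `BackgroundOCR` is a hypothetical helper written for this illustration, not something mss or pytesseract provides. It does not make OCR any faster; the text simply lags behind the displayed frame.

```python
import threading


class BackgroundOCR:
    """Run a slow OCR callable on the newest submitted frame in a worker thread."""

    def __init__(self, ocr_fn):
        self._ocr_fn = ocr_fn
        self._cond = threading.Condition()
        self._frame = None    # newest frame that has not been processed yet
        self._text = ""       # most recent OCR result
        self._running = True
        self._thread = threading.Thread(target=self._worker, daemon=True)
        self._thread.start()

    def submit(self, frame):
        """Hand over a frame; any older frame still pending is dropped."""
        with self._cond:
            self._frame = frame
            self._cond.notify()

    def latest_text(self):
        """Return the most recent OCR result without blocking on OCR."""
        with self._cond:
            return self._text

    def stop(self):
        """Ask the worker to exit and wait for it to finish."""
        with self._cond:
            self._running = False
            self._cond.notify()
        self._thread.join()

    def _worker(self):
        while True:
            with self._cond:
                while self._running and self._frame is None:
                    self._cond.wait()
                if not self._running:
                    return
                frame, self._frame = self._frame, None
            text = self._ocr_fn(frame)  # slow call runs outside the lock
            with self._cond:
                self._text = text
```

In the loop from the question, this would roughly mean creating ocr = BackgroundOCR(pytesseract.image_to_string) once before the loop, then replacing the image_to_string line with ocr.submit(img) followed by text = ocr.latest_text(), and calling ocr.stop() before exiting. The capture and display rate should then no longer be tied to the OCR rate, though the text itself would still refresh only about once per second.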