Highlighting specific text in an image using python

Asked 9/1, 2019 at 17:51 Answered 29/4, 2024 at 10:34

Solved python-3.x computer-vision ocr python-tesseract

I want to highlight specific words/sentences in a website screenshot.

Once the screenshot is taken, I extract the text using pytesseract and cv2. That works well and I can get text and data about it.

import pytesseract
import cv2


if __name__ == "__main__":
    img = cv2.imread('test.png')
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    result = pytesseract.image_to_data(img, lang='eng', nice=0, output_type=pytesseract.Output.DICT)
    print(result)

Using the result object, I can find the needed words and sentences.

The question is how to go back to the image and highlight those words.

Should I be looking at other libraries, or is there a way to get pixel values and then highlight the text?

Ideally, I would like to get the start and end coordinates of each word. How can that be done?

Bullard asked 9/1, 2019 at 17:51 Comment(0)

You can use the pytesseract.image_to_boxes method to get the bounding box of each character identified in your image. You can also use it to draw bounding boxes around only some specific characters if you want. The code below draws a rectangle around every character identified in my image.

import cv2
import pytesseract
import matplotlib.pyplot as plt

filename = 'sf.png'

# read the image and get the dimensions
img = cv2.imread(filename)
h, w, _ = img.shape # assumes color image

# run tesseract, returning the bounding boxes
boxes = pytesseract.image_to_boxes(img)
print(pytesseract.image_to_string(img))  # print identified text

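# draw the bounding boxes on the image
# each line is "<char> <left> <bottom> <right> <top> <page>", origin at bottom-left
for b in boxes.splitlines():
    b = b.split()
    cv2.rectangle(img, (int(b[1]), h - int(b[2])), (int(b[3]), h - int(b[4])), (0, 255, 0), 2)

# OpenCV stores images as BGR; matplotlib expects RGB
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()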
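
To make the coordinate handling easier to follow, here is a minimal sketch that parses box-format text without running Tesseract. It assumes each line has the form `<char> <left> <bottom> <right> <top> <page>` with the origin at the bottom-left corner, which is why the y values are flipped. The helper name `parse_boxes` and the sample values are made up for illustration.

```python
def parse_boxes(box_text, image_height):
    """Convert box-format text into rectangles with a top-left origin.

    Returns a list of (char, (x1, y1), (x2, y2)) tuples that can be passed
    to cv2.rectangle as the two corner points.
    """
    rects = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        char = parts[0]
        left, bottom, right, top = (int(p) for p in parts[1:5])
        # Flip the y axis: box coordinates count up from the bottom edge,
        # image coordinates count down from the top edge.
        rects.append((char, (left, image_height - top), (right, image_height - bottom)))
    return rects


# Made-up sample in box format, for an image that is 100 pixels high.
sample = "H 10 70 20 90 0\ni 22 70 26 88 0"
rects = parse_boxes(sample, image_height=100)
print(rects)
```

To highlight only specific characters, you could filter `rects` on the first tuple element before drawing.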

Celestina answered 10/1, 2019 at 9:10 Comment(5)

Awesome, that is very helpful. I am still trying to find the answer for the structure of the output. Do you have any reference? – Bullard 10/1, 2019 at 18:17

@Bullard Meaning of "structure of output"? Do you mean the output of pytesseract.image_to_boxes ? – Celestina 11/1, 2019 at 14:43

The output of pytesseract.image_to_boxes and pytesseract.image_to_data. I had to stare at it for a few hours to figure out the structure and the meaning of the numbers. For example, word_num restarts its enumeration on each line_num, and line_num restarts on each block_num. Still not sure what par_num means. – Bullard 11/1, 2019 at 17:12

@Bullard Were you able to highlight specific words back in the image, if so can you help me too? – Sarajane 16/11, 2020 at 12:45

Hey, I'm actually working on a very similar task. I'm only interested in a specific piece of text, and I managed to create a mask around it. I want to recover the color of the text; so far my idea is to calculate the average color of each character's rectangle and compare those values. Thanks for your solution 🤗 – Fawnia 12/11, 2021 at 7:44
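
On the structure of the image_to_data output discussed in the comments: the columns follow Tesseract's TSV layout, where level runs from 1 (page) through 2 (block), 3 (paragraph) and 4 (line) to 5 (word), and par_num is the paragraph index within a block. A rough sketch of how the counters nest, using a hand-written dictionary in place of real OCR output (the helper name `group_words_by_line` and the sample values are assumptions for illustration):

```python
from collections import defaultdict


def group_words_by_line(data):
    """Join word entries into one string per (page, block, paragraph, line)."""
    lines = defaultdict(list)
    for i, text in enumerate(data["text"]):
        # Level 5 rows are words; rows at levels 1-4 describe containers
        # and carry an empty text field.
        if data["level"][i] != 5 or not text.strip():
            continue
        key = (data["page_num"][i], data["block_num"][i],
               data["par_num"][i], data["line_num"][i])
        lines[key].append(text)
    return {key: " ".join(words) for key, words in lines.items()}


# Hand-written sample shaped like the DICT output (not real OCR results).
data = {
    "level":     [1, 2, 3, 4, 5, 5, 4, 5],
    "page_num":  [1, 1, 1, 1, 1, 1, 1, 1],
    "block_num": [0, 1, 1, 1, 1, 1, 1, 1],
    "par_num":   [0, 0, 1, 1, 1, 1, 1, 1],
    "line_num":  [0, 0, 0, 1, 1, 1, 2, 2],
    "word_num":  [0, 0, 0, 0, 1, 2, 0, 1],
    "text":      ["", "", "", "", "Hello", "world", "", "Bye"],
}
lines = group_words_by_line(data)
print(lines)
```

Because word_num restarts on every line, the full tuple of counters is needed to identify a line uniquely.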

I hope this gives a correct solution to the question of how to highlight text in the image. It is late, but it may be useful to someone.

import re

import cv2
import pytesseract


filename = 'quotes1.jpg'
text_search = "happiness"

# read the image
img = cv2.imread(filename)

# run tesseract, returning one entry per detected page/block/paragraph/line/word
data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
print(data)
boxes = len(data['level'])

# draw all filled rectangles on a single copy so that every match is kept
overlay = img.copy()
for i in range(boxes):
    if re.search(text_search, data['text'][i], re.IGNORECASE):
        (x, y, w, h) = data['left'][i], data['top'][i], data['width'][i], data['height'][i]
        cv2.rectangle(overlay, (x, y), (x + w, y + h), (255, 0, 0), -1)

alpha = 0.4  # Transparency factor.
# The following line blends the filled rectangles over the original image
img_new = cv2.addWeighted(overlay, alpha, img, 1 - alpha, 0)
cv2.imwrite("output.jpg", img_new)

Thank You!

Nucleo answered 29/4, 2024 at 10:34 Comment(0)
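
Since the question also mentions sentences, here is a sketch of one possible way to locate a multi-word phrase. It assumes the phrase appears as consecutive word entries in the image_to_data dictionary and returns the union of their boxes. The helper name `find_phrase_boxes`, the punctuation stripping, and the sample values are assumptions for illustration and are not part of pytesseract.

```python
def _norm(word):
    """Lowercase a word and strip common surrounding punctuation."""
    return word.lower().strip(".,;:!?\"'")


def find_phrase_boxes(data, phrase):
    """Return one (x1, y1, x2, y2) box per occurrence of `phrase`."""
    target = [_norm(w) for w in phrase.split()]
    if not target:
        return []
    # Indices of non-empty entries, in the order Tesseract reported them.
    idx = [i for i, t in enumerate(data["text"]) if t.strip()]
    words = [_norm(data["text"][i]) for i in idx]
    boxes = []
    for start in range(len(words) - len(target) + 1):
        if words[start:start + len(target)] != target:
            continue
        hit = idx[start:start + len(target)]
        # Union of the matched word boxes.
        x1 = min(data["left"][i] for i in hit)
        y1 = min(data["top"][i] for i in hit)
        x2 = max(data["left"][i] + data["width"][i] for i in hit)
        y2 = max(data["top"][i] + data["height"][i] for i in hit)
        boxes.append((x1, y1, x2, y2))
    return boxes


# Hand-written sample shaped like the DICT output (not real OCR results).
data = {
    "text":   ["", "Pursuit", "of", "happiness,", "always"],
    "left":   [0, 10, 80, 110, 220],
    "top":    [0, 20, 22, 20, 21],
    "width":  [300, 60, 20, 100, 60],
    "height": [50, 18, 14, 20, 18],
}
phrase_boxes = find_phrase_boxes(data, "of happiness")
print(phrase_boxes)
```

Each returned box can be drawn with cv2.rectangle(overlay, (x1, y1), (x2, y2), color, -1). Note that a phrase that wraps across two lines would produce a single box covering both lines, so grouping by line_num first may give nicer results.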