How can I locate something on my screen quickly in Python?

I've tried using the pyautogui module and its function that locates an image on the screen:

pyautogui.locateOnScreen()

but its processing time is about 5-10 seconds. Is there any other way for me to locate an image on the screen more quickly? Basically, I want a faster version of locateOnScreen().
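For reference, this is roughly how I'm calling it, with a timer around the call ('button.png' is just a placeholder for my actual target image):

import time
import pyautogui

start = time.time()
# search the whole screen for the needle image
box = pyautogui.locateOnScreen('button.png')
print(box)
print('took', round(time.time() - start, 2), 'seconds')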

Perr answered 23/3, 2017 at 10:49 Comment(0)

The official documentation says that it should take 1-2 seconds on a 1920x1080 screen. Your time seems to be a bit slow. I would try optimizing the code by doing the following:

  • Use grayscale unless color information needs to be considered (passing grayscale=True should improve speed by approximately 30%).
  • Use a smaller needle image for detection (only a part of the desired target is needed, as long as it is unique enough to identify the target's position).
  • Don't load the needle image from local storage on every search; keep it in memory.
  • Pass in a region argument if you know or can predict its possible location (e.g. from previous runs).

This is all described in the documentation linked above.
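Putting these tips together, a rough sketch might look like the following (the file name and region values are just placeholders):

import pyautogui
from PIL import Image

# Load a small, unique crop of the target once and keep it in memory.
NEEDLE = Image.open('target_crop.png')

# Region (left, top, width, height) remembered from a previous run.
LAST_REGION = (100, 200, 300, 150)

box = pyautogui.locateOnScreen(NEEDLE, grayscale=True, region=LAST_REGION)
print(box)  # Box(left, top, width, height) when found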

If this is still not fast enough, you can check the sources of pyautogui and see that the screen search uses a specific algorithm (the Knuth-Morris-Pratt search algorithm) implemented in Python. Implementing this part in C may result in a quite pronounced speedup.

Carioca answered 23/3, 2017 at 12:27 Comment(4)
This answer was really helpful, but after trying all of these except the last suggestion about implementing the search algorithm in C, I only got about a 10% performance improvement! – Talca
@MohammadGanji You may want to try Cython to speed up the algorithm at a reasonable implementation cost. – Carioca
Thanks, I actually got to the desired speed by zooming out the page I needed to process and using a lower confidence for the image match. – Talca
locateOnScreen and Knuth-Morris-Pratt: how is searching for an image related to a substring-search algorithm? Also, I see the cv2 template search in error logs. – Rheims

Make a function that keeps looking until it finds the image; you can run it in a thread, and the confidence argument requires OpenCV:

import pyautogui
import threading  # optional: run locate_cat() in a background thread

def locate_cat():
    cat = None
    while cat is None:
        cat = pyautogui.locateOnScreen('Pictures/cat.png', confidence=.65,
                                       region=(1722, 748, 200, 450))
    return cat

You can use the region argument if you know the rough location of the target on screen.

In some cases you can locate the target once, assign the resulting region to a variable, and pass region=somevar on later calls so the search looks in the same place it found the target last time; this helps speed up the detection process.

For example:

import pyautogui

def first_find():
    # Search the fixed region where the front door usually appears.
    front_door = None
    while front_door is None:
        front_door = pyautogui.locateOnScreen('frontdoor.png', confidence=.95,
                                              region=(1722, 748, 200, 450))
    return front_door


def second_find(last_region):
    # Search only the region where the front door was found last time.
    front_door = None
    while front_door is None:
        front_door = pyautogui.locateOnScreen('frontdoor.png', confidence=.95,
                                              region=last_region)
    return front_door


def find_person(door_region):
    # Look for the person inside the door region.
    person = None
    while person is None:
        person = pyautogui.locateOnScreen('person.png', confidence=.95,
                                          region=door_region)
    return person


while True:
    front_door_save = first_find()
    front_door = second_find(front_door_save)
    if front_door is not None:
        find_person(front_door)
Lecture answered 7/10, 2020 at 16:38 Comment(0)

I faced the same issue with pyautogui. Though it is a very convenient library, it is quite slow.

I gained a 10x speedup by relying on cv2 and PIL:

def benchmark_opencv_pil(method):
    img = ImageGrab.grab(bbox=REGION)
    img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
    res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, method)
    # print(res)
    return (res >= 0.8).any()

Using TM_CCOEFF_NORMED worked well (and you can obviously adjust the 0.8 threshold).

Source : Fast locateOnScreen with Python
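The snippet above only tells you whether the template is present; if you also need its position (what locateOnScreen returns), cv.minMaxLoc gives the best-match coordinates. A sketch along the same lines (this helper is not part of the original code; it reuses REGION and GAME_OVER_PICTURE_CV from the benchmark below):

def locate_opencv_pil(threshold=0.8):
    # Same approach, but also return the match position as (left, top, width, height).
    img = ImageGrab.grab(bbox=REGION)
    img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
    res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, cv.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv.minMaxLoc(res)
    if max_val < threshold:
        return None
    h, w = GAME_OVER_PICTURE_CV.shape[:2]
    return (max_loc[0], max_loc[1], w, h)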

For the sake of completeness, here is the full benchmark:

import pyautogui as pg
import numpy as np
import cv2 as cv
from PIL import ImageGrab, Image
import time

REGION = (0, 0, 400, 400)
GAME_OVER_PICTURE_PIL = Image.open("./balloon_fight_game_over.png")
GAME_OVER_PICTURE_CV = cv.imread('./balloon_fight_game_over.png')


def timing(f):
    def wrap(*args, **kwargs):
        time1 = time.time()
        ret = f(*args, **kwargs)
        time2 = time.time()
        print('{:s} function took {:.3f} ms'.format(
            f.__name__, (time2-time1)*1000.0))

        return ret
    return wrap


@timing
def benchmark_pyautogui():
    res = pg.locateOnScreen(GAME_OVER_PICTURE_PIL,
                            grayscale=True,  # should provide a speed-up
                            confidence=0.8,
                            region=REGION)
    return res is not None


@timing
def benchmark_opencv_pil(method):
    img = ImageGrab.grab(bbox=REGION)
    img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
    res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, method)
    # print(res)
    return (res >= 0.8).any()


if __name__ == "__main__":

    im_pyautogui = benchmark_pyautogui()
    print(im_pyautogui)

    methods = ['cv.TM_CCOEFF', 'cv.TM_CCOEFF_NORMED', 'cv.TM_CCORR',
               'cv.TM_CCORR_NORMED', 'cv.TM_SQDIFF', 'cv.TM_SQDIFF_NORMED']


    # cv.TM_CCOEFF_NORMED actually seems to be the most relevant method
    for method in methods:
        print(method)
        im_opencv = benchmark_opencv_pil(eval(method))
        print(im_opencv)

And the results show a 10x improvement.

benchmark_pyautogui function took 175.712 ms
False
cv.TM_CCOEFF
benchmark_opencv_pil function took 21.283 ms
True
cv.TM_CCOEFF_NORMED
benchmark_opencv_pil function took 23.377 ms
False
cv.TM_CCORR
benchmark_opencv_pil function took 20.465 ms
True
cv.TM_CCORR_NORMED
benchmark_opencv_pil function took 25.347 ms
False
cv.TM_SQDIFF
benchmark_opencv_pil function took 23.799 ms
True
cv.TM_SQDIFF_NORMED
benchmark_opencv_pil function took 22.882 ms
True
Tackett answered 15/6, 2021 at 8:14 Comment(0)

All the answers that suggest using region to create some speed-up are incorrect, as pyscreeze (the library used by pyautogui) discards the region argument:

        # the locateAll() function must handle cropping to return accurate coordinates,
        # so don't pass a region here.
        screenshotIm = screenshot(region=None)
        retVal = locate(image, screenshotIm, **kwargs)

Source code of pyscreeze where region is discarded
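If your pyscreeze version really does behave this way, one possible workaround (just a sketch; the file name and region values are placeholders) is to screenshot only the region yourself and search inside that image with pyautogui.locate():

import pyautogui

REGION = (1722, 748, 200, 450)  # left, top, width, height

# Grab only the region of interest, then search within that screenshot.
haystack = pyautogui.screenshot(region=REGION)
box = pyautogui.locate('frontdoor.png', haystack, confidence=.95)
if box is not None:
    # box is relative to the cropped screenshot; offset back to screen coordinates.
    print(box.left + REGION[0], box.top + REGION[1])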

Uppercut answered 16/2 at 21:4 Comment(0)

If you are looking for image recognition, you can use Sikuli. Check the Hello World tutorial.

Brackish answered 23/3, 2017 at 11:20 Comment(0)
