numpy template matching using matrix multiplications

Asked 3/10, 2012 at 19:41 Answered 4/10, 2012 at 15:24

I am trying to match a template with a binary image (only black and white) by shifting the template along the image. And return the minimum distance between the template and the image with the corresponding position on which this minimum distance did occur. For example:

img:

0 1 0
0 0 1
0 1 1

template:

0 1
1 1

This template matches the image best at position (1,1) and the distance will then be 0. So far things are not too difficult and I already got some code that does the trick.

def match_template(img, template):
    mindist = float('inf')
    idx = (-1,-1)
    for y in xrange(img.shape[1]-template.shape[1]+1):
        for x in xrange(img.shape[0]-template.shape[0]+1):
        #calculate Euclidean distance
        dist = np.sqrt(np.sum(np.square(template - img[x:x+template.shape[0],y:y+template.shape[1]])))
        if dist < mindist:
            mindist = dist
            idx = (x,y)
    return [mindist, idx]

But for images of the size I need (image among 500 x 200 pixels and template among 250 x 100) this already takes approximately 4.5 seconds, which is way too slow. And I know the same thing can be done much quicker using matrix multiplications (in matlab I believe this can be done using im2col and repmat). Can anyone explain me how to do it in python/numpy?

btw. I know there is an opencv matchTemplate function that does exactly what I need, but since I might need to alter the code slightly later on I would prefer a solution which I fully understand and can alter.

Thanks!

edit: If anyone can explain me how opencv does this in less than 0.2 seconds that would also be great. I have had a short look at the source code, but those things somehow always look quite complicated to me.

edit2: Cython code

import numpy as np
cimport numpy as np

DTYPE = np.int
ctypedef np.int_t DTYPE_t

def match_template(np.ndarray img, np.ndarray template):
    cdef float mindist = float('inf')
    cdef int x_coord = -1
    cdef int y_coord = -1
    cdef float dist
    cdef unsigned int x, y
    cdef int img_width = img.shape[0]
    cdef int img_height = img.shape[1]
    cdef int template_width = template.shape[0]
    cdef int template_height = template.shape[1]
    cdef int range_x = img_width-template_width+1
    cdef int range_y = img_height-template_height+1
    for y from 0 <= y < range_y:
        for x from 0 <= x < range_x:
            dist = np.sqrt(np.sum(np.square(template - img[ x:<unsigned int>(x+template_width), y:<unsigned int>(y+template_height) ]))) #calculate euclidean distance
            if dist < mindist:
                mindist = dist
                x_coord = x
                y_coord = y
    return [mindist, (x_coord,y_coord)]

img = np.asarray(img, dtype=DTYPE)
template = np.asarray(template, dtype=DTYPE)
match_template(img, template)

Mckee answered 3/10, 2012 at 19:41 Comment(2)

Different problem, but possibly the same solution here. – Aflame 3/10, 2012 at 21:39

@Aflame The solutions there all make use of the fact that only a 3x3 sliding window is needed, but I need something that works for all sizes of templates. – Mckee 4/10, 2012 at 8:20

One possible way of doing what you want is via convolution (which can be brute force or FFT). Matrix multiplications AFAIK won't work. You need to convolve your data with the template. And find the maximum (you'll also need to do some scaling to make it work properly).

xs=np.array([[0,1,0],[0,0,1],[0,1,1]])*1.
ys=np.array([[0,1],[1,1]])*1.
print scipy.ndimage.convolve(xs,ys,mode='constant',cval=np.inf)
>>> array([[  1.,   1.,  inf],
       [  0.,   2.,  inf],
       [ inf,  inf,  inf]])

print scipy.signal.fftconvolve(xs,ys,mode='valid') 
>>> array([[ 1.,  1.],
           [ 0.,  2.]])

Figuration answered 4/10, 2012 at 15:24 Comment(5)

I tried to run a convolution like yours for images of the size I need, but it took more than 2 minutes to finish. Probably because I don't really use a kernel (e.g. 3x3) but have a template (e.g. 250 x 100) to match the image with. – Mckee 4/10, 2012 at 16:12

Then you should use the FFT convolution (I added the code to added to my answer) – Figuration 4/10, 2012 at 17:23

The fast fourier transform convolve does give me some answer in 0.5 seconds, but I don't exactly understand how to intrepret it. In case of the larger images, I get a matrix with all very large values, which don't make any sense to me. Also it is slower than the opencv matchTemplate function and I would still be unable to alter the code as I wanted to.If anyone could explain me how it's possible that opencv does it so much quicker, that would be nice too. – Mckee 4/10, 2012 at 20:46

Regarding how to interpret the results, I'll forward you to here en.wikipedia.org/wiki/Matched_filter. Regarding the speed of opencv, I looked at the docs, they have multiple methods opencv.willowgarage.com/documentation/cpp/object_detection.html which obviously will have different performance. CV_TM_CCORR is the method equivalent to the convolution that I suggested. – Figuration 4/10, 2012 at 21:17

But all opencv methods within matchTemplate are much (250-500 times) quicker than what I do above, it can not just be the method of matching, it also must be the way they implemented it. Anyway thanks for your help so far, but unfortunately the convolution methods you suggested are no better than using an opencv function for me. – Mckee 4/10, 2012 at 21:55

There may be a fancy way to get this done using pure numpy/scipy magic. But it might be easier (and more understandable when you look at the code in the future) to just drop into Cython to get this done. There's a good tutorial for integrating Cython with numpy at http://docs.cython.org/src/tutorial/numpy.html.

EDIT: I did a quick test with your Cython code and it ran ~15 sec for a 500x400 img with a 100x200 template. After some tweaks (eliminating the numpy method calls and numpy bounds checking), I got it down under 3 seconds. That may not be enough for you, but it shows the possibility.

import numpy as np
cimport numpy as np
cimport cython
from libc.math cimport sqrt

DTYPE = np.int
ctypedef np.int_t DTYPE_t

@cython.boundscheck(False)
def match_template(np.ndarray[DTYPE_t, ndim=2] img, np.ndarray[DTYPE_t, ndim=2] template):
    cdef float mindist = float('inf')
    cdef int x_coord = -1
    cdef int y_coord = -1
    cdef float dist
    cdef unsigned int x, y
    cdef int img_width = img.shape[0]
    cdef int img_height = img.shape[1]
    cdef int template_width = template.shape[0]
    cdef int template_height = template.shape[1]
    cdef int range_x = img_width-template_width+1
    cdef int range_y = img_height-template_height+1
    cdef DTYPE_t total
    cdef int delta
    cdef unsigned int j, k, j_plus, k_plus
    for y from 0 <= y < range_y:
        for x from 0 <= x < range_x:
            #dist = np.sqrt(np.sum(np.square(template - img[ x:<unsigned int>(x+template_width), y:<unsigned int>(y+template_height) ]))) #calculate euclidean distance
            # Do the same operations, but in plain C
            total = 0
            for j from 0 <= j < template_width:
                j_plus = <unsigned int>x + j
                for k from 0 <= k < template_height:
                    k_plus = <unsigned int>y + k
                    delta = template[j, k] - img[j_plus, k_plus]
                    total += delta*delta
            dist = sqrt(total)
            if dist < mindist:
                mindist = dist
                x_coord = x
                y_coord = y
    return [mindist, (x_coord,y_coord)]

Aflame answered 4/10, 2012 at 14:41 Comment(5)

When I started this project (not a long time ago), I actually doubted whether I would use python or C. Although I've never heard of cython before, this might just be the perfect mix for me. Although I'm still not sure if this completely solves my problem, I will definitely have a look at it. – Mckee 4/10, 2012 at 15:16

Cython is a separate language from Python, but with a high degree of overlapping syntax (all valid Python is valid Cython). Cython adds extra syntax to specify type so it can compile functions and loops down to native C equivalents. It requires a bit of intuition to identify which pieces of code still live in the Python realm and which are sufficiently typed to be converted to C-only. For dealing with nested for-loops, the speedup will be significant. And it integrates very well with numpy arrays. – Aflame 4/10, 2012 at 15:24

I tried to speed up the code using cython, but ended up with no speed improvement at all. I must be doing something wrong, unless the standard numpy matrix multiplications don't get any quicker in cython (I did use cimport). If anyone knows how to translate the above code correctly into cython I'm willing to give it another shot, otherwise I might need to think about other tricks to improve the speed. – Mckee 4/10, 2012 at 20:17

edited my post and added cython code, probably far from perfect but I guess calculating the dist is the biggest problem. – Mckee 6/10, 2012 at 11:12

When I ran your cython code it took 7 seconds, so it actually takes longer then the original python code. It only gets stranger to me. – Mckee 13/10, 2012 at 16:0

Recommended topics

Hot tags