Recognizing image within image in C#

I'd like to find an image (needle) within an image (haystack).

To keep things simple I take two screenshots of my desktop. One full size (haystack) and a tiny one (needle). I then loop through the haystack image and try to find the needle image.

1. capture needle and haystack screenshot
2. loop through haystack, looking out for haystack[i] == first pixel of needle
3. [if 2. is true:] loop through the 2nd to last pixel of needle and compare it to haystack[i]

Expected result: the needle image is found at the correct location.

I already got it working for some coordinates/widths/heights (A).

But sometimes bits seem to be "off" and therefore no match is found (B).

What could I be doing wrong? Any suggestions are welcome. Thanks.

var needle_height = 25;
var needle_width = 25;
var haystack_height = 400;
var haystack_width = 500;

A. example input - match

var needle = screenshot(5, 3, needle_width, needle_height); 
var haystack = screenshot(0, 0, haystack_width, haystack_height);
var result = findmatch(haystack, needle);

B. example input - NO match

var needle = screenshot(5, 5, needle_width, needle_height); 
var haystack = screenshot(0, 0, haystack_width, haystack_height);
var result = findmatch(haystack, needle);

1. capture needle and haystack image

private int[] screenshot(int x, int y, int width, int height)
{
    Bitmap bmp = new Bitmap(width, height, PixelFormat.Format32bppArgb);
    Graphics.FromImage(bmp).CopyFromScreen(x, y, 0, 0, bmp.Size);

    var bmd = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height),
        ImageLockMode.ReadOnly, bmp.PixelFormat);
    var ptr = bmd.Scan0;

    var bytes = bmd.Stride * bmp.Height / 4;
    var result = new int[bytes];

    Marshal.Copy(ptr, result, 0, bytes);
    bmp.UnlockBits(bmd);

    return result;
}

2. try to find a match

public Point findmatch(int[] haystack, int[] needle)
{
    var firstpixel = needle[0];

    for (int i = 0; i < haystack.Length; i++)
    {
        if (haystack[i] == firstpixel)
        {
            var y = i / haystack_height;
            var x = i % haystack_width;

            var matched = checkmatch(haystack, needle, x, y);
            if (matched)
                return (new Point(x, y));
        }
    }
    return new Point();
}

3. verify full match

public bool checkmatch(int[] haystack, int[] needle, int startx, int starty)
{
    for (int y = starty; y < starty + needle_height; y++)
    {
        for (int x = startx; x < startx + needle_width; x++)
        {
            int haystack_index = y * haystack_width + x;
            int needle_index = (y - starty) * needle_width + x - startx;
            if (haystack[haystack_index] != needle[needle_index])
                return false;
        }
    }
    return true;
}

Palikar answered 11/10, 2011 at 17:7 Comment(0)

First, there is a problem with the findmatch loop. Rather than walking the haystack as a flat array, iterate over x and y, and stop the search needle_width short of the right edge and needle_height short of the bottom edge, so that the needle always fits inside the haystack:

public Point? findmatch(int[] haystack, int[] needle)
{
    var firstpixel = needle[0];

    // "<=" so that a needle flush with the right or bottom edge is still checked
    for (int y = 0; y <= haystack_height - needle_height; y++)
        for (int x = 0; x <= haystack_width - needle_width; x++)
        {
            if (haystack[y * haystack_width + x] == firstpixel)
            {
                var matched = checkmatch(haystack, needle, x, y);
                if (matched)
                    return (new Point(x, y));
            }
        }

    return null;
}
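
As a side note on why the nested loops help: the original flat loop computed the row as i / haystack_height, but in a row-major pixel array both coordinates are derived from the row width. The following is a minimal sketch of that conversion; PixelIndex, ToCoordinates and ToIndex are hypothetical names used for illustration and are not part of the original code.

```csharp
using System;

public static class PixelIndex
{
    // Converts a flat, row-major index into (x, y). Both values depend on
    // the row width; the image height plays no part in the conversion.
    public static (int X, int Y) ToCoordinates(int index, int width)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));
        return (index % width, index / width);
    }

    // Inverse mapping: the same formula checkmatch uses for haystack_index.
    public static int ToIndex(int x, int y, int width)
    {
        return y * width + x;
    }
}
```

With the 500 x 400 haystack from the question, the pixel at (5, 3) has index 1505, and 1505 / 400 still truncates to 3, so dividing by the height happens to give the right row. The pixel at (5, 5) has index 2505, and 2505 / 400 gives 6 instead of 5. That would be consistent with example A matching and example B failing.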

That should probably solve the problem. Also, keep in mind that there might be multiple matches. For example, if "needle" is a completely white rectangle portion of a window, there will most likely be many matches in the entire screen. If this is a possibility, modify your findmatch method to continue searching for results after the first one is found:

public IEnumerable<Point> FindMatches(int[] haystack, int[] needle)
{
    var firstpixel = needle[0];
    // "<=" so that a needle flush with the right or bottom edge is still checked
    for (int y = 0; y <= haystack_height - needle_height; y++)
        for (int x = 0; x <= haystack_width - needle_width; x++)
        {
            if (haystack[y * haystack_width + x] == firstpixel)
            {
                if (checkmatch(haystack, needle, x, y))
                    yield return (new Point(x, y));
            }
        }
}
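
To try the search without taking screenshots, the same idea can be written so that the image sizes are passed in as parameters rather than read from fields. This is only a sketch under that assumption: ImageSearch, FindAll and Matches are hypothetical names, and both images are assumed to be row-major arrays with one int per pixel and no row padding.

```csharp
using System;
using System.Collections.Generic;

public static class ImageSearch
{
    // Lazily yields every top-left position at which 'needle' occurs in 'haystack'.
    public static IEnumerable<(int X, int Y)> FindAll(
        int[] haystack, int haystackWidth, int haystackHeight,
        int[] needle, int needleWidth, int needleHeight)
    {
        // "<=" so that a needle touching the right or bottom edge is found.
        for (int y = 0; y <= haystackHeight - needleHeight; y++)
            for (int x = 0; x <= haystackWidth - needleWidth; x++)
                if (Matches(haystack, haystackWidth, needle, needleWidth, needleHeight, x, y))
                    yield return (x, y);
    }

    // Compares the needle pixel by pixel against the haystack window at (startX, startY).
    private static bool Matches(
        int[] haystack, int haystackWidth,
        int[] needle, int needleWidth, int needleHeight,
        int startX, int startY)
    {
        for (int ny = 0; ny < needleHeight; ny++)
            for (int nx = 0; nx < needleWidth; nx++)
                if (haystack[(startY + ny) * haystackWidth + startX + nx]
                    != needle[ny * needleWidth + nx])
                    return false;
        return true;
    }
}
```

Because the method is an iterator, a caller that only wants the first hit can stop enumerating early (for example with a foreach and a break), and the rest of the haystack is never scanned.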

Next, get into the habit of disposing every object that implements IDisposable and that you created yourself. Bitmap and Graphics are such objects, so your screenshot method should wrap them in using statements:

private int[] screenshot(int x, int y, int width, int height)
{
    // dispose 'bmp' after use
    using (var bmp = new Bitmap(width, height, PixelFormat.Format32bppArgb))
    {
        // dispose 'g' after use
        using (var g = Graphics.FromImage(bmp))
        {
            g.CopyFromScreen(x, y, 0, 0, bmp.Size);

            var bmd = bmp.LockBits(
                new Rectangle(0, 0, bmp.Width, bmp.Height),
                ImageLockMode.ReadOnly,
                bmp.PixelFormat);

            var ptr = bmd.Scan0;

            // as David pointed out, "bytes" might be
            // a bit misleading name for a length of
            // a 32-bit int array (so I've changed it to "len")

            var len = bmd.Stride * bmp.Height / 4;
            var result = new int[len];
            Marshal.Copy(ptr, result, 0, len);

            bmp.UnlockBits(bmd);

            return result;
        }
    }
}

The rest of the code seems ok, with the remark that it won't be very efficient for certain inputs. For example, you might have a large solid color as your desktop's background, which might result in many checkmatch calls.

If performance is of interest to you, you might want to check different ways to speed up the search (something like a modified Rabin-Karp comes to mind, but I am sure there are some existing algorithms which ensure that invalid candidates are skipped immediately).
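
One way the Rabin-Karp idea could be adapted is to hash only the needle's first row and slide a window of the same width along each haystack row, so that the full comparison runs only where the hashes agree. The sketch below is an illustration under assumptions, not a tuned implementation: RowHashSearch and CandidateColumns are hypothetical names, the multiplier is an arbitrary odd constant, and the hash is a plain polynomial modulo 2^32, so collisions are possible and every candidate still has to be confirmed with checkmatch.

```csharp
using System;
using System.Collections.Generic;

public static class RowHashSearch
{
    // Returns the x positions in one haystack row whose window of 'needleWidth'
    // pixels has the same rolling hash as the first row of the needle.
    public static List<int> CandidateColumns(
        int[] haystack, int haystackWidth, int row,
        int[] needle, int needleWidth)
    {
        const uint Multiplier = 16777619; // arbitrary odd constant (assumption)
        var candidates = new List<int>();
        if (needleWidth <= 0 || needleWidth > haystackWidth) return candidates;

        unchecked // the hash relies on wrap-around arithmetic modulo 2^32
        {
            uint needleHash = 0, windowHash = 0, highPower = 1;
            int rowStart = row * haystackWidth;

            // Hash the needle's first row and the first window of the haystack row.
            for (int i = 0; i < needleWidth; i++)
            {
                needleHash = needleHash * Multiplier + (uint)needle[i];
                windowHash = windowHash * Multiplier + (uint)haystack[rowStart + i];
                if (i > 0) highPower *= Multiplier; // Multiplier^(needleWidth - 1)
            }

            for (int x = 0; ; x++)
            {
                if (windowHash == needleHash) candidates.Add(x);
                if (x + needleWidth >= haystackWidth) break;

                // Slide the window: drop the leftmost pixel, append the next one.
                uint outgoing = (uint)haystack[rowStart + x];
                uint incoming = (uint)haystack[rowStart + x + needleWidth];
                windowHash = (windowHash - outgoing * highPower) * Multiplier + incoming;
            }
        }
        return candidates;
    }
}
```

Each window costs a constant amount of work regardless of the needle width, which helps most when the first pixel alone is a poor filter, such as on a solid-colour background.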

Ocasio answered 11/10, 2011 at 19:6 Comment(1)

That did the trick. You, sir, are a true genius. Thank you very much for your help and your effort on how to improve my coding. – Palikar 11/10, 2011 at 19:34

Instead of taking two screenshots of your desktop with a time interval between them, I would take a single screenshot and cut both "needle" and "haystack" from that same source bitmap. Otherwise you risk the desktop contents changing between the two captures.

EDIT: If the problem still occurs after that, I would save the image to a file and run the search against that file in your debugger, which gives you a reproducible test case.
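
Cutting the needle out of an already captured haystack can be done directly on the pixel array, without a second screen capture. A minimal sketch, assuming the haystack is a row-major int[] with no row padding; PixelCrop and Crop are hypothetical names for illustration.

```csharp
using System;

public static class PixelCrop
{
    // Copies the width x height rectangle whose top-left corner is (x, y)
    // out of 'source' into a new row-major array.
    public static int[] Crop(int[] source, int sourceWidth, int x, int y, int width, int height)
    {
        if (sourceWidth <= 0 || source.Length % sourceWidth != 0)
            throw new ArgumentException("source length must be a multiple of sourceWidth");

        int sourceHeight = source.Length / sourceWidth;
        if (x < 0 || y < 0 || width <= 0 || height <= 0
            || x + width > sourceWidth || y + height > sourceHeight)
            throw new ArgumentOutOfRangeException(nameof(x), "rectangle must lie inside the source image");

        var result = new int[width * height];
        for (int row = 0; row < height; row++)
        {
            // Copy one row of the rectangle at a time.
            Array.Copy(source, (y + row) * sourceWidth + x, result, row * width, width);
        }
        return result;
    }
}
```

A needle produced this way is guaranteed to occur in the haystack at (x, y), so a search that fails to find it there points at the search code rather than at the capture.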

Standridge answered 11/10, 2011 at 17:18 Comment(3)

Seems a little silly to say for him to cut out the object from the image which he is attempting to locate. Chicken/Egg? – Amoy 11/10, 2011 at 17:28

@Druegor: I suspect the screenshot example is just a test case for the OP where he expects a match to be found. – Standridge 11/10, 2011 at 17:32

Hello, Doc Brown. That's a valid suggestion. I thought of that, too, and can assure you that nothing changes this area of the screen. I will try nevertheless. – Palikar 11/10, 2011 at 17:35

I don't think your equations for haystack_index or needle_index are correct. It looks like you take the Scan0 offset into account when you copy the bitmap data, but you need to use the bitmap's Stride when calculating the byte position.

Also, the Format32bppArgb format uses 4 bytes per pixel. It looks like you are assuming 1 byte per pixel.

Here's the site I used to help with those equations: https://web.archive.org/web/20141229164101/http://bobpowell.net/lockingbits.aspx

Format32BppArgb: Given X and Y coordinates, the address of the first element in the pixel is Scan0+(y * stride)+(x*4). This Points to the blue byte. The following three bytes contain the green, red and alpha bytes.

Bikales answered 11/10, 2011 at 18:4 Comment(1)

I was also confused by the fact that the variable is named bytes, and there is a division by 4. But bytes is actually the length of an int array. So the algorithm compares full RGBA pixels, not individual components. – Ocasio 11/10, 2011 at 19:10
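
To make the relationship between the two views concrete, here is a small sketch of the arithmetic. It assumes rows are padded to a 4-byte boundary, which is the usual layout for these bitmaps, and a positive stride. StrideMath and its methods are hypothetical helpers, not part of System.Drawing.

```csharp
using System;

public static class StrideMath
{
    // Bytes per row when each row is padded up to a 4-byte boundary (assumption).
    public static int PaddedStride(int width, int bitsPerPixel)
    {
        return ((width * bitsPerPixel + 31) / 32) * 4;
    }

    // Byte offset of pixel (x, y) from the start of the buffer (Scan0).
    public static int ByteOffset(int x, int y, int stride, int bytesPerPixel)
    {
        return y * stride + x * bytesPerPixel;
    }

    // Index of the same pixel in an int[] copied from a 32-bit-per-pixel buffer.
    public static int IntIndex(int x, int y, int stride)
    {
        return y * (stride / 4) + x;
    }
}
```

For a 32-bit format the row size is already a multiple of four, so PaddedStride(500, 32) is 2000 and stride / 4 equals the width. That is why indexing the int array with y * width + x works here. With a 24-bit format the padding would matter: a 25-pixel row takes 75 bytes but has a stride of 76.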

Here is the class reference, with example code, that works well in my C# application (as of 2018) for finding a needle in a haystack in each frame from a USB camera. I believe Accord is mostly a set of C# wrappers around fast C++ code.

Also check out the C# wrapper for Microsoft C++ DirectShow that I use to search for needle within each frame from a USB camera

Nad answered 11/4, 2018 at 16:43 Comment(0)
