How do I find Wally with Python?

Shamelessly jumping on the bandwagon :-)

Inspired by How do I find Waldo with Mathematica and the followup How to find Waldo with R, as a new Python user I'd love to see how this could be done. It seems that Python would be better suited to this than R, and we don't have to worry about licenses as we would with Mathematica or MATLAB.

In an example like the one below, simply using stripes obviously wouldn't work. It would be interesting if a simple rule-based approach could be made to work for difficult examples such as this.

At the beach

I've added the [machine-learning] tag as I believe the correct answer will have to use ML techniques, such as the Restricted Boltzmann Machine (RBM) approach advocated by Gregory Klopper in the original thread. There is some RBM code available in python which might be a good place to start, but obviously training data is needed for that approach.
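
Purely as an illustration of that direction (not a full solution), here is a minimal sketch of an RBM used as a feature extractor in front of a plain classifier, via scikit-learn's BernoulliRBM; the arrays X and y are placeholders for the labelled Wally/not-Wally patches that such an approach would need:

import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Placeholder training data: flattened patches scaled to [0, 1],
# with y[i] == 1 where the patch contains Wally.
X = np.random.rand(1000, 24 * 16)
y = np.random.randint(0, 2, 1000)

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=100, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)

# At search time you would score every candidate patch of a new image:
# probabilities = model.predict_proba(candidate_patches)[:, 1]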

At the 2009 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2009) they ran a Data Analysis Competition: Where's Wally?. Training data is provided in MATLAB format. Note that the links on that website are dead, but the data (along with the source of an approach taken by Sean McLoone and colleagues) can be found here (see the SCM link). It seems like one place to start.
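
If anyone picks that dataset up, the .mat files can be read directly from Python with SciPy; a minimal sketch (the file name is a placeholder, and the variable names inside the archive would need to be checked):

import scipy.io

data = scipy.io.loadmat("wally_training_data.mat")  # placeholder file name

# Inspect what the archive actually provides before assuming a structure.
print([key for key in data.keys() if not key.startswith("__")])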

Josephinajosephine answered 13/1, 2012 at 11:28 Comment(6)
Sorry... is there any program that successfully finds Waldo in this photo? There don't seem to be any distinguishing features of the 'real' Waldo. I used to have that same Waldo book, and from what I remember there was some natural-language clue that had to be used, maybe that the real Waldo wasn't holding a cane or something. I don't see how you could programmatically find Waldo without first being able to have your program understand the natural-language clue.Sparoid
Yes you're right, sorry ... although that would be an interesting task too! I switched the image to the old "At the beach" one which also has stripes... (there's another reason for choosing this image too!)Josephinajosephine
While this question is interesting, what you are asking for is unclear. Is it an implementation of a solution? A hint at which ML library for Python to use for this?Frulla
@Simon a complete implementation would probably be a bit much to ask, but a skeleton of an answer (i.e. some functions missing definitions) would be great. I'm not even sure how I would load the image in (although I have seen this: https://mcmap.net/q/246603/-image-processing-in-python-closed)Josephinajosephine
github.com/jacobsevart/waldo_uchicagoAlumnus
@J.F.Sebastian cool ... post an answer?Josephinajosephine

Here's an implementation with mahotas

from pylab import imshow
import numpy as np
import mahotas
wally = mahotas.imread('DepartmentStore.jpg')

wfloat = wally.astype(float)
r,g,b = wfloat.transpose((2,0,1))

Split into red, green, and blue channels. It's better to use floating point arithmetic below, so we convert at the top.

w = wfloat.mean(2)

w is the "white" channel: the mean of the red, green, and blue channels.

pattern = np.ones((24,16), float)
for i in range(2):
    pattern[i::4] = -1

Build up a pattern of +1, +1, -1, -1 on the vertical axis. This is Wally's shirt.

v = mahotas.convolve(r-w, pattern)

Convolve with red minus white. This will give a strong response where the shirt is.

mask = (v == v.max())
mask = mahotas.dilate(mask, np.ones((48,24)))

Look for the maximum value and dilate it to make it visible. Now, we tone down the whole image, except the region of interest:

# subtract in place; casting='unsafe' keeps the uint8 image type
# (a plain `wally -= ...` raises a casting error on current NumPy)
np.subtract(wally, .8*wally * ~mask[:,:,None], out=wally, casting='unsafe')
imshow(wally)

And we get Waldo!

Contraposition answered 7/11, 2012 at 14:11 Comment(2)
I tried the beach image and it did not work very well :( Wally was in the top 6 or 7 hits, but it was not the best match. The processing did help, because I couldn't find him on my own (with my eyes), whereas once I only had a bunch of small regions, it was easy.Contraposition
Have you got the full source code for this? I'm getting 'np is not defined'.Drizzle

You could try template matching, noting which location produced the highest resemblance, and then use machine learning to narrow it down further. That is also very difficult, and with the accuracy of template matching, it may just return every face or face-like image. I am thinking you will need more than just machine learning if you hope to do this consistently.
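
A minimal sketch of the template-matching stage with OpenCV (the file names are placeholders, and the template would be a crop of Wally's head or shirt):

import cv2

scene = cv2.imread("beach.jpg")            # placeholder file names
template = cv2.imread("wally_crop.png")

# Normalized correlation coefficient; the brightest point of `result` is the best match.
result = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

h, w = template.shape[:2]
cv2.rectangle(scene, max_loc, (max_loc[0] + w, max_loc[1] + h), (0, 0, 255), 2)
print("best score:", max_val)

As noted above, on a crowded scene this will usually return many stripe- or face-like hits, which is where a learned re-ranking step would come in.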

Twila answered 13/1, 2012 at 13:45 Comment(0)

Maybe you should start by breaking the problem into two smaller ones:

  1. create an algorithm that separates people from the background.
  2. train a neural network classifier with as many positive and negative examples as possible.

Those are still two very big problems to tackle...

BTW, I would choose C++ and OpenCV; they seem much better suited for this.
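
A rough Python/OpenCV skeleton of that two-step idea (the edge-density heuristic here is only a stand-in for a real person/background separation, and the classifier is assumed to be trained elsewhere):

import cv2
import numpy as np

def candidate_people_regions(image_bgr, win=48):
    """Step 1 (very rough): keep windows with enough edge structure,
    as a stand-in for a real person/background segmentation."""
    edges = cv2.Canny(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY), 100, 200)
    h, w = edges.shape
    boxes = []
    for y in range(0, h - win, win // 2):
        for x in range(0, w - win, win // 2):
            if edges[y:y + win, x:x + win].mean() > 20:
                boxes.append((x, y, win, win))
    return boxes

def score_region(classifier, image_bgr, box):
    """Step 2: a trained classifier (training not shown) scores each candidate."""
    x, y, bw, bh = box
    patch = cv2.resize(image_bgr[y:y + bh, x:x + bw], (24, 24))
    features = patch.astype(np.float32).ravel() / 255.0
    return classifier.predict_proba([features])[0, 1]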

Rhineland answered 16/1, 2012 at 9:12 Comment(1)
If you would use C++ and OpenCV, then a solution in Python is just as possible. OpenCV can be used from Python.Palaver

Here's a solution using neural networks that works nicely.

The neural network is trained on several solved examples that are marked with bounding boxes indicating where Wally appears in the picture. The goal of the network is to minimize the error between the predicted box and the actual box from training/validation data.
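
As a toy illustration of that objective, the "error" is commonly measured as intersection over union (IoU) between the two boxes, or as a regression loss on their coordinates:

def iou(box_a, box_b):
    # Boxes are (xmin, ymin, xmax, ymax); training pushes IoU towards 1.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

print(iou((10, 10, 50, 90), (12, 8, 55, 88)))  # ~0.81 for two well-aligned boxes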

The network above uses the TensorFlow Object Detection API to perform training and prediction.
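
A minimal inference sketch for a detector exported with that API (the model directory and image file are placeholders; the SavedModel directory comes from the API's export script):

import numpy as np
import tensorflow as tf
from PIL import Image

detect_fn = tf.saved_model.load("exported_wally_model/saved_model")  # placeholder path

image = np.array(Image.open("wheres_wally.jpg"))                     # placeholder image
input_tensor = tf.convert_to_tensor(image)[tf.newaxis, ...]

detections = detect_fn(input_tensor)
scores = detections["detection_scores"][0].numpy()
boxes = detections["detection_boxes"][0].numpy()   # normalized [ymin, xmin, ymax, xmax]
print("most likely Wally location:", boxes[np.argmax(scores)])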

Templeton answered 12/12, 2017 at 17:17 Comment(0)

This is not impossible, but very difficult, because you really have no example of a successful match. There are often multiple states (in this case, more examples of Where's Wally drawings); you can then feed multiple pictures into an image recognition program, treat it as a hidden Markov model, and use something like the Viterbi algorithm for inference (http://en.wikipedia.org/wiki/Viterbi_algorithm).

That's the way I would approach it, assuming you have multiple images you can give it as examples of the correct answer so it can learn. If you only have one picture, then I'm sorry, there may be another approach you need to take.
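
For reference, the Viterbi algorithm linked above is itself quite short; a generic sketch over discrete states (nothing Wally-specific, just the dynamic program the link describes):

def viterbi(observations, states, start_p, trans_p, emit_p):
    # start_p[s], trans_p[s][s2] and emit_p[s][o] are probabilities.
    V = [{s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}]
    for obs in observations[1:]:
        V.append({})
        for s in states:
            prob, prev = max((V[-2][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                             for p in states)
            V[-1][s] = (prob, V[-2][prev][1] + [s])
    # Return the most probable state sequence.
    return max(V[-1].values(), key=lambda t: t[0])[1]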

Corruption answered 17/1, 2012 at 21:25 Comment(0)

I recognized that there are two main features which are almost always visible:

  1. the red-white striped shirt
  2. dark brown hair under the fancy cap

So I would do it the following way (a rough sketch of the color-filtering stage follows the steps below):

search for striped shirts:

  • filter out red and white color (with thresholds on the HSV converted image). That gives you two mask images.
  • add them together -> that's the main mask for searching striped shirts.
  • create a new image with all the filtered out red converted to pure red (#FF0000) and all the filtered out white converted to pure white (#FFFFFF).
  • now correlate this pure red-white image with a stripe pattern image (I think all the Wallys have fairly regular horizontal stripes, so rotation of the pattern shouldn't be necessary). Do the correlation only inside the above-mentioned main mask.
  • try to group together clusters which could have resulted from one shirt.

If there is more than one 'shirt', that is, more than one cluster of positive correlation, search for other features, like the dark brown hair:

search for brown hair

  • filter out the specific brown hair color using the HSV converted image and some thresholds.
  • search for a certain area in this masked image - not too big and not too small.
  • now search for a 'hair area' that is just above a previously detected striped shirt and within a certain distance of the center of the shirt.
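
A rough OpenCV sketch of the color-filtering stage described above (the HSV thresholds are guesses and would need tuning):

import cv2
import numpy as np

img = cv2.imread("wheres_wally.jpg")                 # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Red wraps around hue 0 in OpenCV's 0-180 hue range, so it needs two ranges.
red_mask = cv2.bitwise_or(cv2.inRange(hsv, (0, 120, 70), (10, 255, 255)),
                          cv2.inRange(hsv, (170, 120, 70), (180, 255, 255)))

# White: low saturation, high value.
white_mask = cv2.inRange(hsv, (0, 0, 200), (180, 40, 255))

# Main mask for the striped-shirt search: union of both.
main_mask = cv2.bitwise_or(red_mask, white_mask)

# Pure red/white image to correlate against a horizontal-stripe pattern.
pure = np.zeros_like(img)
pure[red_mask > 0] = (0, 0, 255)        # pure red in BGR
pure[white_mask > 0] = (255, 255, 255)  # pure white
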
Discovert answered 7/11, 2012 at 13:5 Comment(0)

Hi, if you need completely working source code, paste this:

import numpy as np
from pylab import imshow, show
import mahotas
import mahotas.demos

# Load the Wally demo image that ships with mahotas.
wally = mahotas.demos.load('Wally')
wfloat = wally.astype(float)

# Split into red, green, and blue channels; w is the "white" (mean) channel.
r,g,b = wfloat.transpose((2,0,1))
w = wfloat.mean(2)

# Vertical +1/+1/-1/-1 pattern matching the striped shirt.
pattern = np.ones((24,16), float)
for i in range(2):
    pattern[i::4] = -1

# Convolve red-minus-white with the pattern; the shirt gives a strong response.
v = mahotas.convolve(r-w, pattern)

# Keep the maximum response and dilate it so it is visible.
mask = (v == v.max())
mask = mahotas.dilate(mask, np.ones((48,24)))

# Tone down everything outside the region of interest and show the result.
np.subtract(wally, .8*wally * ~mask[:,:,None], out=wally, casting='unsafe')
imshow(wally)
show()

I hope this helps.

Chalutz answered 20/5, 2023 at 21:45 Comment(0)
