(This came out a bit long, but it's mostly a straightforward explanation, so bear with me :)
For a project I have I need to recognise objects that, in general form, look like this -
Inside a bigger image that contains different shapes, like this one -
As you can see, the object I'm looking for is a red line with crosses on each side (there are 5 of them in that last picture). I have a bank of around 4,000 images in which I need to find the object; some of them contain these objects and some don't, like this image -
After doing some research, I figured that using Haar cascades with OpenCV is the way to go, so I wrote a script that goes through all 4,000 of the aforementioned images and extracts separate contours, like the first image in this question.
Then I went through the many contours and grabbed around 150 of them (that is, 150 files containing just the object I need, similar to the first image) and around 180 images that do not contain the object (similar to the third picture here).
Then I started the training process, using several tutorials, but mainly this one.
While doing so, I encountered a problem - as you can see, the images of the desired double-crossed object are not the same size and are not even at the same scale (they can appear at any angle - horizontal, diagonal, etc.).
At first I tried using the images with their different dimensions, but that caused errors in the training process. To work around that, I changed all of the positive images' dimensions to 350x350 (the size of the largest object). Just to be clear - I did not resize the images; I just added white space around them so that all of the images are 350x350 pixels.
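The padding step is simple enough to do with NumPy alone - a minimal sketch (the function name and centering choice are mine, not from the original script):

```python
import numpy as np

def pad_to_square(img, size=350, fill=255):
    """Center `img` on a white size x size canvas without resizing it."""
    h, w = img.shape[:2]
    canvas = np.full((size, size) + img.shape[2:], fill, dtype=img.dtype)
    y, x = (size - h) // 2, (size - w) // 2
    canvas[y:y + h, x:x + w] = img
    return canvas

small = np.zeros((120, 200), dtype=np.uint8)  # stand-in for a cropped contour
padded = pad_to_square(small)
print(padded.shape)  # (350, 350)
```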
Then I went through the training process as suggested in the tutorial - I created samples (width 24, height 24) and produced a cascade.xml file, which turned out to be quite small (45 KB).
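The training itself goes through OpenCV's command-line tools. A sketch of the commands I ran, with assumed file names (positives.txt, negatives.txt, cascade_out - adjust to your layout); the key detail is that -w/-h must match between the two tools:

```shell
# Pack the positives listed in positives.txt into a .vec at the detection window size.
opencv_createsamples -info positives.txt -vec samples.vec -num 150 -w 24 -h 24

# Train the cascade. -numPos is set below the total positive count because
# some positives are consumed at each stage; -w/-h must match createsamples.
opencv_traincascade -data cascade_out -vec samples.vec -bg negatives.txt \
    -numPos 130 -numNeg 180 -numStages 12 -w 24 -h 24
```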
Now, I know that 150 positive images and 180 negative ones are not a lot, but I wanted to at least get a proof-of-concept working before I filtered more images and put more time into it.
When the cascade.xml file was done, I tried using it to locate the objects in some images, with cv2.CascadeClassifier('cascade.xml') and cascade.detectMultiScale(img), but every attempt returned zero results.
I even tried to locate the object in one of the positive images (which contained nothing but one of the desired objects), but that too returned zero results.
I tried tweaking the parameters of cascade.detectMultiScale(img), and I'm currently training a cascade with 36x36 samples, but I'm not confident it will work.
Since I'm pretty new to this stuff, I was wondering what I'm doing wrong, so I thought I'd ask here.
More specifically:
- Do you think Haar cascades are the right approach here, or should I use a different object-recognition method?
- Could the positive images' dimensions be the source of the problem? If so, how should I go about fixing it?
If I left out some important details, please let me know and I'll post them.
Thank you very much for your help, Dan