Optimizing algorithm for segmentation through image subtraction
For a project in OpenCV I would like to segment moving objects as well as possible, with minimal noise of course.

For this I would like to use an image subtraction algorithm. I already have a running program, but so far I haven't found a way to get good enough results.

I already have the following (grayscale) images given:

IplImage* grayScale;        // current frame
IplImage* lastFrame;        // previous frame
IplImage* secondLastFrame;  // two frames back
IplImage* thirdLastFrame;   // three frames back

So far I have tried to subtract the current frame and the last frame with cvSub() or cvAbsDiff() to get the moving parts:

But unfortunately I still get a lot of noise (e.g. due to slightly moving trees when it's windy), and if the moving object is quite big and has a homogeneous color (say a person in a white or black shirt), the subtraction only detects the changes at the left and right edges of the person, not on the body itself, so one object is sometimes detected as two objects...

cvAbsDiff(this->lastFrame, grayScale, output);           // |last - current|
cvThreshold(output, output, 10, 250, CV_THRESH_BINARY);  // binarize the difference
cvErode(output, output, NULL, 2);                        // remove small noise
cvDilate(output, output, NULL, 2);                       // restore blob size

To get rid of this noise I tried eroding and dilating the image with cvErode() and cvDilate(), but this is quite slow, and if the moving objects on the screen are small, the erosion deletes a bit too much of the object, so after dilating I don't always get a good result, and objects often end up split up.

After this I do a cvFindContours() to get the contours, check their size, and if it fits, draw a rectangle around the moving objects. But the results are poor, because an object is often split into several rectangles due to the bad segmentation.
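
Roughly, this step looks like the following in the old C API (a sketch of the idea, not my exact code; minArea and display are placeholder names):

CvMemStorage* storage = cvCreateMemStorage(0);
CvSeq* contour = NULL;
cvFindContours(output, storage, &contour, sizeof(CvContour),
               CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE, cvPoint(0, 0));
for (; contour != NULL; contour = contour->h_next) {
    double area = fabs(cvContourArea(contour, CV_WHOLE_SEQ));
    if (area < minArea)                       // skip small noise blobs
        continue;
    CvRect r = cvBoundingRect(contour, 0);    // box around the blob
    cvRectangle(display, cvPoint(r.x, r.y),
                cvPoint(r.x + r.width, r.y + r.height),
                CV_RGB(255, 0, 0), 2, 8, 0);
}
cvReleaseMemStorage(&storage);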

A friend now told me I might try using more than two consecutive frames for the subtraction, since this might already reduce the noise... but I don't really know what he meant by that, or how I should add/subtract the frames to get an image that is almost noise-free and shows big enough object blobs.

Can anybody help me with that? How can I use more than two frames to get an image with as little noise as possible but with big enough blobs for the moving objects? I would be thankful for any tips...

ADDITIONS:

I have uploaded a current video right here: http://temp.tinytall.de/ Maybe somebody wants to try it there...

This is a frame from it: The left image shows my results from cvFindContours() and the right one is the segmented image on which I then try to find the contours...

[Image: segmentation result]

So for large objects it works fine if they are moving fast enough, e.g. the bicycle... but for walking people it doesn't always produce a good result... Any ideas?

Belovo answered 8/4, 2011 at 22:52 Comment(1)
en.wikipedia.org/wiki/Motion_estimation Hum

Given three adjacent frames A, B, C you can compute two frame differences X and Y. By combining X and Y (through, e.g., thresholding and then a logical AND operation) you can reduce the effect of the noise. An unwanted side effect is that the motion-detected area will be slightly smaller than ideal (the AND operation shrinks the area).
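
A minimal sketch of this in the old C API the question uses (frameA, frameB, frameC are the three grayscale frames; diffX, diffY and result are same-sized 8-bit scratch images, and the threshold value is just an assumption to tune):

cvAbsDiff(frameB, frameA, diffX);                      // X = |B - A|
cvAbsDiff(frameC, frameB, diffY);                      // Y = |C - B|
cvThreshold(diffX, diffX, 10, 255, CV_THRESH_BINARY);
cvThreshold(diffY, diffY, 10, 255, CV_THRESH_BINARY);
cvAnd(diffX, diffY, result, NULL);                     // keep pixels changed in both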

Since image sequence motion estimation has been well researched for decades, you may want to read about more sophisticated methods of motion detection, e.g. working with motion vector fields. Google Scholar is your friend in that case.

Sandbox answered 9/4, 2011 at 0:10 Comment(3)
Thanks so far! I've now tried subtracting the current frame from the last, the current from the second last, and the current from the third last, then applying a binary threshold to each and combining them with cvAnd(). This works quite well, but the results still aren't very good yet. There isn't much noise detected, but the objects themselves aren't always detected as one single blob yet (still split up). Does anybody have another idea for how I could improve the results? (I've added my current code to your answer)Belovo
As you've managed to eliminate most of the noise, you could try this (it worked for me on static images): do a morphological closing operation with a larger custom kernel sized to the approximate area of the blob size you expect. This should connect nearby lines in your segmented image (see the sketch below).Eleven
Thanks... I've already tried using those operations to get rid of noise, but cvErode() and cvDilate() are quite slow, so I'd rather not use them too much...Belovo
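
For what it's worth, the closing suggested in the comments above would look something like this in the old C API (the 15x15 elliptical kernel is an assumed size to tune to the expected blob size):

IplConvKernel* kernel = cvCreateStructuringElementEx(
    15, 15, 7, 7, CV_SHAPE_ELLIPSE, NULL);
IplImage* temp = cvCloneImage(output);    // scratch buffer for cvMorphologyEx
// Closing = dilation followed by erosion: bridges small gaps between
// fragments of the same object without growing the blobs much.
cvMorphologyEx(output, output, temp, kernel, CV_MOP_CLOSE, 1);
cvReleaseImage(&temp);
cvReleaseStructuringElement(&kernel);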

It seems like you have a fixed background. One possible solution is to let the computer learn the background, e.g. by averaging frames over time. Then calculate the difference between the average image and the current one. Differences are likely to originate from moving objects.
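
A sketch of that idea using the accumulation functions in the old C API (background is a 32-bit float accumulator image, bgU8 and diff are 8-bit scratch images; alpha and the threshold are assumed values to tune):

// Blend the current frame into the running background model.
cvRunningAvg(grayScale, background, 0.05, NULL);       // alpha = 0.05
// Convert the float model back to 8 bit and diff against the current frame.
cvConvertScale(background, bgU8, 1.0, 0);
cvAbsDiff(grayScale, bgU8, diff);
cvThreshold(diff, diff, 15, 255, CV_THRESH_BINARY);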

Blanding answered 11/4, 2011 at 18:44 Comment(2)
That sounds like a good idea... it could in fact speed up the whole detection process a bit, since I don't need to calculate so many image differences anymore... only update the average background image once in a while... For this I could maybe write something that takes every 10th frame and calculates a new average out of a few of those... I'll try and check back... thanks so far...Belovo
This works fine... I thought of doing something like a steadily updated average image where the current frame is always added on top of the others with little influence... I found that cvRunningAvg() already does this and it works very well... segmentation results are much better. It still has some drawbacks though: people who stand still for a few seconds are "burned" into the average image and are still detected at their old position by the threshold once they move again, since the average image takes some time to "get rid of them". Any idea to improve that?Belovo

Well, it is a very dicey subject. Motion estimation is quite complex, so try to find good literature and avoid inventing your own algorithms :)

My suggestions are:

Search for bundling images for motion estimation. Bundling means using many images to reduce noise and error rate.

Additionally, if you want to be robust, look into what is known as a Kalman filter. If you're tracking objects, you don't want them to make "infinite speed jumps" between your frames (which is usually noise or misses). This is one C++ lib I strongly suggest: Kalman filter (a minimal sketch follows this list).

Finally, MonoSLAM; I'm pushing it a bit :) Andrew Davison: Research
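
To make the Kalman suggestion concrete, here is a minimal constant-velocity sketch using OpenCV's own old C API rather than the linked lib (blobX and blobY stand for a measured blob centroid; the noise covariances are placeholder values to tune):

// State is (x, y, vx, vy), measurement is (x, y); dt = 1 frame.
CvKalman* kalman = cvCreateKalman(4, 2, 0);
const float A[] = { 1, 0, 1, 0,
                    0, 1, 0, 1,
                    0, 0, 1, 0,
                    0, 0, 0, 1 };                       // x += vx, y += vy
memcpy(kalman->transition_matrix->data.fl, A, sizeof(A));
cvSetIdentity(kalman->measurement_matrix,    cvRealScalar(1));
cvSetIdentity(kalman->process_noise_cov,     cvRealScalar(1e-4)); // tune
cvSetIdentity(kalman->measurement_noise_cov, cvRealScalar(1e-1)); // tune
cvSetIdentity(kalman->error_cov_post,        cvRealScalar(1));

// Per frame: predict first, then correct with the measured centroid.
const CvMat* prediction = cvKalmanPredict(kalman, NULL);
float z[2] = { blobX, blobY };                // centroid from the contour step
CvMat measurement = cvMat(2, 1, CV_32FC1, z);
cvKalmanCorrect(kalman, &measurement);
// prediction->data.fl[0] / [1] hold the smoothed position estimate.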

Tenia answered 9/4, 2011 at 15:44 Comment(4)
Well actually my goal is to implement a particle filter for multiple objects tracking... But in order to do that I need at least well enough segmentation results to get the position of the object for the update-phase of my tracking algorithm... I've now tried to do it with substracting multiple frames... (see comment on misha's answer)Belovo
When you talk about noise, what kind of noise are you referring to? Different types of noise require different types of filtering. If it's camera noise (pretty much white noise) you could cvSmooth() your images right after capturing them. Could you post some images? It would help to be less generic. It's computer VISION after all ;)Tenia
You should really check Andrew Davison's research. You need to track moving people, so you need to update your tracked models, which can be hard. Once you detect a moving "blob" you need to create a simple model of your object and track it across the frames. Look for "good features to track". One easy way to do this is by working with SAD in a search window that has velocity and direction. So you will need a Kalman filter :)Tenia
I'll have a look at that... actually I didn't want to create a model based segmentation but maybe this would work out... thanks so far...Belovo
