Robust Hand Detection via Computer Vision

Asked 21/12, 2011 at 16:30 Answered 16/12, 2020 at 8:46

Solved python image-processing opencv computer-vision skin

I am currently working on a system for robust hand detection.

The first step is to take a photo of the hand (in HSV color space) with the hand placed in a small rectangle to determine the skin color. I then apply a thresholding filter to set all non-skin pixels to black and all skin pixels to white.
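For illustration, here is a minimal sketch of this calibrate-then-threshold step. It is written in pure Python so it runs without OpenCV; with OpenCV the same idea would roughly map to cv2.cvtColor followed by cv2.inRange. The function names, the margin values, and the tiny synthetic image are assumptions made for the example. It thresholds on H and S only and does not handle hue wrap-around near red.

```python
import colorsys

def to_hsv(rgb):
    """Convert an (r, g, b) pixel with 0-255 channels to (h, s, v), each in 0..1."""
    r, g, b = rgb
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

def calibrate(image, rect, margin=(0.05, 0.15)):
    """Derive (h_lo, h_hi, s_lo, s_hi) from the pixels inside rect = (x, y, w, h).

    The margins are assumed tolerances, not tuned values.
    """
    x, y, w, h = rect
    samples = [to_hsv(image[row][col])
               for row in range(y, y + h) for col in range(x, x + w)]
    hues = [p[0] for p in samples]
    sats = [p[1] for p in samples]
    return (min(hues) - margin[0], max(hues) + margin[0],
            min(sats) - margin[1], max(sats) + margin[1])

def threshold(image, bounds):
    """Return a mask with 255 for pixels inside the H/S bounds and 0 otherwise."""
    h_lo, h_hi, s_lo, s_hi = bounds
    mask = []
    for row in image:
        mask_row = []
        for px in row:
            h, s, _v = to_hsv(px)  # V is ignored here
            mask_row.append(255 if h_lo <= h <= h_hi and s_lo <= s <= s_hi else 0)
        mask.append(mask_row)
    return mask

# Synthetic 4x4 image: a "skin" patch in the middle of a blue background.
SKIN = (224, 172, 105)
BLUE = (30, 60, 200)
image = [[BLUE, BLUE, BLUE, BLUE],
         [BLUE, SKIN, SKIN, BLUE],
         [BLUE, SKIN, SKIN, BLUE],
         [BLUE, BLUE, BLUE, BLUE]]

bounds = calibrate(image, rect=(1, 1, 2, 2))  # the calibration rectangle
mask = threshold(image, bounds)
```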

So far it works quite well, but I wanted to ask whether there is a better way to solve this. For example, I found a few papers mentioning concrete color spaces for Caucasian people, but none with a comparison of Asian/African/Caucasian skin tones.

By the way, I'm working with OpenCV via Python bindings.

Marianmariana answered 21/12, 2011 at 16:30 Comment(0)

Have you taken a look at the camshift paper by Gary Bradski? You can download it from here

I used the skin detection algorithm a year ago to detect skin regions for hand tracking, and it is robust. It depends on how you use it.

The first problem with using color for tracking is that it is not robust to lighting variations or, as you mentioned, to differences in skin tone. However, this can be solved easily with the steps described in the paper:

Convert the image to HSV color space.
Throw away the V channel and consider only the H and S channels, which discounts lighting variations.
Discard pixels with low saturation, because their hue is unstable.
Bin the selected skin region into a 2D histogram (OpenCV's calcHist function). This histogram now acts as a model for skin.
Compute the "backprojection" (i.e. use the histogram to compute the "probability" that each pixel in your image has the color of skin tone) using calcBackProject. Skin regions will have high values.
You can then either use meanShift to look for the mode of the 2D "probability" map generated by the backprojection, or detect blobs of high "probability".
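
The histogram and backprojection steps above can be sketched as follows. This is a pure-Python illustration of what calcHist and calcBackProject do on the H and S channels, not the code from the paper or the OpenCV book. The bin counts, the saturation cutoff, and the sample pixels are all assumptions.

```python
import colorsys

H_BINS, S_BINS = 30, 32   # assumed bin counts, not values from the paper
MIN_SAT = 0.1             # assumed cutoff for "low saturation" pixels

def hs_bin(rgb):
    """Map an (r, g, b) pixel (0-255) to (hue_bin, sat_bin), or None if too grey."""
    h, s, _v = colorsys.rgb_to_hsv(*(c / 255.0 for c in rgb))  # V is discarded
    if s < MIN_SAT:
        return None
    return (min(int(h * H_BINS), H_BINS - 1), min(int(s * S_BINS), S_BINS - 1))

def build_histogram(skin_pixels):
    """Bin sample skin pixels into a 2D H-S histogram scaled so its peak is 1.0."""
    hist = [[0.0] * S_BINS for _ in range(H_BINS)]
    for px in skin_pixels:
        b = hs_bin(px)
        if b is not None:
            hist[b[0]][b[1]] += 1.0
    peak = max(max(row) for row in hist)
    if peak > 0:
        hist = [[v / peak for v in row] for row in hist]
    return hist

def back_project(image, hist):
    """Replace every pixel by the histogram value of its H-S bin (a table lookup)."""
    out = []
    for row in image:
        out_row = []
        for px in row:
            b = hs_bin(px)
            out_row.append(0.0 if b is None else hist[b[0]][b[1]])
        out.append(out_row)
    return out

# Hypothetical skin samples taken from a calibration region.
skin_samples = [(224, 172, 105), (210, 160, 100), (198, 134, 66)]
model = build_histogram(skin_samples)

# 2x2 test image: skin, blue, grey, skin.
image = [[(224, 172, 105), (30, 60, 200)],
         [(128, 128, 128), (198, 134, 66)]]
prob = back_project(image, model)
```

In this sketch the skin-colored pixels get non-zero values in prob, while the blue and grey pixels get 0. The last step (finding the mode of the map) is not shown; in OpenCV that would be a call to meanShift or CamShift on the backprojection.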

Throwing away the V channel in HSV and considering only the H and S channels is (surprisingly) enough to detect different skin tones under varying lighting. A plus is that the computation is fast.

These steps and the corresponding code can be found in the original OpenCV book.

As a side note, I've also used Gaussian Mixture Models (GMM) before. If you are only considering color, then I would say that using histograms or a GMM makes little difference. In fact, the histogram would perform better (if your GMM is not constructed to account for lighting variations etc.). A GMM is good if your sample vectors are more sophisticated (i.e. you consider other features), but speed-wise the histogram is much faster: computing the probability map with a histogram is essentially a table lookup, whereas a GMM requires a matrix computation per pixel (for vectors of dimension > 1, in the formula for the multi-dimensional Gaussian distribution), which can be time-consuming for real-time applications.
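
To make the cost difference concrete, here is a small sketch comparing the two per-pixel evaluations for a 2D (H, S) feature. The component weights, means, and covariances are made-up values for illustration, and the 2x2 matrix inverse is written out by hand to keep the example dependency-free.

```python
import math

def gaussian_2d(x, mean, cov):
    """Density of a 2D Gaussian at x; cov is a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^-1 (x - mean), with the 2x2 inverse expanded.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det))

def gmm_probability(x, components):
    """Weighted sum over (weight, mean, cov) components: one matrix form per component."""
    return sum(w * gaussian_2d(x, m, c) for w, m, c in components)

def histogram_probability(bin_index, hist):
    """The histogram model is a single table lookup per pixel."""
    return hist[bin_index[0]][bin_index[1]]

# Hypothetical two-component skin model over (hue, saturation) in 0..1.
components = [
    (0.6, (0.08, 0.50), ((0.001, 0.0), (0.0, 0.01))),
    (0.4, (0.05, 0.35), ((0.002, 0.0), (0.0, 0.02))),
]
density = gmm_probability((0.08, 0.50), components)
```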

So in conclusion, if you are only trying to detect skin regions using color, then go with the histogram method. You can adapt it to consider local gradients as well (i.e. a histogram of gradients, though possibly not going to the full extent of Dalal and Triggs' human detection algorithm) so that it can differentiate between skin and regions with similar color (e.g. cardboard or wooden furniture) using the local texture information. But that would require more effort.

For sample source code on how to use a histogram for skin detection, you can take a look at OpenCV's page here. But do note that the webpage mentions that they only use the hue channel and that using both hue and saturation would give better results.

For a more sophisticated approach, you can take a look at the work on "Detecting naked people" by Margaret Fleck and David Forsyth. This was one of the earlier works on detecting skin regions that considers both color and texture. The details can be found here.

A great resource for source code related to computer vision and image processing, which happens to include code for visual tracking, can be found here. And no, it's not OpenCV.

Hope this helps.

Reece answered 22/12, 2011 at 1:25 Comment(3)

Thanks for your detailed answer. I don't know if I will implement the method exactly, but it's a great help, as it also explains some details like ignoring the V channel, which I'm currently doing but without really understanding why. – Marianmariana 22/12, 2011 at 11:54

I added a link to a site that has lots of source code for CV and image processing applications, including visual tracking, which I think you may find useful, as skin detection is only one possible approach. It might be worth looking at others. – Reece 22/12, 2011 at 13:11

Updating link for Detecting naked people - mfleck.cs.illinois.edu/naked.html – Cythiacyto 27/1, 2016 at 3:49

Here is a paper on adaptive Gaussian mixture model skin detection that you might find interesting.

Also, I remember reading a paper (unfortunately I can't seem to track it down) that used a very clever technique, but it required that you have the face in the field of view. The basic idea was to detect the person's face and use the skin patch detected from the face to identify the skin color automatically. Then, use a Gaussian mixture model to isolate the skin pixels robustly.

Finally, Google Scholar may be a big help in searching for the state of the art in skin detection. It's heavily researched in academia right now as well as used in industry (e.g., Google Images and Facebook upload picture policies).

Poachy answered 21/12, 2011 at 17:16 Comment(1)

I also thought about the idea of doing face detection first. Unfortunately, I can't reliably presume that there is a face present. – Marianmariana 22/12, 2011 at 11:52

I worked on something similar 2 years ago. You can try a Particle Filter (Condensation), using skin color pixels as input for initialization. It is quite robust and fast. The way I applied it for my project is at this link. You have both a presentation (slides) and the survey. If you initialize the color of the hand with the real color extracted from the hand you are going to track, you shouldn't have any problems with darker skin tones.

I think you can find some sample code implementations of the particle filter. Good luck.
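
As a rough idea of what such a tracker looks like, here is a bare-bones particle filter sketch in pure Python. It is not the implementation from the linked project. The skin_probability function is a hypothetical stand-in for a backprojected skin probability map, the motion model is a plain random walk, and the particles are initialized uniformly rather than from detected skin pixels. All parameter values are assumptions.

```python
import math
import random

def skin_probability(x, y, center=(14.0, 6.0), sigma=3.0):
    """Hypothetical stand-in for a skin probability map: one blob around center."""
    d2 = (x - center[0]) ** 2 + (y - center[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))

def track(steps=10, n_particles=300, size=20.0, noise=1.0, seed=0):
    """Run a few predict / measure / resample rounds and return the (x, y) estimate."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0, size), rng.uniform(0, size))
                 for _ in range(n_particles)]
    estimate = None
    for _ in range(steps):
        # Predict: random-walk motion model, clamped to the image bounds.
        particles = [(min(max(x + rng.gauss(0, noise), 0.0), size),
                      min(max(y + rng.gauss(0, noise), 0.0), size))
                     for x, y in particles]
        # Measure: weight each particle by the skin probability at its position.
        weights = [skin_probability(x, y) for x, y in particles]
        total = sum(weights)
        # Estimate: weighted mean of the particle positions.
        estimate = (sum(w * p[0] for w, p in zip(weights, particles)) / total,
                    sum(w * p[1] for w, p in zip(weights, particles)) / total)
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimate

estimate = track()
```

With the synthetic blob centered at (14, 6), the estimate should end up close to that point. In a real tracker the measurement step would read the probability map computed from each new frame.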

Isothere answered 22/12, 2011 at 8:29 Comment(0)

It will be hard for you to find skin tone based on color only.
First of all, it depends strongly on the automatic white balance algorithm. For example, in this image, any person can see that the color is skin tone. But for the computer it will be blue.

Second, correct color calibration in digital cameras is hard, and it will rarely be accurate enough for your purposes.
See www.DPReview.com to understand what I mean.

In conclusion, I truly believe that the color by itself can be an input, but it is not enough.

Backplate answered 21/12, 2011 at 22:45 Comment(2)

This assumes you cannot control the white-balance. On many cameras, it can be manually controlled. Also, if the skin tone is estimated via face detection that would also work in arbitrary white-balance situations. – Poachy 21/12, 2011 at 23:58

The OP asked about hand detection from HSV color channels only. Also, I can imagine a lot of situations in which you don't have faces in the image. – Backplate 2/10, 2012 at 17:57

Well, my experience with skin modeling is bad, because: 1) lighting can vary, so skin segmentation is not robust; 2) it will also mark your face (and other skin-like objects).

I would use machine learning techniques like Haar training, which, in my opinion, is a far better approach than modeling and fixing some constraints (like skin detection + thresholding...).

Guardianship answered 8/8, 2014 at 22:18 Comment(0)

As a more robust alternative to pixel colour, you can use a hand geometry model. First project the model for a particular gesture, and then cross-correlate it with the source image. Here is a demo of this technique.
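
The cross-correlation step can be sketched like this. It is a pure-Python, zero-mean normalized cross-correlation over a tiny synthetic image, not the code behind the demo. The 3x3 template is a made-up stand-in for a projected hand model. In OpenCV, cv2.matchTemplate with one of the normalized methods does this kind of search.

```python
import math

def ncc_at(image, template, top, left):
    """Zero-mean normalized cross-correlation of template with the window at (top, left)."""
    th, tw = len(template), len(template[0])
    win = [image[top + r][left + c] for r in range(th) for c in range(tw)]
    tpl = [template[r][c] for r in range(th) for c in range(tw)]
    mean_w, mean_t = sum(win) / len(win), sum(tpl) / len(tpl)
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(win, tpl))
    den = math.sqrt(sum((w - mean_w) ** 2 for w in win) *
                    sum((t - mean_t) ** 2 for t in tpl))
    return num / den if den > 0 else 0.0  # flat windows score 0

def best_match(image, template):
    """Slide the template over the image; return (best_score, (top, left))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = ncc_at(image, template, top, left)
            if score > best[0]:
                best = (score, (top, left))
    return best

# Hypothetical binary template standing in for a projected gesture model.
template = [[1, 0, 0],
            [1, 0, 0],
            [1, 1, 1]]

# 6x6 empty image with the shape pasted at row 2, column 3.
image = [[0] * 6 for _ in range(6)]
for r in range(3):
    for c in range(3):
        image[2 + r][3 + c] = template[r][c]

score, position = best_match(image, template)
```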

Eliciaelicit answered 16/12, 2020 at 8:46 Comment(0)
