Computer Vision: Masking a human hand

Asked 7/2, 2013 at 13:14 Answered 8/2, 2013 at 10:49

I'd like to detect my hand in a live video stream and create a mask of it. However, the result is quite poor, as you can see from the picture.

My goal is to track the hand's movement. I convert the video stream from BGR to the HSV color space, threshold the image to isolate the color of my hand, and then find the contours of my hand. The final result isn't what I wanted to achieve, though.

How could I improve the end result?

import cv2
import numpy as np

cam = cv2.VideoCapture(1)
cam.set(3,640)
cam.set(4,480)
ret, image = cam.read()

skin_min = np.array([0, 40, 150],np.uint8)
skin_max = np.array([20, 150, 255],np.uint8)    
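while True:
    ret, image = cam.read()
    if not ret:
        # stop if the camera did not deliver a frame
        break

    gaussian_blur = cv2.GaussianBlur(image,(5,5),0)
    blur_hsv = cv2.cvtColor(gaussian_blur, cv2.COLOR_BGR2HSV)

    # threshold using min and max values
    tre_green = cv2.inRange(blur_hsv, skin_min, skin_max)
    # get the contours of the thresholded region; work on a copy because older
    # OpenCV versions modify the input, and take [-2:] because findContours
    # returns either 2 or 3 values depending on the version
    contours, hierarchy = cv2.findContours(tre_green.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]

    # draw contours
    cv2.drawContours(image,contours,-1,(0,255,0),3)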

    cv2.imshow('real', image)
    cv2.imshow('tre_green', tre_green)   

    key = cv2.waitKey(10)
    if key == 27:
        break

Here is the link to the pictures: https://picasaweb.google.com/103610822612915300423/February7201303

New link with the image plus contours, the mask, and the original: https://picasaweb.google.com/103610822612915300423/February7201304

And here's a sample picture from above:

Sample picture of a torso with arms ... and a hand

Thai answered 7/2, 2013 at 13:14 Comment(21)

Include the sample video you are having trouble with, otherwise it is pointless to try to guess what you are actually working with. – Supple 7/2, 2013 at 13:20

I can't upload the pictures, since I don't have enough reputation points :( – Thai 7/2, 2013 at 13:21

Just include a link. And include link to the /video/, not individual frames. – Supple 7/2, 2013 at 13:22

It's a live stream. Do you want a sample of it? – Thai 7/2, 2013 at 13:23

I don't know what is that, just include a sample of what you are working with. – Supple 7/2, 2013 at 13:24

Probably live stream, I guess. – Cheops 7/2, 2013 at 13:27

Sorry that was a typo I just put the link with the images. – Thai 7/2, 2013 at 13:28

I added one of the images (the smaller one). – Unrig 7/2, 2013 at 13:32

@user1979084 by live stream do you mean you are recording it with some camera ? You can easily create a video from that using opencv. Do that and include a link to it. – Supple 7/2, 2013 at 13:33

@Supple I'm not recording it's done on the fly... – Thai 7/2, 2013 at 13:35

@user1979084 ... what I'm telling you is to record it ... – Supple 7/2, 2013 at 13:35

@Supple I don't see why recording it would be helpful. – Valleau 7/2, 2013 at 13:48

This isn't really a SO type of question; you're straying into the woods of complex algorithm questions. – Valleau 7/2, 2013 at 13:49

@katrielalex I'm not really asking for a better algorithm, I just don't know which one should I use to obtain a better mask... or perhaps better HSV values for recognising skin colors in opencv. – Thai 7/2, 2013 at 13:52

@katrielalex huh ? It is helpful since I (or anyone interested in giving an answer) have the actual data available, it is completely useful. – Supple 7/2, 2013 at 13:52

@Supple I can post the pictures with the original (not modified) if you like. Would that be helpful? – Thai 7/2, 2013 at 13:56

@user1979084 better than nothing, but I don't get why you just don't record it and post a link for that. If that is because you don't know how to use the VideoWriter in opencv, check https://mcmap.net/q/244741/-opencv-python-bindings-how-do-i-capture-an-image-from-memory (it almost solves this recording issue, the only difference is that you will read frames and then write as in this linked answer). – Supple 7/2, 2013 at 13:58

@Supple I added the link with the original image, image+contours, mask – Thai 7/2, 2013 at 14:10

@Supple and btw, you are right I don't know how to use VideoWriter :( – Thai 7/2, 2013 at 14:12

@user1979084 the link doesn't work, page not found. – Supple 7/2, 2013 at 15:14

@mmgp, sorry my bad please try now... – Thai 7/2, 2013 at 15:29

There are many ways to apply a pixel-wise threshold that separates "skin pixels" from "non-skin pixels", and there are papers based on virtually any colorspace (even RGB). My answer is simply based on the paper Face Segmentation Using Skin-Color Map in Videophone Applications by Chai and Ngan. They worked with the YCbCr colorspace and got quite nice results. The paper also mentions a threshold that worked well for them:

(Cb in [77, 127]) and (Cr in [133, 173])

The thresholds for the Y channel are not specified, but there are papers that mention Y > 80. For your single image, Y in the whole range is fine, i.e. it doesn't matter for actually distinguishing skin.
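
To make the threshold concrete, here is a minimal sketch that checks a single pixel against it in plain Python. The conversion constants are the ones I understand OpenCV's color conversion documentation to give for 8-bit BGR to YCrCb; the function names `bgr_to_ycrcb` and `is_skin` are made up for this illustration, and the rounding may differ slightly from what `cv2.cvtColor` produces.

```python
def bgr_to_ycrcb(b, g, r):
    """Convert one 8-bit BGR pixel to (Y, Cr, Cb).

    Assumption: these are the constants OpenCV documents for
    COLOR_BGR2YCR_CB on 8-bit images (offset 128 for Cr and Cb).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128
    cb = (b - y) * 0.564 + 128

    def clamp(v):
        # round and saturate to the 8-bit range
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cr), clamp(cb)


def is_skin(b, g, r):
    """Apply the Cb/Cr ranges quoted above; Y is ignored."""
    _, cr, cb = bgr_to_ycrcb(b, g, r)
    return 133 <= cr <= 173 and 77 <= cb <= 127
```

For example, `is_skin(120, 150, 200)` (a light, reddish tone given in B, G, R order) falls inside both ranges, while pure green or pure white does not.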

Here is the input, the binary image according to the thresholds mentioned, and the resulting image after discarding small components.

Image: the input, the thresholded binary image, and the result after discarding small components

import sys
import numpy
import cv2

im = cv2.imread(sys.argv[1])
im_ycrcb = cv2.cvtColor(im, cv2.COLOR_BGR2YCR_CB)

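# Bounds are in (Y, Cr, Cb) order; uint8 matches the image dtype, which inRange expects
skin_ycrcb_mint = numpy.array((0, 133, 77), numpy.uint8)
skin_ycrcb_maxt = numpy.array((255, 173, 127), numpy.uint8)
skin_ycrcb = cv2.inRange(im_ycrcb, skin_ycrcb_mint, skin_ycrcb_maxt)
cv2.imwrite(sys.argv[2], skin_ycrcb) # Second image

# [-2:] keeps this working whether findContours returns 2 or 3 values
contours, _ = cv2.findContours(skin_ycrcb, cv2.RETR_EXTERNAL,
        cv2.CHAIN_APPROX_SIMPLE)[-2:]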
for i, c in enumerate(contours):
    area = cv2.contourArea(c)
    if area > 1000:
        cv2.drawContours(im, contours, i, (255, 0, 0), 3)
cv2.imwrite(sys.argv[3], im)         # Final image

Lastly, there are quite a few papers that do not rely on individual pixel-wise thresholds for this task. Instead, they start from a set of labeled images whose pixels are known to be either skin or non-skin. From that they train, for example, an SVM, and then classify new inputs with it.

Supple answered 7/2, 2013 at 16:44 Comment(2)

@mmpg wow this really works astonishingly well! thanks. i had a bit of trouble while using cv2.inRange() but that got solved by using min_YCrCb = numpy.array([0,133,77],numpy.uint8) and max_YCrCb = numpy.array([255,173,127],numpy.uint8) – Siusiubhan 8/6, 2013 at 20:42

@Siusiubhan :- Any reference to achieve this in iOS language – Robbert 18/11, 2018 at 6:40

A simple and powerful option is histogram backprojection. For example, create a 2D histogram using H and S (from the HSV color space) or a* and b* (from the L*a*b* color space), using pixels from different training images of your hand. Then use cv2.calcBackProject to classify the pixels in your stream. It's very fast, and I'd guess you should easily get 25 to 30 fps. Note that this is a way to learn the color distribution of your object of interest, so the same method can be used in other situations.
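
To show what backprojection computes, here is a minimal NumPy-only sketch of the idea; in OpenCV the two steps would be done with cv2.calcHist and cv2.calcBackProject. The function names `learn_histogram` and `backproject`, the bin counts (30 x 32), and the value ranges (H in [0, 180), S in [0, 256), as in OpenCV's 8-bit HSV) are assumptions made for this illustration.

```python
import numpy as np

H_RANGE = (0, 180)  # assumed hue range of 8-bit HSV images in OpenCV
S_RANGE = (0, 256)  # assumed saturation range


def learn_histogram(hs_samples, bins=(30, 32)):
    """Build a 2D H-S histogram from an (N, 2) array of training pixels.

    The histogram is scaled so that its most frequent bin equals 1.0.
    """
    hs_samples = np.asarray(hs_samples, dtype=np.float64)
    hist, _, _ = np.histogram2d(hs_samples[:, 0], hs_samples[:, 1],
                                bins=bins, range=(H_RANGE, S_RANGE))
    peak = hist.max()
    return hist / peak if peak > 0 else hist


def backproject(hs_image, hist):
    """Replace each pixel by the histogram value of its (H, S) bin.

    `hs_image` has shape (rows, cols, 2); the result has shape (rows, cols).
    """
    hs_image = np.asarray(hs_image, dtype=np.float64)
    h_bins, s_bins = hist.shape
    # map each value to the index of the uniform bin it falls into
    h_idx = (hs_image[..., 0] * h_bins / H_RANGE[1]).astype(int)
    s_idx = (hs_image[..., 1] * s_bins / S_RANGE[1]).astype(int)
    h_idx = np.clip(h_idx, 0, h_bins - 1)
    s_idx = np.clip(s_idx, 0, s_bins - 1)
    return hist[h_idx, s_idx]
```

The output is a per-pixel score between 0 and 1. Thresholding it (and optionally smoothing it first) gives a binary mask comparable to the one from a fixed color range, except that the color model comes from your own training pixels.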

Groome answered 8/2, 2013 at 10:49 Comment(0)
