Head Pose Estimation with OpenCV, C++ and Image 2D - Geometric Method - Roll, Yaw and Pitch

Asked 6/5, 2013 at 14:58 Answered 23/2, 2015 at 10:6

I'm trying to find the three rotation angles of a person's face from a single 2D image.

I'm using OpenCV with Haar cascades to detect the face, eyes, nose and mouth, but I haven't found any geometric method that helps me find the angles X, Y and Z (roll, pitch and yaw).

Could someone show me a method in C++ or Java that works?

Seedtime answered 6/5, 2013 at 14:58 Comment(1)

This is not a C++ question so I removed that tag. – Flageolet 6/5, 2013 at 15:15

Given a single image and no other information, there is no unique solution for the angles. Consider the case of just yaw. Projected onto the 2D image plane, yaw is visible as a small change in the projected distance between the eyes and in the placement of the eyes with respect to the nose and mouth. These distances are not constant from person to person, however.

One typical way around this is to require that the user 'calibrate' their face by looking directly at the camera for the nominal '0' angles. At this point, you now have reference lengths against which you can compare subsequent images.

The lengths are still not quite enough information, however, because the amount by which the apparent projected distances change depends on the optics and on the distance of the face from the camera. The optics you usually configure manually; the distance you can estimate by assuming 'average' facial dimensions and assuming the 'nominal' image matches those dimensions perfectly. You can make this adjustable if you find that it over- or under-estimates the rotations for a particular face.

Once you have all these assumptions in place, it's fairly simple geometry. You can estimate roll from the line from the eyes through the nose to the mouth. You can measure the spacing between the eyes to estimate yaw. Finally, you can estimate pitch using the spacing between eyes/mouth or eyes/nose. Bear in mind, these assumptions work best when the face is still fairly close to nominal.
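The geometry above could be sketched as follows. This is only an approximation under stated assumptions: the face is at the same distance from the camera as in the nominal frame, projection is treated as orthographic, and the eyes and mouth are treated as lying on a plane, so a projected length shrinks with the cosine of the rotation. All type and function names are hypothetical, and yaw and pitch come out as magnitudes only, since foreshortening alone does not reveal the direction of rotation.

```cpp
#include <algorithm>
#include <cmath>

struct Point2 { double x; double y; };

// Feature centers in pixels; image Y grows downward.
struct FacePoints { Point2 leftEye; Point2 rightEye; Point2 mouth; };

// Angles in radians. yaw and pitch are magnitudes only (sign is ambiguous).
struct HeadAngles { double roll; double yaw; double pitch; };

static double dist(Point2 a, Point2 b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

static Point2 midpoint(Point2 a, Point2 b) {
    return {(a.x + b.x) / 2.0, (a.y + b.y) / 2.0};
}

// Angle whose cosine is current/nominal, with the ratio clamped to [0, 1].
static double foreshortenAngle(double current, double nominal) {
    if (nominal <= 0.0) return 0.0;
    double ratio = std::min(1.0, std::max(0.0, current / nominal));
    return std::acos(ratio);
}

// nominal: points captured while the user looks straight at the camera.
// current: points from the frame being analysed.
HeadAngles estimateAngles(const FacePoints& nominal, const FacePoints& current) {
    Point2 mid = midpoint(current.leftEye, current.rightEye);
    Point2 nominalMid = midpoint(nominal.leftEye, nominal.rightEye);

    // Roll: tilt of the line from the eye midpoint to the mouth, measured
    // from the image's vertical axis (0 when the mouth is directly below).
    double roll = std::atan2(current.mouth.x - mid.x, current.mouth.y - mid.y);

    // Yaw: shrinkage of the eye spacing relative to the nominal frame.
    double yaw = foreshortenAngle(dist(current.leftEye, current.rightEye),
                                  dist(nominal.leftEye, nominal.rightEye));

    // Pitch: shrinkage of the eye-to-mouth spacing relative to nominal.
    double pitch = foreshortenAngle(dist(mid, current.mouth),
                                    dist(nominalMid, nominal.mouth));

    return {roll, yaw, pitch};
}
```

Because the arccosine is very flat near a ratio of 1, small measurement errors near the frontal pose translate into large angle errors, so some smoothing over several frames is likely to be needed in practice.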

Miner answered 6/5, 2013 at 15:31 Comment(1)

But I would like to know some method in C++ to do this. – Seedtime 7/5, 2013 at 13:8

If you use a cascade classifier to detect the right eye, left eye and nose, calculate the center of each detected rectangle (x + width/2, y + height/2). This will give you three x-y points on your image.

You can detect roll by comparing the Y values of the two eyes: if one is higher than the other, the head is tilted in the direction of the lowest Y value (as one eye moves up, the other moves down).

You can detect yaw by looking at the X value of the nose: if the user looks to their left, the X value of the nose moves closer to the left eye's X value, and likewise toward the right eye's X value when they look to the right.

You can detect pitch by looking at the Y value of the nose: if the user looks up, it moves closer to the Y values of the eyes, and if they look down, it moves further away from them.

Now this is of course not tremendously accurate and won't give you exact angles. However, you can use this information to classify each value into coarse groups (e.g. looking forward, looking left, looking really left).
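A minimal sketch of these three cues and the coarse grouping might look like the following. The struct and function names are hypothetical, `Box` only mimics the x/y/width/height layout of a detector's bounding rectangle, and the two thresholds are placeholder values that would need tuning on real data. Dividing by the eye spacing is an added assumption so that the cues do not depend on how large the face appears. Directions are reported in image terms, because whether image-left corresponds to the user's left depends on whether the camera image is mirrored.

```cpp
#include <cmath>
#include <string>

struct Box { double x; double y; double width; double height; };  // top-left + size
struct Point2 { double x; double y; };

// Center of a detection rectangle.
Point2 center(const Box& b) {
    return {b.x + b.width / 2.0, b.y + b.height / 2.0};
}

// Unitless cues, each divided by the eye spacing.
struct Cues { double roll; double yaw; double pitch; };

// eyeA is the eye on the image's left side, eyeB the one on the right.
Cues computeCues(Point2 eyeA, Point2 eyeB, Point2 nose) {
    double eyeDist = std::hypot(eyeB.x - eyeA.x, eyeB.y - eyeA.y);
    if (eyeDist <= 0.0) return {0.0, 0.0, 0.0};
    double midX = (eyeA.x + eyeB.x) / 2.0;
    double midY = (eyeA.y + eyeB.y) / 2.0;
    Cues c;
    c.roll = (eyeB.y - eyeA.y) / eyeDist;  // non-zero when one eye is higher
    c.yaw = (nose.x - midX) / eyeDist;     // negative: nose nearer image-left eye
    c.pitch = (nose.y - midY) / eyeDist;   // smaller: nose nearer the eye line
    return c;
}

// Coarse grouping of the yaw cue; thresholds are illustrative placeholders.
std::string classifyYaw(double yawCue, double slight = 0.10, double strong = 0.25) {
    double a = std::fabs(yawCue);
    if (a < slight) return "forward";
    std::string side = (yawCue < 0.0) ? "image-left" : "image-right";
    return (a < strong) ? side : "far " + side;
}
```

The pitch cue has no natural zero, since the nose always sits below the eye line; it is most useful when compared against the value measured on a frontal frame of the same person.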

The only thing I can see affecting the calculation of all three from one image is a fairly drastic roll: calculating the yaw then becomes troublesome because the line between the eyes is no longer horizontal.

You can solve this by correcting the image with a 2D rotation. How far the image needs to be rotated follows from the vertical offset between the two eye centers:

Value = (right eye center Y) - (left eye center Y)

With this information you can correct the image and continue processing (to rotate the image, look up how to create a 2D rotation matrix and apply it with OpenCV's warpAffine).
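One way to turn the eye offset into an actual rotation angle is `atan2` on the vector between the eye centers. The sketch below is an assumption-laden variant of the correction described above: instead of warping the whole image, it rotates only the feature points, which is enough if you just need corrected coordinates for the yaw and pitch checks. The names are hypothetical. If you do rotate the image with OpenCV's `getRotationMatrix2D` and `warpAffine`, note that `getRotationMatrix2D` expects the angle in degrees, and check its sign convention against your own.

```cpp
#include <cmath>

struct Point2 { double x; double y; };

// Angle (radians) of the line from the image-left eye to the image-right eye.
// Zero when the eyes are level; image Y grows downward.
double rollAngle(Point2 leftEye, Point2 rightEye) {
    return std::atan2(rightEye.y - leftEye.y, rightEye.x - leftEye.x);
}

// Rotates p about pivot by -angle, so that a line at 'angle' becomes horizontal.
Point2 derotate(Point2 p, Point2 pivot, double angle) {
    double c = std::cos(-angle);
    double s = std::sin(-angle);
    double dx = p.x - pivot.x;
    double dy = p.y - pivot.y;
    return {pivot.x + c * dx - s * dy, pivot.y + s * dx + c * dy};
}
```

After applying `derotate` to both eyes and the nose with the same pivot and angle, the two eyes share the same Y value, and the nose's X and Y offsets can be read as before.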

Sorry if this is a bit of a necro, but I found the above method to be pretty successful and I hope it helps someone.

Aholla answered 23/2, 2015 at 10:6 Comment(0)

So, you want to find the orientation (in RPY angles) of a face based on the positions of the nose, eyes and mouth. Assuming that all three features (four points, counting both eyes) are visible, I would use the symmetric features of the face to determine the head's orientation, for example:

A line between the eyes could be used as a reference for one of the axes (for instance the Pitch). Then, we may assume that the Roll axis points in the nose's direction - which can be measured through the positional displacement of the nose to the mid-point between the eyes. And lastly, the Yaw could be measured through the distance relation between the mid-point between the eyes, the position of the nose, and the mouth's position.

I do not know the distance relations between the four points of interest, and they probably differ with gender, age and origin. However, if you can find such a relation, deriving the angles should be mathematically rather straightforward.

Interesting application by the way!

Girondist answered 6/5, 2014 at 8:34 Comment(0)
