Python: How to get Face Mesh landmarks coordinates in MediaPipe?

Asked 17/4, 2021 at 18:50 Answered 24/7, 2023 at 13:15

python mediapipe facial-landmark-alignment

I'm trying to get a list with landmark coordinates with MediaPipe's Face Mesh. For example: Landmark[6]: (0.36116672, 0.93204623, 0.0019629495)

I can't find a way to do that and would appreciate the help.

Drumbeat answered 17/4, 2021 at 18:50 Comment(0)

MediaPipe has a more complex interface than most publicly available models, but what you're looking for is easily achievable anyway.

import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils
mp_face_mesh = mp.solutions.face_mesh

file_list = ['test.png']
# For static images:
drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
with mp_face_mesh.FaceMesh(
    static_image_mode=True,
    min_detection_confidence=0.5) as face_mesh:
  for idx, file in enumerate(file_list):
    image = cv2.imread(file)
    # Convert the BGR image to RGB before processing.
    results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    # Print and draw face mesh landmarks on the image.
    if not results.multi_face_landmarks:
      continue
    annotated_image = image.copy()
    for face_landmarks in results.multi_face_landmarks:
      print('face_landmarks:', face_landmarks)
      mp_drawing.draw_landmarks(
          image=annotated_image,
          landmark_list=face_landmarks,
          connections=mp_face_mesh.FACE_CONNECTIONS,  # newer releases: FACEMESH_CONTOURS or FACEMESH_TESSELATION
          landmark_drawing_spec=drawing_spec,
          connection_drawing_spec=drawing_spec)

In this example, which is taken from here, you can see that they're iterating through results.multi_face_landmarks:

for face_landmarks in results.multi_face_landmarks:

Each element here holds the information for one face detected in the image, and len(results.multi_face_landmarks) is the number of faces detected.

When you list the attributes of, say, the first face, you'll see 'landmark' as the last attribute.

dir(results.multi_face_landmarks[0])
>> ..., 'landmark']

We need the landmark attribute to get the coordinates.

The landmark attribute has length 468, which is the number of [x, y, z] keypoints the model predicts (478 if FaceMesh is created with refine_landmarks=True, which adds iris landmarks).

If we take first keypoint:

results.multi_face_landmarks[0].landmark[0]

it will give us normalized [x,y,z] values:

x: 0.25341567397117615
y: 0.71121746301651
z: -0.03244325891137123

Finally, x, y and z are attributes of each keypoint. We can check that by calling dir() on a keypoint.

Now you can easily read the normalized coordinates:

results.multi_face_landmarks[0].landmark[0].x -> X coordinate
results.multi_face_landmarks[0].landmark[0].y -> Y coordinate
results.multi_face_landmarks[0].landmark[0].z -> Z coordinate
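
To get the list the question asks for, the attribute access above can be wrapped in a small helper. This is only a minimal sketch: landmarks_to_tuples is a hypothetical name, and the SimpleNamespace objects merely stand in for MediaPipe's landmark objects so the snippet runs without an image. With real output you would pass results.multi_face_landmarks[0].landmark instead.

```python
from types import SimpleNamespace


def landmarks_to_tuples(landmarks):
    """Return [(x, y, z), ...] for any iterable of objects with .x, .y, .z."""
    return [(lm.x, lm.y, lm.z) for lm in landmarks]


# Stand-in data (not real MediaPipe output), shaped like face.landmark.
fake_landmarks = [
    SimpleNamespace(x=0.25, y=0.71, z=-0.03),
    SimpleNamespace(x=0.36, y=0.93, z=0.002),
]

coords = landmarks_to_tuples(fake_landmarks)
for i, point in enumerate(coords):
    # Same layout as in the question: Landmark[i]: (x, y, z)
    print(f"Landmark[{i}]: {point}")
```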

To convert the normalized values to pixel coordinates, multiply the x coordinate by the image width and the y coordinate by the image height.

Sample code:

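shape = image.shape  # (height, width, channels)
for face in results.multi_face_landmarks:
    for landmark in face.landmark:
        # Scale the normalized coordinates to pixel positions.
        relative_x = int(landmark.x * shape[1])
        relative_y = int(landmark.y * shape[0])

        cv2.circle(image, (relative_x, relative_y), radius=1, color=(225, 0, 100), thickness=1)
# cv2_imshow is the Colab helper (from google.colab.patches import cv2_imshow);
# outside Colab use cv2.imshow("image", image) followed by cv2.waitKey(0).
cv2_imshow(image)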

Which would give us the input image with a small circle drawn at each landmark.
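
Landmarks can land slightly outside the [0, 1] range when part of the face is cut off by the image border, so the scaled values may fall outside the image. The sketch below shows one way to handle that by clamping. to_pixel is a hypothetical helper name, and clamping (rather than skipping such points) is a design choice made here for illustration.

```python
def to_pixel(x, y, width, height):
    """Scale normalized (x, y) to integer pixel coordinates inside the image."""
    px = min(max(int(x * width), 0), width - 1)
    py = min(max(int(y * height), 0), height - 1)
    return px, py


# A point inside the image, and one that lies outside and gets clamped.
inside = to_pixel(0.5, 0.25, 640, 480)
clamped = to_pixel(1.2, -0.1, 640, 480)
print(inside, clamped)
```

With real data you would call it as to_pixel(landmark.x, landmark.y, image.shape[1], image.shape[0]).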

Corder answered 20/4, 2021 at 18:22 Comment(2)

To make it work: * in line 26 of the first codeblock: FACE_CONNECTIONS has to be replaced by FACEFACEMESH_CONTOURS, * The last line in the last codeblock has to be cv2.imshow("image", image) – Paramorph 3/7, 2022 at 11:17

@ReneSmit typo. It should be FACEMESH_CONTOURS – Unfailing 7/10, 2023 at 14:17

Here is a full explanation: Face Mesh MediaPipe

    import cv2
    import mediapipe as mp
    mp_drawing = mp.solutions.drawing_utils
    mp_face_mesh = mp.solutions.face_mesh
    
    # For static images:
    file_list = ['test.png']
    drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
    with mp_face_mesh.FaceMesh(
        static_image_mode=True,
        max_num_faces=1,
        min_detection_confidence=0.5) as face_mesh:
      for idx, file in enumerate(file_list):
        image = cv2.imread(file)
        # Convert the BGR image to RGB before processing.
        results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    
        # Print and draw face mesh landmarks on the image.
        if not results.multi_face_landmarks:
          continue
        annotated_image = image.copy()
        for face_landmarks in results.multi_face_landmarks:
          print('face_landmarks:', face_landmarks)

Kathrinkathrine answered 17/4, 2021 at 19:26 Comment(1)

you are missing the variable with the list of images to be read from, here file_list = ['test.png'] – Rori 19/11, 2021 at 13:41

Let's work with this particular image.

Once the image is loaded, we first instantiate the MediaPipe solution:

face_mesh = mp.solutions.face_mesh.FaceMesh(static_image_mode=True, max_num_faces=2, min_detection_confidence=0.5)

and detect all faces via process, which expects an RGB image:

results = face_mesh.process(image_input)  # image_input is already RGB in the full code below

To access all the landmarks of the first detected face, we can iterate through them via

ls_single_face=results.multi_face_landmarks[0].landmark
for idx in ls_single_face:
    print(idx.x,idx.y,idx.z)

Which will output the x, y, and z coordinate

0.6062703132629395 0.34374159574508667 -0.02611529268324375
0.6024502515792847 0.3223230540752411 -0.05503281578421593
0.6047719717025757 0.32883960008621216 -0.029224306344985962
0.5947933793067932 0.29429933428764343 -0.04156317934393883
0.6020699143409729 0.31391528248786926 -0.058685336261987686
0.6023058295249939 0.3025013208389282 -0.054952703416347504

The full code is as below

import cv2
import mediapipe as mp

dframe = cv2.imread("detect_face/person.png")

image_input = cv2.cvtColor(dframe, cv2.COLOR_BGR2RGB)

face_mesh = mp.solutions.face_mesh.FaceMesh(static_image_mode=True, max_num_faces=2,
                                         min_detection_confidence=0.5)
image_rows, image_cols, _ = dframe.shape
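# image_input is already RGB, so it must not be converted a second time.
results = face_mesh.process(image_input)

# Assumes at least one face was detected; multi_face_landmarks is None otherwise.
ls_single_face=results.multi_face_landmarks[0].landmark
for idx in ls_single_face:
    print(idx.x,idx.y,idx.z)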

Using a similar strategy, we can plot a marker for each face landmark by iterating over the coordinates.

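# Note: _normalized_to_pixel_coordinates is a private helper and may change between releases.
from mediapipe.python.solutions.drawing_utils import _normalized_to_pixel_coordinates
ls_single_face=results.multi_face_landmarks[0].landmark

for idx in ls_single_face:
    cord = _normalized_to_pixel_coordinates(idx.x,idx.y,image_cols,image_rows)
    if cord is None:  # returned when the landmark lies outside the image
        continue
    cv2.putText(image_input, '.', cord,cv2.FONT_HERSHEY_SIMPLEX, 0.3, (0, 0, 255), 2)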

This draws a small dot at each landmark position on the image.

The original image was retrieved from this link.

MediaPipe also has a built-in approach for detecting key face regions, as discussed here

Cuff answered 14/5, 2022 at 11:9 Comment(1)

To those interested to find the eye coordinate position, can refer this OP https://mcmap.net/q/748914/-how-to-plot-a-marker-around-eye-region-according-to-face-landmarks-of-mediapipe-python – Cuff 14/5, 2022 at 11:12

MediaPipe's landmark values are normalized by the width and height of the image. After getting a landmark, simply multiply its x by the width of your image and its y by the height of your image. You may check this link for a complete tutorial on MediaPipe. It's under construction but is going to be completed very soon.

Thorndike answered 23/5, 2021 at 2:2 Comment(1)

You can simply use from mediapipe.python.solutions.drawing_utils import _normalized_to_pixel_coordinates and then call _normalized_to_pixel_coordinates(idx.x,idx.y,image_cols,image_rows) to get the actual position – Cuff 14/5, 2022 at 13:2

To print the coordinates of the landmarks, you have to check that they exist; after that you can access the x, y and z coordinates. The code for landmark 0 of the first face is:

        # inside the capture loop
        if results.multi_face_landmarks:
            # multi_face_landmarks is a list of faces; take the first one.
            coord = results.multi_face_landmarks[0].landmark[0]
            print(f'({coord.x},{coord.y},{coord.z})')

Buckbuckaroo answered 24/9, 2021 at 8:54 Comment(2)

How would you create a bounding box? – Thitherto 6/2, 2023 at 22:45

@Thitherto if you want a rectangle, you just need two points and you can use cv2.rectangle(image, pt1, pt2, color, thickness). If you want to draw the landmarks, particularly of the hands, you can do this: mp_drawing = mp.solutions.drawing_utils mp_drawing.draw_landmarks(frame, handLMs, mphands.HAND_CONNECTIONS) – Laritalariviere 8/2, 2023 at 10:18
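
For the bounding-box question, one possible approach is to take the minimum and maximum of the landmark coordinates after scaling them to pixels. This is a rough sketch under assumptions: landmark_bounding_box is a hypothetical helper, the SimpleNamespace points are stand-ins for real landmarks, and the box is clamped to the image bounds.

```python
from types import SimpleNamespace


def landmark_bounding_box(landmarks, width, height):
    """Return ((left, top), (right, bottom)) in pixels around the landmarks."""
    xs = [lm.x * width for lm in landmarks]
    ys = [lm.y * height for lm in landmarks]
    left = max(int(min(xs)), 0)
    top = max(int(min(ys)), 0)
    right = min(int(max(xs)), width - 1)
    bottom = min(int(max(ys)), height - 1)
    return (left, top), (right, bottom)


# Stand-in landmarks (not real MediaPipe output).
fake_face = [
    SimpleNamespace(x=0.25, y=0.5),
    SimpleNamespace(x=0.75, y=0.25),
    SimpleNamespace(x=0.5, y=0.75),
]

top_left, bottom_right = landmark_bounding_box(fake_face, 640, 480)
print(top_left, bottom_right)
# With OpenCV the box could then be drawn with:
# cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 2)
```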

I stumbled upon this thread, when I was looking for this answer myself. While the provided answer was very helpful, it is a bit outdated, hence I wanted to provide an update on the answer provided by @deepconsc, on Apr 20, 2021 at 18:22, see above.

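The problem with that answer is that "multi_face_landmarks" does not exist on the result object in the latest version of mediapipe. (As far as I can tell this applies to the newer Tasks API, FaceLandmarker, whose result exposes face_landmarks; the legacy mp.solutions.face_mesh API still returns multi_face_landmarks.) To make it work again, replace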

for face in results.multi_face_landmarks:

with

for face in results.face_landmarks:
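
If the same code has to cope with both result shapes, a small adapter may help. The sketch below assumes that the legacy result has multi_face_landmarks (each face with a .landmark attribute) and that the newer Tasks API result has face_landmarks as a list of per-face landmark lists, so the inner loop would iterate over face directly rather than face.landmark. iter_faces is a hypothetical name and the SimpleNamespace objects are stand-ins, not real MediaPipe results.

```python
from types import SimpleNamespace


def iter_faces(results):
    """Yield one list of landmarks per detected face, for either result shape."""
    legacy = getattr(results, "multi_face_landmarks", None)
    if legacy:
        for face in legacy:
            # Legacy shape: each face wraps its points in .landmark
            yield list(face.landmark)
        return
    for face in getattr(results, "face_landmarks", None) or []:
        # Assumed newer shape: each face is already a list of landmarks
        yield list(face)


point = SimpleNamespace(x=0.1, y=0.2, z=0.0)
legacy_result = SimpleNamespace(
    multi_face_landmarks=[SimpleNamespace(landmark=[point])]
)
tasks_result = SimpleNamespace(face_landmarks=[[point]])

legacy_faces = list(iter_faces(legacy_result))
tasks_faces = list(iter_faces(tasks_result))
print(len(legacy_faces), len(tasks_faces))
```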

I hope someone finds this helpful.

Longsufferance answered 24/7, 2023 at 13:15 Comment(1)

What I can't figure out is where the "named" landmarks have gone to - all old examples are able to use stuff like FACEMESH_NOSE, LEFT_EYE, RIGHT_EYE, NOSE_TIP etc... but now it appears all we have is 400+ anonymous x,y,z x coordinates... which is almost useless if you need the landmarks for specific areas. Documentation for Mediapipe is a mess, and all searches leads to old outdated crap, and all their examples are rubbish too with each using different conflicting code. – Chlorinate 11/8, 2023 at 18:50
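
On the named landmarks: in the legacy mp.solutions.face_mesh module, constants such as FACEMESH_LEFT_EYE are sets of (start, end) landmark index pairs rather than lists of points, so the indices for a region can be recovered from them. The sketch below shows the idea with a few illustrative pairs so that it runs without mediapipe installed. indices_from_connections is a hypothetical helper, and whether these constants are still exposed in a given release is an assumption to verify.

```python
def indices_from_connections(connections):
    """Collect the unique landmark indices used by a set of (start, end) pairs."""
    return sorted({index for pair in connections for index in pair})


# Illustrative pairs only, shaped like mp.solutions.face_mesh.FACEMESH_LEFT_EYE.
example_connections = frozenset({(263, 249), (249, 390), (390, 373)})

eye_indices = indices_from_connections(example_connections)
print(eye_indices)
```

With real data the indices could be used to pick points out of face.landmark, for example [face.landmark[i] for i in eye_indices].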
