Python: How to get Face Mesh landmarks coordinates in MediaPipe?
D

6

16

I'm trying to get a list with landmark coordinates with MediaPipe's Face Mesh. For example: Landmark[6]: (0.36116672, 0.93204623, 0.0019629495)

I can't find a way to do that and would appreciate the help.

Drumbeat answered 17/4, 2021 at 18:50 Comment(0)
C
27

MediaPipe has a more complex interface than most publicly available models, but what you're looking for is still easy to achieve.

import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils
mp_face_mesh = mp.solutions.face_mesh

file_list = ['test.png']
# For static images:
drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
with mp_face_mesh.FaceMesh(
    static_image_mode=True,
    min_detection_confidence=0.5) as face_mesh:
  for idx, file in enumerate(file_list):
    image = cv2.imread(file)
    # Convert the BGR image to RGB before processing.
    results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    # Print and draw face mesh landmarks on the image.
    if not results.multi_face_landmarks:
      continue
    annotated_image = image.copy()
    for face_landmarks in results.multi_face_landmarks:
      print('face_landmarks:', face_landmarks)
      mp_drawing.draw_landmarks(
          image=annotated_image,
          landmark_list=face_landmarks,
          connections=mp_face_mesh.FACE_CONNECTIONS,  # renamed FACEMESH_CONTOURS in newer MediaPipe releases
          landmark_drawing_spec=drawing_spec,
          connection_drawing_spec=drawing_spec)

In this example, which is taken from here, you can see that they're iterating through results.multi_face_landmarks:

for face_landmarks in results.multi_face_landmarks:

Each element here holds the information for one face detected in the image, and the length of results.multi_face_landmarks is the number of faces detected.

When you list the attributes of, say, the first face, you'll see 'landmark' as the last attribute.

dir(results.multi_face_landmarks[0])
>> ..., 'landmark']

We need the landmark attribute to get at the coordinates, one step further in.

The length of the landmark attribute is 468, which is the number of [x,y,z] keypoints predicted by the model's regression.

If we take first keypoint:

results.multi_face_landmarks[0].landmark[0]

it will give us normalized [x,y,z] values:

x: 0.25341567397117615
y: 0.71121746301651
z: -0.03244325891137123

Finally, x, y and z are attributes of each keypoint. We can check that by calling dir() on a keypoint.

Now you can easily reach the normalized coordinates:

results.multi_face_landmarks[0].landmark[0].x -> X coordinate
results.multi_face_landmarks[0].landmark[0].y -> Y coordinate
results.multi_face_landmarks[0].landmark[0].z -> Z coordinate
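
Since the question asks for a list of coordinates, the attribute access above can be wrapped in a small helper. This is only a minimal sketch: landmarks_to_tuples and fake_landmarks are names made up for illustration, and the stand-in objects merely mimic the x, y, z attributes so the snippet runs without MediaPipe installed.

```python
from types import SimpleNamespace


def landmarks_to_tuples(landmarks):
    """Return [(x, y, z), ...] for any iterable of objects with x, y, z attributes."""
    return [(lm.x, lm.y, lm.z) for lm in landmarks]


# Stand-in data (hypothetical values). With a real result you would pass
# results.multi_face_landmarks[0].landmark instead.
fake_landmarks = [
    SimpleNamespace(x=0.25, y=0.71, z=-0.03),
    SimpleNamespace(x=0.36, y=0.93, z=0.002),
]

coords = landmarks_to_tuples(fake_landmarks)
for i, xyz in enumerate(coords):
    print(f"Landmark[{i}]: {xyz}")
```

The values are still normalized at this point; indexing coords[6] on a real result would give the tuple for landmark 6.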

To denormalize into pixel coordinates, multiply the x coordinate by the image width and the y coordinate by the image height.

Sample code:

for face in results.multi_face_landmarks:
    for landmark in face.landmark:
        x = landmark.x
        y = landmark.y

        shape = image.shape 
        relative_x = int(x * shape[1])
        relative_y = int(y * shape[0])

        cv2.circle(image, (relative_x, relative_y), radius=1, color=(225, 0, 100), thickness=1)
cv2_imshow(image)  # Colab helper (google.colab.patches); elsewhere use cv2.imshow("image", image) and cv2.waitKey(0)

Which would give us:

Click to see result image

Corder answered 20/4, 2021 at 18:22 Comment(2)
To make it work: in line 26 of the first codeblock, FACE_CONNECTIONS has to be replaced by FACEFACEMESH_CONTOURS; the last line in the last codeblock has to be cv2.imshow("image", image). - Paramorph
@ReneSmit typo. It should be FACEMESH_CONTOURS. - Unfailing
K
1

Here is a full explanation -

Face Mesh MediaPipe

    import cv2
    import mediapipe as mp
    mp_drawing = mp.solutions.drawing_utils
    mp_face_mesh = mp.solutions.face_mesh
    
    # For static images:
    file_list = ['test.png']
    drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
    with mp_face_mesh.FaceMesh(
        static_image_mode=True,
        max_num_faces=1,
        min_detection_confidence=0.5) as face_mesh:
      for idx, file in enumerate(file_list):
        image = cv2.imread(file)
        # Convert the BGR image to RGB before processing.
        results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    
        # Print and draw face mesh landmarks on the image.
        if not results.multi_face_landmarks:
          continue
        annotated_image = image.copy()
        for face_landmarks in results.multi_face_landmarks:
          print('face_landmarks:', face_landmarks)
      
Kathrinkathrine answered 17/4, 2021 at 19:26 Comment(1)
You are missing the variable with the list of images to be read from, here file_list = ['test.png']. - Rori
C
1

Let's work with this particular image

enter image description here

Once the image is loaded, we first instantiate the MediaPipe Face Mesh solution

face_mesh = mp.solutions.face_mesh.FaceMesh(static_image_mode=True, max_num_faces=2, min_detection_confidence=0.5)

and detect all faces via process as below

results = face_mesh.process(image_input)  # image_input is assumed to be RGB already

To access all the landmarks for this particular face, we can iterate through them via

ls_single_face=results.multi_face_landmarks[0].landmark
for idx in ls_single_face:
    print(idx.x,idx.y,idx.z)

Which will output the x, y, and z coordinates

0.6062703132629395 0.34374159574508667 -0.02611529268324375
0.6024502515792847 0.3223230540752411 -0.05503281578421593
0.6047719717025757 0.32883960008621216 -0.029224306344985962
0.5947933793067932 0.29429933428764343 -0.04156317934393883
0.6020699143409729 0.31391528248786926 -0.058685336261987686
0.6023058295249939 0.3025013208389282 -0.054952703416347504

The full code is as below

import cv2
import mediapipe as mp

dframe = cv2.imread("detect_face/person.png")

image_input = cv2.cvtColor(dframe, cv2.COLOR_BGR2RGB)

face_mesh = mp.solutions.face_mesh.FaceMesh(static_image_mode=True, max_num_faces=2,
                                         min_detection_confidence=0.5)
image_rows, image_cols, _ = dframe.shape
results = face_mesh.process(image_input)  # image_input was already converted to RGB above

ls_single_face=results.multi_face_landmarks[0].landmark
for idx in ls_single_face:
    print(idx.x,idx.y,idx.z)

Using a similar strategy, we can plot a marker for each face landmark by iterating over the coordinates.

from mediapipe.python.solutions.drawing_utils import _normalized_to_pixel_coordinates
ls_single_face=results.multi_face_landmarks[0].landmark

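for idx in ls_single_face:
    cord = _normalized_to_pixel_coordinates(idx.x,idx.y,image_cols,image_rows)
    # The helper returns None for landmarks that fall outside the image, so skip those.
    if cord is not None:
        cv2.putText(image_input, '.', cord,cv2.FONT_HERSHEY_SIMPLEX, 0.3, (0, 0, 255), 2)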

Which will output

enter image description here

The original image was retrieved from this link.

MediaPipe also has a built-in approach to detect key face regions, as discussed here
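
As a rough illustration of working with face regions: the region constants in the legacy solutions module (for example mp.solutions.face_mesh.FACEMESH_LEFT_EYE, assuming your installed version exposes it) are sets of (start, end) landmark index pairs. The sketch below collects the unique indices from such a set. The helper name region_indices and the stand-in pairs are assumptions made for illustration, so the snippet runs without MediaPipe installed.

```python
def region_indices(connections):
    """Collect the unique landmark indices used by a set of (start, end) pairs."""
    indices = set()
    for start, end in connections:
        indices.add(start)
        indices.add(end)
    return sorted(indices)


# Hypothetical stand-in for a connection set such as FACEMESH_LEFT_EYE.
fake_connections = frozenset({(263, 249), (249, 390), (390, 373)})

eye_indices = region_indices(fake_connections)
print(eye_indices)
```

With a real result, each index i in the returned list could then be looked up as results.multi_face_landmarks[0].landmark[i].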

Cuff answered 14/5, 2022 at 11:9 Comment(1)
For those interested in finding the eye coordinate positions, refer to this question: https://mcmap.net/q/748914/-how-to-plot-a-marker-around-eye-region-according-to-face-landmarks-of-mediapipe-python - Cuff
T
0

MediaPipe's landmark values are normalized by the width and height of the image. After getting a landmark value, simply multiply the x of the landmark by the width of your image and the y of the landmark by the height of your image. You may check this link for a complete tutorial on MediaPipe. It's still in progress but is going to be completed very soon.
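
The multiplication described above can be sketched as a small function. The name to_pixel is made up for illustration, and clamping the result to the image bounds is an extra design choice of this sketch (normalized values can fall slightly outside 0..1), not something the answer prescribes.

```python
def to_pixel(x, y, image_width, image_height):
    """Scale normalized (x, y) to integer pixel coordinates, clamped to the image."""
    px = min(max(int(x * image_width), 0), image_width - 1)
    py = min(max(int(y * image_height), 0), image_height - 1)
    return px, py


# Hypothetical example values for a 640x480 image.
center = to_pixel(0.5, 0.25, 640, 480)
print(center)
```

With an OpenCV image, image_height and image_width would come from image.shape[0] and image.shape[1] respectively.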

Thorndike answered 23/5, 2021 at 2:2 Comment(1)
You can simply use from mediapipe.python.solutions.drawing_utils import _normalized_to_pixel_coordinates and then call _normalized_to_pixel_coordinates(idx.x,idx.y,image_cols,image_rows) to get the actual position. - Cuff
B
0
  • To print the coordinates of the landmarks, you have to check that they exist; after that you can access the x, y and z coordinates. The code for landmark 0 of the first detected face is:
        #in the cycle of capture
        if results.multi_face_landmarks:
            # multi_face_landmarks is a list of faces, so index the face first
            coord= results.multi_face_landmarks[0].landmark[0]
            print(''.join(['(',str(coord.x),',',str(coord.y),',',str(coord.z)  ,')']))
Buckbuckaroo answered 24/9, 2021 at 8:54 Comment(2)
How would you create a bounding box? - Thitherto
@Thitherto if you want to draw a rectangle, you just need two points and can use cv2.rectangle(image, pt1, pt2, color, thickness). If you want to draw the landmarks, particularly of the hands, you can do this: mp_drawing = mp.solutions.drawing_utils mp_drawing.draw_landmarks(frame, handLMs, mphands.HAND_CONNECTIONS) - Laritalariviere
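
For the bounding box question, one possible approach is to take the minimum and maximum of the landmark coordinates and scale them to pixels. This is a minimal sketch rather than a MediaPipe feature: landmark_bounding_box and fake_face are hypothetical names, and the stand-in objects only mimic the x and y attributes of real landmarks.

```python
from types import SimpleNamespace


def landmark_bounding_box(landmarks, image_width, image_height):
    """Return ((left, top), (right, bottom)) in pixels around all landmarks."""
    xs = [lm.x * image_width for lm in landmarks]
    ys = [lm.y * image_height for lm in landmarks]
    top_left = (int(min(xs)), int(min(ys)))
    bottom_right = (int(max(xs)), int(max(ys)))
    return top_left, bottom_right


# Stand-in landmarks (hypothetical values) for a 640x480 image.
fake_face = [
    SimpleNamespace(x=0.25, y=0.125),
    SimpleNamespace(x=0.75, y=0.5),
    SimpleNamespace(x=0.5, y=0.875),
]

top_left, bottom_right = landmark_bounding_box(fake_face, 640, 480)
print(top_left, bottom_right)
```

The two returned points can be passed as the corner arguments of cv2.rectangle, together with a color and a thickness.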
L
0

I stumbled upon this thread when I was looking for this answer myself. While the provided answer was very helpful, it is a bit outdated, hence I wanted to provide an update on the answer provided by @deepconsc, on Apr 20, 2021 at 18:22, see above.

The problem with the current answer is that "multi_face_landmarks" does not exist on the result object in the latest version of mediapipe (the newer Tasks API, FaceLandmarker). To make it work again, replace

for face in results.multi_face_landmarks:

with

for face in results.face_landmarks:
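
One caveat, offered as a hedged sketch: assuming the newer result object is the one returned by the MediaPipe Tasks FaceLandmarker, each entry of face_landmarks is itself a plain list of landmarks, so an inner loop over face.landmark would also need to iterate over face directly. The helper name iter_face_points and the stand-in result below are made up for illustration and only mimic that assumed structure.

```python
from types import SimpleNamespace


def iter_face_points(result):
    """Yield (face_index, landmark_index, x, y, z) for a Tasks-style result."""
    for face_idx, face in enumerate(result.face_landmarks):
        # Each face is assumed to be a plain list of landmarks, so iterate it
        # directly instead of going through a .landmark attribute.
        for lm_idx, lm in enumerate(face):
            yield face_idx, lm_idx, lm.x, lm.y, lm.z


# Stand-in result (hypothetical values) with one face and two landmarks.
fake_result = SimpleNamespace(
    face_landmarks=[
        [
            SimpleNamespace(x=0.1, y=0.2, z=0.0),
            SimpleNamespace(x=0.3, y=0.4, z=-0.01),
        ]
    ]
)

points = list(iter_face_points(fake_result))
print(points)
```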

I hope someone finds this helpful.

Longsufferance answered 24/7, 2023 at 13:15 Comment(1)
What I can't figure out is where the "named" landmarks have gone to - all old examples are able to use stuff like FACEMESH_NOSE, LEFT_EYE, RIGHT_EYE, NOSE_TIP etc... but now it appears all we have is 400+ anonymous x,y,z coordinates... which is almost useless if you need the landmarks for specific areas. Documentation for Mediapipe is a mess, and all searches lead to old outdated crap, and all their examples are rubbish too with each using different conflicting code. - Chlorinate
