Face mesh detection

This API is currently in beta and subject to change.

With ML Kit's face mesh detection API, you can generate a high-accuracy mesh of 468 3D points for selfie-like images in real time. Faces should be within ~2 meters (~7 feet) of the camera.

If you want to detect faces farther than ~2 meters (~7 feet) from the camera, see ML Kit's face detection SDK.

Here are some of the terms used regarding the face mesh detection feature:

The bounding box is a rectangular area for a detected face.
Face mesh info is a group of 468 3D points and edges that can be used to draw the geometry mesh for a detected face.

The face mesh detection API generates a face mesh for each detected face, containing 468 3D points and edges. With face mesh detection, you can perform more accurate operations on faces in real time, for use cases such as AR filters, selfie capture, and video chat.
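To make the terms above concrete, here is a minimal sketch of how a bounding box and a mesh triangle relate to a set of 3D points. The `Point3D` class and both helper methods are hypothetical stand-ins written for illustration; they are not the ML Kit classes, and the bounding box the API reports is not necessarily computed this way.

```java
import java.util.Arrays;
import java.util.List;

final class FaceMeshSketch {
    // Hypothetical stand-in for one mesh point; NOT the ML Kit class.
    static final class Point3D {
        final float x;
        final float y;
        final float z;

        Point3D(float x, float y, float z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    // Smallest axis-aligned rectangle {left, top, right, bottom} that
    // encloses the x/y coordinates of the given points (z is ignored).
    static float[] enclosingBox(List<Point3D> points) {
        if (points.isEmpty()) {
            throw new IllegalArgumentException("at least one point is required");
        }
        float left = Float.POSITIVE_INFINITY;
        float top = Float.POSITIVE_INFINITY;
        float right = Float.NEGATIVE_INFINITY;
        float bottom = Float.NEGATIVE_INFINITY;
        for (Point3D p : points) {
            left = Math.min(left, p.x);
            top = Math.min(top, p.y);
            right = Math.max(right, p.x);
            bottom = Math.max(bottom, p.y);
        }
        return new float[] {left, top, right, bottom};
    }

    // Area of one triangle after projecting its corners onto the image plane.
    static float projectedTriangleArea(Point3D a, Point3D b, Point3D c) {
        float cross = (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
        return Math.abs(cross) / 2f;
    }

    public static void main(String[] args) {
        // Three made-up points standing in for the corners of one triangle.
        List<Point3D> corners = Arrays.asList(
                new Point3D(0f, 0f, 0f),
                new Point3D(4f, 0f, 1f),
                new Point3D(0f, 3f, 2f));
        System.out.println(Arrays.toString(enclosingBox(corners)));
        System.out.println(
                projectedTriangleArea(corners.get(0), corners.get(1), corners.get(2)));
    }
}
```

A real mesh would supply 468 points per face rather than the three used here; the same loop applies unchanged.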

Android

Key capabilities

Recognize and locate faces: Get the bounding box for detected faces in a selfie-like picture.
Get face mesh information: Get the 468 3D points and triangle info for each detected face.
Process video frames in real time: Face mesh detection is performed on-device and is fast enough for real-time applications, such as video manipulation.
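As a rough way to reason about "fast enough for real-time", the sketch below converts a per-frame latency into a frame-rate ceiling and into the time left over in each frame interval. The latency figures are the approximate Pixel 3 numbers from the comparison table on this page; the class and method names are hypothetical, and a real pipeline also spends time on camera capture and rendering, which this arithmetic ignores.

```java
final class FrameBudgetSketch {
    // Upper bound on frames per second if detection were the only per-frame cost.
    static double maxFramesPerSecond(double latencyMs) {
        return 1000.0 / latencyMs;
    }

    // Milliseconds left in each frame interval after detection has run.
    // A negative result means detection alone overruns the frame interval.
    static double headroomMs(double targetFps, double latencyMs) {
        return 1000.0 / targetFps - latencyMs;
    }

    public static void main(String[] args) {
        double meshLatencyMs = 14.0;      // ~14 ms on Pixel 3 (face mesh detection)
        double detectionLatencyMs = 60.0; // ~60 ms on Pixel 3 (face detection, fast mode)
        double targetFps = 30.0;          // assumed target frame rate for illustration

        System.out.printf("face mesh: ceiling %.1f fps, headroom %.1f ms%n",
                maxFramesPerSecond(meshLatencyMs), headroomMs(targetFps, meshLatencyMs));
        System.out.printf("face detection: ceiling %.1f fps, headroom %.1f ms%n",
                maxFramesPerSecond(detectionLatencyMs),
                headroomMs(targetFps, detectionLatencyMs));
    }
}
```

At an assumed 30 fps target, a ~14 ms detector leaves part of each ~33 ms frame interval free for other work, while a ~60 ms detector cannot keep up on every frame.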

Example results

[Example images not shown: input, output in "Bounding box only" mode, and output in "Face mesh" mode]

Comparison with ML Kit face detection SDK

	Face mesh detection API	Face Detection API
Recommended use cases (examples)	Generate AR effects on faces in streaming video; real-time face detection in selfie-like pictures (faces within ~2 meters)	Count the faces present in a picture; detect faces far from the camera
Latency	Low (~14 ms on Pixel 3); recommended for real-time use	Medium (~60 ms on Pixel 3 with fast mode on)
Recommended input	Faces captured within ~2 meters (~7 feet)	Any picture with faces
Face points output	For each face, 468 3D points and triangle info when "face mesh" mode is enabled	For each face, 133 2D points when "face contour" mode is enabled
Number of faces recognized	"Bounding box only" mode: one or more bounding boxes, as long as faces are close to the camera (within ~2 meters or ~7 feet); "face mesh" mode: at most 2 bounding boxes and meshes, as long as faces are close to the camera (within ~2 meters)	"Bounding box" mode: one or more; faces can be far from the camera, but each face must be at least 100x100 pixels; face contours: at most 1, as long as faces are close to the camera
Tracking id	No	Yes
Face orientation	No	Yes
Face classification (e.g. smiling)	No	Yes
Implementation options	Bundled only	Bundled or unbundled
App size	Bundled: ~6.4 MB; unbundled: not available yet	Bundled: ~6.9 MB; unbundled: ~0.6 MB
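
The comparison above can be read as a small decision rule. The sketch below encodes one possible reading of it; the class, enum, and parameter names are hypothetical, and the rule only covers the rows for mesh output, distance, tracking, orientation, classification, and latency. It is an illustration of the table, not an official recommendation.

```java
final class ApiChoiceSketch {
    enum FaceApi { FACE_MESH_DETECTION, FACE_DETECTION, NO_SINGLE_FIT }

    static FaceApi choose(boolean needs3dMesh,
                          boolean facesBeyondTwoMeters,
                          boolean needsTrackingOrientationOrClassification,
                          boolean realTime) {
        // Per the table, face mesh detection expects faces within ~2 meters and
        // offers no tracking id, face orientation, or face classification.
        boolean meshRuledOut =
                facesBeyondTwoMeters || needsTrackingOrientationOrClassification;

        if (needs3dMesh) {
            // Only face mesh detection outputs the 468 3D points.
            return meshRuledOut ? FaceApi.NO_SINGLE_FIT : FaceApi.FACE_MESH_DETECTION;
        }
        if (meshRuledOut) {
            return FaceApi.FACE_DETECTION;
        }
        // Both fit: prefer the lower-latency option for real-time use,
        // otherwise the one that accepts any picture with faces.
        return realTime ? FaceApi.FACE_MESH_DETECTION : FaceApi.FACE_DETECTION;
    }

    public static void main(String[] args) {
        // AR filter on a selfie video stream.
        System.out.println(choose(true, false, false, true));
        // Counting distant faces in a still photo.
        System.out.println(choose(false, true, false, false));
        // A 3D mesh plus smile classification cannot come from one API.
        System.out.println(choose(true, false, true, true));
    }
}
```

When the rule returns `NO_SINGLE_FIT`, the requirements span both columns of the table, so an app would have to relax a requirement or combine the two APIs.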