Subject Segmentation

ML Kit's subject segmentation API allows developers to easily separate multiple subjects from the background in a picture, enabling use cases such as sticker creation, background swap, or adding cool effects to subjects.

Subjects are defined as the most prominent people, pets, or objects in the foreground of the image. If two subjects are very close to or touching each other, they are treated as a single subject.

The subject segmentation API takes an input image and generates an output mask or bitmap for the foreground. It also provides a mask and bitmap for each one of the subjects detected (the foreground is equal to all subjects combined).
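The relationship "foreground equals all subjects combined" can be sketched in plain Java. This is an illustration only, not ML Kit code: `SubjectMask` is a hypothetical container, its `startX`/`startY` offset fields are assumptions about how a subject's mask is positioned inside the input image, and merging with a per-pixel maximum is one plausible choice rather than a documented description of how the API builds its foreground mask.

```java
import java.util.Arrays;
import java.util.List;

final class ForegroundFromSubjects {

    /** Hypothetical container: one subject's mask and its top-left offset in the input image. */
    static final class SubjectMask {
        final int startX;
        final int startY;
        final int width;
        final int height;
        final float[] confidences; // row-major, width * height values in [0.0, 1.0]

        SubjectMask(int startX, int startY, int width, int height, float[] confidences) {
            this.startX = startX;
            this.startY = startY;
            this.width = width;
            this.height = height;
            this.confidences = confidences;
        }
    }

    /**
     * Merges per-subject masks into one image-sized mask.
     * Assumption: where subjects overlap, the higher confidence is kept.
     */
    public static float[] combine(List<SubjectMask> subjects, int imageWidth, int imageHeight) {
        float[] foreground = new float[imageWidth * imageHeight]; // starts as all 0.0 (background)
        for (SubjectMask s : subjects) {
            for (int y = 0; y < s.height; y++) {
                for (int x = 0; x < s.width; x++) {
                    int target = (s.startY + y) * imageWidth + (s.startX + x);
                    float confidence = s.confidences[y * s.width + x];
                    foreground[target] = Math.max(foreground[target], confidence);
                }
            }
        }
        return foreground;
    }

    public static void main(String[] args) {
        // A 3x2 image with two small subjects placed at different offsets.
        SubjectMask a = new SubjectMask(0, 0, 1, 2, new float[] {0.9f, 0.8f});
        SubjectMask b = new SubjectMask(2, 1, 1, 1, new float[] {0.7f});
        float[] foreground = combine(Arrays.asList(a, b), 3, 2);
        System.out.println(Arrays.toString(foreground));
    }
}
```

Pixels not covered by any subject stay at 0.0, so the merged mask has the same dimensions as the input image.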

By default, the foreground mask and foreground bitmap are the same size as the input image (the size of each individual subject's mask and bitmap will likely differ from input image size). Each pixel of the mask is assigned a float number that has a range between 0.0 and 1.0. The closer the number is to 1.0, the higher the confidence that the pixel represents a subject, and vice versa.
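As a minimal sketch of how the per-pixel confidence values might be consumed, the plain-Java example below turns a mask into ARGB pixels by thresholding. It does not call ML Kit: the `FloatBuffer` input, the row-major layout, the 0.5 threshold, and the names `MaskThreshold` and `maskToArgb` are all assumptions chosen for illustration.

```java
import java.nio.FloatBuffer;

final class MaskThreshold {

    /**
     * Converts a confidence mask into ARGB pixels.
     * Pixels whose confidence is at or above the threshold receive highlightArgb;
     * all other pixels become fully transparent.
     * Assumption: the mask is row-major and holds width * height floats.
     */
    public static int[] maskToArgb(FloatBuffer mask, int width, int height,
                                   float threshold, int highlightArgb) {
        int[] pixels = new int[width * height];
        for (int i = 0; i < pixels.length; i++) {
            float confidence = mask.get(i); // absolute read; buffer position is unchanged
            pixels[i] = confidence >= threshold ? highlightArgb : 0x00000000;
        }
        return pixels;
    }

    public static void main(String[] args) {
        // A 2x2 mask: two confident subject pixels, two likely background pixels.
        FloatBuffer mask = FloatBuffer.wrap(new float[] {0.95f, 0.10f, 0.60f, 0.40f});
        int[] pixels = maskToArgb(mask, 2, 2, 0.5f, 0x80FF00FF);
        for (int pixel : pixels) {
            System.out.println(Integer.toHexString(pixel));
        }
    }
}
```

A hard threshold gives crisp edges; using the confidence directly as an alpha value instead would give softer edges, at the cost of semi-transparent fringes around each subject.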

The average latency measured on a Pixel 7 Pro is around 200 ms. The API currently supports static images only.


Key capabilities

  • Multi-subject segmentation: provides masks and bitmaps for each individual subject, rather than a single mask and bitmap for all subjects combined.
  • Subject recognition: subjects recognized are objects, pets, and humans.
  • On-device processing: all processing is performed on the device, preserving user privacy and requiring no network connectivity.

Example results

Figure: an input image alongside the output image with its mask.