Detect poses with ML Kit on Android

ML Kit provides two optimized SDKs for pose detection.

SDK name | pose-detection | pose-detection-accurate
Implementation | Code and assets are statically linked to your app at build time. | Code and assets are statically linked to your app at build time.
App size impact (including code and assets) | ~10.1 MB | ~13.3 MB
Performance | Pixel 3XL: ~30 FPS | Pixel 3XL: ~23 FPS with CPU, ~30 FPS with GPU

Before you begin

  1. In your project-level build.gradle file, make sure to include Google's Maven repository in both your buildscript and allprojects sections (a sketch of this setup follows the list).
  2. Add the dependencies for the ML Kit Android libraries to your module's app-level Gradle file, which is usually app/build.gradle:

    dependencies {
      // If you want to use the base sdk
      implementation 'com.google.mlkit:pose-detection:18.0.0-beta4'
      // If you want to use the accurate sdk
      implementation 'com.google.mlkit:pose-detection-accurate:18.0.0-beta4'
    }
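
A minimal sketch of the repository setup from step 1, assuming the classic top-level build.gradle layout (newer Gradle projects may declare repositories in settings.gradle instead):

buildscript {
  repositories {
    google()  // Google's Maven repository, which hosts the ML Kit artifacts
  }
}
allprojects {
  repositories {
    google()
  }
}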
    

1. Create an instance of PoseDetector

PoseDetector options

To detect a pose in an image, first create an instance of PoseDetector and optionally specify the detector settings.

Detection mode

PoseDetector operates in two detection modes. Be sure to choose the one that matches your use case.

STREAM_MODE (default)
The pose detector first detects the most prominent person in the image and then runs pose detection. In subsequent frames, the person-detection step is not conducted unless the person becomes occluded or is no longer detected with high confidence. The pose detector attempts to track the most prominent person and returns their pose in each inference. This reduces latency and smooths detection. Use this mode when you want to detect poses in a video stream.
SINGLE_IMAGE_MODE
The pose detector detects a person and then runs pose detection. The person-detection step runs for every image, so latency is higher, and there is no person tracking. Use this mode when running pose detection on static images or where tracking is not desired.

Hardware configurations

PoseDetector supports multiple hardware configurations for optimizing performance:

  • CPU: run the detector by using CPU only
  • CPU_GPU: run the detector by using both CPU and GPU

When building the detector options, you can use the setPreferredHardwareConfigs API to control the hardware selection. By default, all hardware configurations are set as preferred.

ML Kit takes the availability, stability, correctness, and latency of each configuration into consideration and picks the best one from the preferred configurations. If none of the preferred configurations is applicable, the CPU configuration is used automatically as a fallback. ML Kit performs these checks and the related preparation in a non-blocking way before enabling any acceleration, so it is most likely that the first time your user runs the detector, it will use CPU. After all the preparation has finished, the best configuration is used in the following runs.

Example usages of setPreferredHardwareConfigs (a sketch follows this list):

  • To let ML Kit pick the best configuration, do not call this API.
  • If you do not want to enable any acceleration, pass in only CPU.
  • If you want to use the GPU to offload the CPU, even if the GPU could be slower, pass in only CPU_GPU.
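
As an illustration, here is a minimal sketch of restricting the detector to CPU-only execution. The package locations and the CPU constant on PoseDetectorOptionsBase are assumptions based on recent versions of the pose-detection SDK:

import com.google.mlkit.vision.pose.PoseDetectorOptionsBase
import com.google.mlkit.vision.pose.defaults.PoseDetectorOptions

// Prefer CPU only: no GPU acceleration will be attempted.
val cpuOnlyOptions = PoseDetectorOptions.Builder()
    .setDetectorMode(PoseDetectorOptions.STREAM_MODE)
    .setPreferredHardwareConfigs(PoseDetectorOptionsBase.CPU)
    .build()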

Specify the pose detector options:

Kotlin

// Base pose detector with streaming frames, when depending on the pose-detection sdk
val options = PoseDetectorOptions.Builder()
    .setDetectorMode(PoseDetectorOptions.STREAM_MODE)
    .build()

// Accurate pose detector on static images, when depending on the pose-detection-accurate sdk
val options = AccuratePoseDetectorOptions.Builder()
    .setDetectorMode(AccuratePoseDetectorOptions.SINGLE_IMAGE_MODE)
    .build()

Java

// Base pose detector with streaming frames, when depending on the pose-detection sdk
PoseDetectorOptions options =
   new PoseDetectorOptions.Builder()
       .setDetectorMode(PoseDetectorOptions.STREAM_MODE)
       .build();

// Accurate pose detector on static images, when depending on the pose-detection-accurate sdk
AccuratePoseDetectorOptions options =
   new AccuratePoseDetectorOptions.Builder()
       .setDetectorMode(AccuratePoseDetectorOptions.SINGLE_IMAGE_MODE)
       .build();

Finally, create an instance of PoseDetector. Pass the options you specified:

Kotlin

val poseDetector = PoseDetection.getClient(options)

Java

PoseDetector poseDetector = PoseDetection.getClient(options);

2. Prepare the input image

To detect poses in an image, create an InputImage object from either a Bitmap, media.Image, ByteBuffer, byte array, or a file on the device. Then, pass the InputImage object to the PoseDetector.

For pose detection, you should use an image with dimensions of at least 480x360 pixels. If you are detecting poses in real time, capturing frames at this minimum resolution can help reduce latency.

You can create an InputImage object from different sources; each is explained below.

Using a media.Image

To create an InputImage object from a media.Image object, such as when you capture an image from a device's camera, pass the media.Image object and the image's rotation to InputImage.fromMediaImage().

If you use the CameraX library, the OnImageCapturedListener and ImageAnalysis.Analyzer classes calculate the rotation value for you.

Kotlin

private class YourImageAnalyzer : ImageAnalysis.Analyzer {

    override fun analyze(imageProxy: ImageProxy) {
        val mediaImage = imageProxy.image
        if (mediaImage != null) {
            val image = InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)
            // Pass image to an ML Kit Vision API
            // ...
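            // Note: call imageProxy.close() only after the detector finishes
            // with this frame (e.g. in an addOnCompleteListener); CameraX will
            // not deliver the next frame until the current one is closed.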
        }
    }
}

Java

private class YourAnalyzer implements ImageAnalysis.Analyzer {

    @Override
    public void analyze(ImageProxy imageProxy) {
        Image mediaImage = imageProxy.getImage();
        if (mediaImage != null) {
          InputImage image =
                InputImage.fromMediaImage(mediaImage, imageProxy.getImageInfo().getRotationDegrees());
          // Pass image to an ML Kit Vision API
          // ...
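          // Note: call imageProxy.close() only after the detector finishes
          // with this frame (e.g. in an addOnCompleteListener); CameraX will
          // not deliver the next frame until the current one is closed.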
        }
    }
}

If you don't use a camera library that gives you the image's rotation degree, you can calculate it from the device's rotation degree and the orientation of the camera sensor in the device:

Kotlin

private val ORIENTATIONS = SparseIntArray()

init {
    ORIENTATIONS.append(Surface.ROTATION_0, 0)
    ORIENTATIONS.append(Surface.ROTATION_90, 90)
    ORIENTATIONS.append(Surface.ROTATION_180, 180)
    ORIENTATIONS.append(Surface.ROTATION_270, 270)
}

/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
@Throws(CameraAccessException::class)
private fun getRotationCompensation(cameraId: String, activity: Activity, isFrontFacing: Boolean): Int {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    val deviceRotation = activity.windowManager.defaultDisplay.rotation
    var rotationCompensation = ORIENTATIONS.get(deviceRotation)

    // Get the device's sensor orientation.
    val cameraManager = activity.getSystemService(CAMERA_SERVICE) as CameraManager
    val sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION)!!

    if (isFrontFacing) {
        rotationCompensation = (sensorOrientation + rotationCompensation) % 360
    } else { // back-facing
        rotationCompensation = (sensorOrientation - rotationCompensation + 360) % 360
    }
    return rotationCompensation
}

Java

private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
static {
    ORIENTATIONS.append(Surface.ROTATION_0, 0);
    ORIENTATIONS.append(Surface.ROTATION_90, 90);
    ORIENTATIONS.append(Surface.ROTATION_180, 180);
    ORIENTATIONS.append(Surface.ROTATION_270, 270);
}

/**
 * Get the angle by which an image must be rotated given the device's current
 * orientation.
 */
@RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
private int getRotationCompensation(String cameraId, Activity activity, boolean isFrontFacing)
        throws CameraAccessException {
    // Get the device's current rotation relative to its "native" orientation.
    // Then, from the ORIENTATIONS table, look up the angle the image must be
    // rotated to compensate for the device's rotation.
    int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
    int rotationCompensation = ORIENTATIONS.get(deviceRotation);

    // Get the device's sensor orientation.
    CameraManager cameraManager = (CameraManager) activity.getSystemService(CAMERA_SERVICE);
    int sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION);

    if (isFrontFacing) {
        rotationCompensation = (sensorOrientation + rotationCompensation) % 360;
    } else { // back-facing
        rotationCompensation = (sensorOrientation - rotationCompensation + 360) % 360;
    }
    return rotationCompensation;
}

Then, pass the media.Image object and the rotation degree value to InputImage.fromMediaImage():

Kotlin

val image = InputImage.fromMediaImage(mediaImage, rotation)

Java

InputImage image = InputImage.fromMediaImage(mediaImage, rotation);

Using a file URI

To create an InputImage object from a file URI, pass the app context and file URI to InputImage.fromFilePath(). This is useful when you use an ACTION_GET_CONTENT intent to prompt the user to select an image from their gallery app (a sketch of this picker flow follows the snippets below).

Kotlin

val image: InputImage
try {
    image = InputImage.fromFilePath(context, uri)
} catch (e: IOException) {
    e.printStackTrace()
}

Java

InputImage image;
try {
    image = InputImage.fromFilePath(context, uri);
} catch (IOException e) {
    e.printStackTrace();
}
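
For context, here is a minimal sketch of that picker flow. The request code and helper function are illustrative, not part of the ML Kit API:

import android.app.Activity
import android.content.Intent

const val REQUEST_PICK_IMAGE = 1  // illustrative request code

// Launch the system picker so the user can choose an image.
fun pickImage(activity: Activity) {
    val intent = Intent(Intent.ACTION_GET_CONTENT).apply { type = "image/*" }
    activity.startActivityForResult(intent, REQUEST_PICK_IMAGE)
}

// In onActivityResult, the returned intent's data is the Uri to pass to
// InputImage.fromFilePath(context, uri).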

Using a ByteBuffer or ByteArray

To create an InputImage object from a ByteBuffer or a ByteArray, first calculate the image rotation degree as previously described for media.Image input. Then, create the InputImage object with the buffer or array, together with the image's height, width, color encoding format, and rotation degree:

Kotlin

val image = InputImage.fromByteBuffer(
        byteBuffer,
        /* image width */ 480,
        /* image height */ 360,
        rotationDegrees,
        InputImage.IMAGE_FORMAT_NV21 // or IMAGE_FORMAT_YV12
)
// Or:
val image = InputImage.fromByteArray(
        byteArray,
        /* image width */ 480,
        /* image height */ 360,
        rotationDegrees,
        InputImage.IMAGE_FORMAT_NV21 // or IMAGE_FORMAT_YV12
)

Java

InputImage image = InputImage.fromByteBuffer(byteBuffer,
        /* image width */ 480,
        /* image height */ 360,
        rotationDegrees,
        InputImage.IMAGE_FORMAT_NV21 // or IMAGE_FORMAT_YV12
);
// Or:
InputImage image = InputImage.fromByteArray(
        byteArray,
        /* image width */480,
        /* image height */360,
        rotationDegrees,
        InputImage.IMAGE_FORMAT_NV21 // or IMAGE_FORMAT_YV12
);

Using a Bitmap

To create an InputImage object from a Bitmap object, make the following declaration:

Kotlin

val image = InputImage.fromBitmap(bitmap, 0)

Java

InputImage image = InputImage.fromBitmap(bitmap, rotationDegree);

The image is represented by a Bitmap object together with rotation degrees.
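
For instance, here is a hedged sketch of building an InputImage from a JPEG on disk, deriving the rotation from EXIF metadata. The file path and helper name are illustrative, and it assumes the androidx.exifinterface library is on the classpath:

import android.graphics.BitmapFactory
import androidx.exifinterface.media.ExifInterface
import com.google.mlkit.vision.common.InputImage

// Decode a JPEG and pass its EXIF-derived rotation so ML Kit sees it upright.
fun inputImageFromJpeg(path: String): InputImage {
    val bitmap = BitmapFactory.decodeFile(path)
    val orientation = ExifInterface(path).getAttributeInt(
        ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL)
    val rotationDegrees = when (orientation) {
        ExifInterface.ORIENTATION_ROTATE_90 -> 90
        ExifInterface.ORIENTATION_ROTATE_180 -> 180
        ExifInterface.ORIENTATION_ROTATE_270 -> 270
        else -> 0
    }
    return InputImage.fromBitmap(bitmap, rotationDegrees)
}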

3. Process the image

Pass your prepared InputImage object to the PoseDetector's process method.

Kotlin

val result = poseDetector.process(image)
       .addOnSuccessListener { results ->
           // Task completed successfully
           // ...
       }
       .addOnFailureListener { e ->
           // Task failed with an exception
           // ...
       }

Java

Task<Pose> result =
        poseDetector.process(image)
                .addOnSuccessListener(
                        new OnSuccessListener<Pose>() {
                            @Override
                            public void onSuccess(Pose pose) {
                                // Task completed successfully
                                // ...
                            }
                        })
                .addOnFailureListener(
                        new OnFailureListener() {
                            @Override
                            public void onFailure(@NonNull Exception e) {
                                // Task failed with an exception
                                // ...
                            }
                        });

4. Get information about the detected pose

If a person was detected in the image, the pose detection API returns a Pose object with 33 PoseLandmarks.

If the person was not completely inside the image, the model assigns the missing landmarks coordinates outside the frame and gives them low InFrameConfidence values.

If no person was detected in the frame, the Pose object contains no PoseLandmarks.
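
Before the full landmark listings below, here is a short sketch of checking a single landmark's in-frame confidence, assuming a Pose result in scope and the inFrameLikelihood accessor exposed by the SDK:

val nose = pose.getPoseLandmark(PoseLandmark.NOSE)
if (nose != null) {
    val position = nose.position             // 2D point in image coordinates
    val confidence = nose.inFrameLikelihood  // roughly [0, 1]; low outside the frame
    if (confidence > 0.5f) {
        // The landmark is likely inside the frame; safe to draw or measure.
    }
}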

Kotlin

// Get all PoseLandmarks. If no person was detected, the list will be empty
val allPoseLandmarks = pose.getAllPoseLandmarks()

// Or get specific PoseLandmarks individually. These will all be null if no person
// was detected
val leftShoulder = pose.getPoseLandmark(PoseLandmark.LEFT_SHOULDER)
val rightShoulder = pose.getPoseLandmark(PoseLandmark.RIGHT_SHOULDER)
val leftElbow = pose.getPoseLandmark(PoseLandmark.LEFT_ELBOW)
val rightElbow = pose.getPoseLandmark(PoseLandmark.RIGHT_ELBOW)
val leftWrist = pose.getPoseLandmark(PoseLandmark.LEFT_WRIST)
val rightWrist = pose.getPoseLandmark(PoseLandmark.RIGHT_WRIST)
val leftHip = pose.getPoseLandmark(PoseLandmark.LEFT_HIP)
val rightHip = pose.getPoseLandmark(PoseLandmark.RIGHT_HIP)
val leftKnee = pose.getPoseLandmark(PoseLandmark.LEFT_KNEE)
val rightKnee = pose.getPoseLandmark(PoseLandmark.RIGHT_KNEE)
val leftAnkle = pose.getPoseLandmark(PoseLandmark.LEFT_ANKLE)
val rightAnkle = pose.getPoseLandmark(PoseLandmark.RIGHT_ANKLE)
val leftPinky = pose.getPoseLandmark(PoseLandmark.LEFT_PINKY)
val rightPinky = pose.getPoseLandmark(PoseLandmark.RIGHT_PINKY)
val leftIndex = pose.getPoseLandmark(PoseLandmark.LEFT_INDEX)
val rightIndex = pose.getPoseLandmark(PoseLandmark.RIGHT_INDEX)
val leftThumb = pose.getPoseLandmark(PoseLandmark.LEFT_THUMB)
val rightThumb = pose.getPoseLandmark(PoseLandmark.RIGHT_THUMB)
val leftHeel = pose.getPoseLandmark(PoseLandmark.LEFT_HEEL)
val rightHeel = pose.getPoseLandmark(PoseLandmark.RIGHT_HEEL)
val leftFootIndex = pose.getPoseLandmark(PoseLandmark.LEFT_FOOT_INDEX)
val rightFootIndex = pose.getPoseLandmark(PoseLandmark.RIGHT_FOOT_INDEX)
val nose = pose.getPoseLandmark(PoseLandmark.NOSE)
val leftEyeInner = pose.getPoseLandmark(PoseLandmark.LEFT_EYE_INNER)
val leftEye = pose.getPoseLandmark(PoseLandmark.LEFT_EYE)
val leftEyeOuter = pose.getPoseLandmark(PoseLandmark.LEFT_EYE_OUTER)
val rightEyeInner = pose.getPoseLandmark(PoseLandmark.RIGHT_EYE_INNER)
val rightEye = pose.getPoseLandmark(PoseLandmark.RIGHT_EYE)
val rightEyeOuter = pose.getPoseLandmark(PoseLandmark.RIGHT_EYE_OUTER)
val leftEar = pose.getPoseLandmark(PoseLandmark.LEFT_EAR)
val rightEar = pose.getPoseLandmark(PoseLandmark.RIGHT_EAR)
val leftMouth = pose.getPoseLandmark(PoseLandmark.LEFT_MOUTH)
val rightMouth = pose.getPoseLandmark(PoseLandmark.RIGHT_MOUTH)

Java

// Get all PoseLandmarks. If no person was detected, the list will be empty
List<PoseLandmark> allPoseLandmarks = pose.getAllPoseLandmarks();

// Or get specific PoseLandmarks individually. These will all be null if no person
// was detected
PoseLandmark leftShoulder = pose.getPoseLandmark(PoseLandmark.LEFT_SHOULDER);
PoseLandmark rightShoulder = pose.getPoseLandmark(PoseLandmark.RIGHT_SHOULDER);
PoseLandmark leftElbow = pose.getPoseLandmark(PoseLandmark.LEFT_ELBOW);
PoseLandmark rightElbow = pose.getPoseLandmark(PoseLandmark.RIGHT_ELBOW);
PoseLandmark leftWrist = pose.getPoseLandmark(PoseLandmark.LEFT_WRIST);
PoseLandmark rightWrist = pose.getPoseLandmark(PoseLandmark.RIGHT_WRIST);
PoseLandmark leftHip = pose.getPoseLandmark(PoseLandmark.LEFT_HIP);
PoseLandmark rightHip = pose.getPoseLandmark(PoseLandmark.RIGHT_HIP);
PoseLandmark leftKnee = pose.getPoseLandmark(PoseLandmark.LEFT_KNEE);
PoseLandmark rightKnee = pose.getPoseLandmark(PoseLandmark.RIGHT_KNEE);
PoseLandmark leftAnkle = pose.getPoseLandmark(PoseLandmark.LEFT_ANKLE);
PoseLandmark rightAnkle = pose.getPoseLandmark(PoseLandmark.RIGHT_ANKLE);
PoseLandmark leftPinky = pose.getPoseLandmark(PoseLandmark.LEFT_PINKY);
PoseLandmark rightPinky = pose.getPoseLandmark(PoseLandmark.RIGHT_PINKY);
PoseLandmark leftIndex = pose.getPoseLandmark(PoseLandmark.LEFT_INDEX);
PoseLandmark rightIndex = pose.getPoseLandmark(PoseLandmark.RIGHT_INDEX);
PoseLandmark leftThumb = pose.getPoseLandmark(PoseLandmark.LEFT_THUMB);
PoseLandmark rightThumb = pose.getPoseLandmark(PoseLandmark.RIGHT_THUMB);
PoseLandmark leftHeel = pose.getPoseLandmark(PoseLandmark.LEFT_HEEL);
PoseLandmark rightHeel = pose.getPoseLandmark(PoseLandmark.RIGHT_HEEL);
PoseLandmark leftFootIndex = pose.getPoseLandmark(PoseLandmark.LEFT_FOOT_INDEX);
PoseLandmark rightFootIndex = pose.getPoseLandmark(PoseLandmark.RIGHT_FOOT_INDEX);
PoseLandmark nose = pose.getPoseLandmark(PoseLandmark.NOSE);
PoseLandmark leftEyeInner = pose.getPoseLandmark(PoseLandmark.LEFT_EYE_INNER);
PoseLandmark leftEye = pose.getPoseLandmark(PoseLandmark.LEFT_EYE);
PoseLandmark leftEyeOuter = pose.getPoseLandmark(PoseLandmark.LEFT_EYE_OUTER);
PoseLandmark rightEyeInner = pose.getPoseLandmark(PoseLandmark.RIGHT_EYE_INNER);
PoseLandmark rightEye = pose.getPoseLandmark(PoseLandmark.RIGHT_EYE);
PoseLandmark rightEyeOuter = pose.getPoseLandmark(PoseLandmark.RIGHT_EYE_OUTER);
PoseLandmark leftEar = pose.getPoseLandmark(PoseLandmark.LEFT_EAR);
PoseLandmark rightEar = pose.getPoseLandmark(PoseLandmark.RIGHT_EAR);
PoseLandmark leftMouth = pose.getPoseLandmark(PoseLandmark.LEFT_MOUTH);
PoseLandmark rightMouth = pose.getPoseLandmark(PoseLandmark.RIGHT_MOUTH);
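
As an example of putting these landmarks to work, here is a sketch of computing the angle at a joint (for instance, the elbow from shoulder, elbow, and wrist landmarks), along the lines of the approach described in the pose classification tips:

import kotlin.math.abs
import kotlin.math.atan2

// Returns the angle at midPoint, in degrees within [0, 180].
fun getAngle(firstPoint: PoseLandmark, midPoint: PoseLandmark, lastPoint: PoseLandmark): Double {
    var result = Math.toDegrees(
        (atan2(lastPoint.position.y - midPoint.position.y,
               lastPoint.position.x - midPoint.position.x) -
         atan2(firstPoint.position.y - midPoint.position.y,
               firstPoint.position.x - midPoint.position.x)).toDouble())
    result = abs(result)
    if (result > 180) {
        result = 360.0 - result  // keep the smaller of the two possible angles
    }
    return result
}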

Tips to improve performance

The quality of your results depends on the quality of the input image:

  • For ML Kit to accurately detect poses, the person in the image should be represented by enough pixel data; for best performance, the subject should be at least 256x256 pixels.
  • If you detect poses in a real-time application, you may also want to consider the overall dimensions of the input images. Smaller images can be processed faster, so to reduce latency, capture images at lower resolutions, but keep the above resolution requirements in mind and ensure that the subject occupies as much of the image as possible.
  • Poor image focus can also impact accuracy. If you don't get acceptable results, ask the user to recapture the image.

If you want to use pose detection in a real-time application, follow these guidelines to achieve the best frame rates:

  • Use the base pose-detection SDK and STREAM_MODE.
  • Consider capturing images at a lower resolution. However, also keep this API's image dimension requirements in mind.
  • If you use the Camera or camera2 API, throttle calls to the detector: if a new video frame becomes available while the detector is running, drop the frame (a sketch of this pattern follows this list). See the VisionProcessorBase class in the quickstart sample app for an example.
  • If you use the CameraX API, be sure the backpressure strategy is set to its default value, ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST. This guarantees that only one image will be delivered for analysis at a time. If more images are produced while the analyzer is busy, they will be dropped automatically and not queued for delivery. Once the image being analyzed is closed by calling ImageProxy.close(), the next latest image will be delivered.
  • If you use the output of the detector to overlay graphics on the input image, first get the result from ML Kit, then render the image and the overlay in a single step. This renders to the display surface only once for each input frame. See the CameraSourcePreview and GraphicOverlay classes in the quickstart sample app for an example.
  • If you use the Camera2 API, capture images in ImageFormat.YUV_420_888 format. If you use the older Camera API, capture images in ImageFormat.NV21 format.
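
Here is a minimal sketch of the frame-dropping pattern from the Camera/camera2 guideline above; it is a simplified stand-in for the quickstart's VisionProcessorBase, not the actual class:

import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.pose.PoseDetector
import java.util.concurrent.atomic.AtomicBoolean

// Process a frame only when the previous one has finished; otherwise drop it.
class ThrottledPoseProcessor(private val detector: PoseDetector) {
    private val isProcessing = AtomicBoolean(false)

    fun onFrame(image: InputImage) {
        // Drop this frame if the detector is still busy with the last one.
        if (!isProcessing.compareAndSet(false, true)) return
        detector.process(image)
            .addOnSuccessListener { pose -> /* use the pose */ }
            .addOnFailureListener { e -> /* handle the error */ }
            .addOnCompleteListener { isProcessing.set(false) }
    }
}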

What's next

  • To learn how to use pose landmarks to classify poses, see the pose classification tips.