Detect poses with ML Kit on iOS

ML Kit provides two optimized SDKs for pose detection.

SDK name	PoseDetection	PoseDetectionAccurate
Implementation	Assets for the base detector are statically linked to your app at build time.	Assets for the accurate detector are statically linked to your app at build time.
App size impact	Up to 29.6MB	Up to 33.2MB
Performance	iPhone X: ~45FPS	iPhone X: ~29FPS

Try it out

Play around with the sample app to see an example usage of this API.

Before you begin

Include the following ML Kit pods in your Podfile:

# If you want to use the base implementation:
pod 'GoogleMLKit/PoseDetection', '8.0.0'

# If you want to use the accurate implementation:
pod 'GoogleMLKit/PoseDetectionAccurate', '8.0.0'

After you install or update your project's pods, open your Xcode project using its xcworkspace. ML Kit is supported in Xcode version 13.2.1 or higher.

1. Create an instance of `PoseDetector`

To detect a pose in an image, first create an instance of PoseDetector and optionally specify the detector settings.

`PoseDetector` options

Detection mode

The PoseDetector operates in two detection modes. Choose the one that matches your use case.

stream (default): The pose detector first detects the most prominent person in the image and then runs pose detection. In subsequent frames, the person-detection step is skipped unless the person becomes obscured or is no longer detected with high confidence. The pose detector attempts to track the most prominent person and returns their pose in each inference. This reduces latency and smooths detection. Use this mode when you want to detect a pose in a video stream.
singleImage: The pose detector detects a person and then runs pose detection. The person-detection step runs for every image, so latency is higher and there is no person tracking. Use this mode when you run pose detection on static images or when tracking is not needed.

Specify the pose detector options:

Swift

// Base pose detector with streaming, when depending on the PoseDetection SDK
let options = PoseDetectorOptions()
options.detectorMode = .stream

// Accurate pose detector on static images, when depending on the
// PoseDetectionAccurate SDK
let options = AccuratePoseDetectorOptions()
options.detectorMode = .singleImage

Objective-C

// Base pose detector with streaming, when depending on the PoseDetection SDK
MLKPoseDetectorOptions *options = [[MLKPoseDetectorOptions alloc] init];
options.detectorMode = MLKPoseDetectorModeStream;

// Accurate pose detector on static images, when depending on the
// PoseDetectionAccurate SDK
MLKAccuratePoseDetectorOptions *options =
    [[MLKAccuratePoseDetectorOptions alloc] init];
options.detectorMode = MLKPoseDetectorModeSingleImage;

Finally, get an instance of PoseDetector. Pass the options you specified:

Swift

let poseDetector = PoseDetector.poseDetector(options: options)

Objective-C

MLKPoseDetector *poseDetector =
    [MLKPoseDetector poseDetectorWithOptions:options];

2. Prepare the input image

To detect poses, do the following for each image or frame of video. If you enabled stream mode, you must create VisionImage objects from CMSampleBuffer objects.

Create a VisionImage object using a UIImage or a CMSampleBuffer.

If you use a UIImage, follow these steps:

Create a VisionImage object with the UIImage. Make sure to specify the correct .orientation.

Swift

// `image` is the UIImage to analyze.
let visionImage = VisionImage(image: image)
visionImage.orientation = image.imageOrientation

Objective-C

MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
visionImage.orientation = image.imageOrientation;

If you use a CMSampleBuffer, follow these steps:

Specify the orientation of the image data contained in the CMSampleBuffer.

To get the image orientation:

Swift

func imageOrientation(
  deviceOrientation: UIDeviceOrientation,
  cameraPosition: AVCaptureDevice.Position
) -> UIImage.Orientation {
  switch deviceOrientation {
  case .portrait:
    return cameraPosition == .front ? .leftMirrored : .right
  case .landscapeLeft:
    return cameraPosition == .front ? .downMirrored : .up
  case .portraitUpsideDown:
    return cameraPosition == .front ? .rightMirrored : .left
  case .landscapeRight:
    return cameraPosition == .front ? .upMirrored : .down
  case .faceDown, .faceUp, .unknown:
    return .up
  }
}

Objective-C

- (UIImageOrientation)
  imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation
                         cameraPosition:(AVCaptureDevicePosition)cameraPosition {
  switch (deviceOrientation) {
    case UIDeviceOrientationPortrait:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationLeftMirrored
                                                            : UIImageOrientationRight;

    case UIDeviceOrientationLandscapeLeft:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationDownMirrored
                                                            : UIImageOrientationUp;
    case UIDeviceOrientationPortraitUpsideDown:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationRightMirrored
                                                            : UIImageOrientationLeft;
    case UIDeviceOrientationLandscapeRight:
      return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationUpMirrored
                                                            : UIImageOrientationDown;
    case UIDeviceOrientationUnknown:
    case UIDeviceOrientationFaceUp:
    case UIDeviceOrientationFaceDown:
      return UIImageOrientationUp;
  }
}

Create a VisionImage object using the CMSampleBuffer object and the orientation:

Swift

let image = VisionImage(buffer: sampleBuffer)
image.orientation = imageOrientation(
  deviceOrientation: UIDevice.current.orientation,
  cameraPosition: cameraPosition)

Objective-C

MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:sampleBuffer];
image.orientation =
    [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
                                 cameraPosition:cameraPosition];

3. Process the image

Pass the VisionImage to one of the pose detector's image processing methods. You can use either the asynchronous process(image:) method or the synchronous results(in:) method.

To detect poses synchronously:

Swift

// results(in:) returns a non-optional array, so no optional binding is needed.
let detectedPoses: [Pose]
do {
  detectedPoses = try poseDetector.results(in: image)
} catch let error {
  print("Failed to detect pose with error: \(error.localizedDescription).")
  return
}
guard !detectedPoses.isEmpty else {
  print("Pose detector returned no results.")
  return
}

// Success. Get pose landmarks here.

Objective-C

NSError *error;
NSArray *poses = [poseDetector resultsInImage:image error:&error];
if (error != nil) {
  // Error.
  return;
}
if (poses.count == 0) {
  // No pose detected.
  return;
}

// Success. Get pose landmarks here.

To detect poses asynchronously:

Swift

poseDetector.process(image) { detectedPoses, error in
  guard error == nil else {
    // Error.
    return
  }
  guard let detectedPoses = detectedPoses, !detectedPoses.isEmpty else {
    // No pose detected.
    return
  }

  // Success. Get pose landmarks here.
}

Objective-C

[poseDetector processImage:image
                completion:^(NSArray * _Nullable poses,
                             NSError * _Nullable error) {
                    if (error != nil) {
                      // Error.
                      return;
                    }
                    if (poses.count == 0) {
                      // No pose detected.
                      return;
                    }

                    // Success. Get pose landmarks here.
                  }];

4. Get information about the detected pose

If a person is detected in the image, the pose detection API either passes an array of Pose objects to the completion handler or returns the array, depending on whether you called the asynchronous or the synchronous method.

If the person was not completely inside the image, the model assigns the missing landmarks coordinates outside the frame and gives them low inFrameLikelihood values.

If no person was detected, the array is empty.

Swift

for pose in detectedPoses {
  let leftAnkleLandmark = pose.landmark(ofType: .leftAnkle)
  if leftAnkleLandmark.inFrameLikelihood > 0.5 {
    let position = leftAnkleLandmark.position
  }
}

Objective-C

for (MLKPose *pose in poses) {
  MLKPoseLandmark *leftAnkleLandmark =
      [pose landmarkOfType:MLKPoseLandmarkTypeLeftAnkle];
  if (leftAnkleLandmark.inFrameLikelihood > 0.5) {
    MLKVision3DPoint *position = leftAnkleLandmark.position;
  }
}
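
The same likelihood check can be applied across a whole pose rather than a single landmark. The following is a minimal sketch and is not part of the ML Kit API: `LandmarkSample` and `confidentLandmarks` are hypothetical stand-ins that mirror the `position` and `inFrameLikelihood` fields used above, so the snippet runs without the SDK. The 0.5 threshold is carried over from the example above and is an assumption, not a recommended value.

```swift
import Foundation

// Hypothetical stand-in for the fields read from an ML Kit pose landmark.
// It is not an ML Kit type; it only mirrors `position` and `inFrameLikelihood`.
struct LandmarkSample {
  let name: String
  let x: Double
  let y: Double
  let inFrameLikelihood: Double
}

// Keeps only the landmarks whose in-frame likelihood exceeds the threshold.
func confidentLandmarks(
  _ landmarks: [LandmarkSample],
  threshold: Double = 0.5
) -> [LandmarkSample] {
  return landmarks.filter { $0.inFrameLikelihood > threshold }
}

let samples = [
  LandmarkSample(name: "leftAnkle", x: 120, y: 640, inFrameLikelihood: 0.92),
  LandmarkSample(name: "rightAnkle", x: 210, y: 700, inFrameLikelihood: 0.18),
]
let kept = confidentLandmarks(samples)
print(kept.map { $0.name })  // prints ["leftAnkle"]
```

In an app, the stand-in values would be filled from each landmark of a detected pose before filtering.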

Tips to improve performance

The quality of your results depends on the quality of the input image:

For ML Kit to detect a pose accurately, the person in the image should be represented by sufficient pixel data. For best performance, the subject should be at least 256x256 pixels.
If you detect poses in a real-time application, also consider the overall dimensions of the input images. Smaller images can be processed faster, so to reduce latency, capture images at lower resolutions. Keep the resolution requirement above in mind and make sure the subject occupies as much of the image as possible.
Poor image focus can also affect accuracy. If the results are not acceptable, ask the user to recapture the image.
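
One way to act on the 256x256 guideline is to measure the area covered by the detected landmarks and compare it with the minimum. This is a rough sketch under stated assumptions: `Point2D`, `subjectSize`, and `meetsMinimumSize` are hypothetical helpers, the box enclosing the landmarks only approximates the subject's size, and the points are assumed to be in input-image pixel coordinates.

```swift
import Foundation

// Hypothetical 2D point in input-image pixel coordinates (not an ML Kit type).
struct Point2D {
  let x: Double
  let y: Double
}

// Returns the width and height of the box enclosing all points,
// or nil when there are no points.
func subjectSize(of points: [Point2D]) -> (width: Double, height: Double)? {
  guard let minX = points.map({ $0.x }).min(),
        let maxX = points.map({ $0.x }).max(),
        let minY = points.map({ $0.y }).min(),
        let maxY = points.map({ $0.y }).max() else {
    return nil
  }
  return (width: maxX - minX, height: maxY - minY)
}

// Checks the enclosing box against the 256x256 pixel guideline.
func meetsMinimumSize(_ points: [Point2D], minimum: Double = 256) -> Bool {
  guard let size = subjectSize(of: points) else { return false }
  return size.width >= minimum && size.height >= minimum
}

// Enclosing box is 320 wide and 850 tall, so the check passes.
let landmarkPoints = [Point2D(x: 100, y: 50), Point2D(x: 420, y: 900)]
print(meetsMinimumSize(landmarkPoints))  // prints true
```

If the check fails, an app could prompt the user to move closer to the camera before running detection again.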

If you want to use pose detection in a real-time application, follow these guidelines to achieve the best frame rates:

Use the base PoseDetection SDK and the stream detection mode.
Consider capturing images at a lower resolution. However, also keep in mind this API's image dimension requirements.
To process video frames, use the detector's synchronous results(in:) API. Call this method from the AVCaptureVideoDataOutputSampleBufferDelegate's captureOutput(_:didOutput:from:) function to synchronously get results from the given video frame. To throttle calls to the detector, keep AVCaptureVideoDataOutput's alwaysDiscardsLateVideoFrames set to true. If a new video frame becomes available while the detector is running, it is dropped.
If you use the output of the detector to overlay graphics on the input image, first get the result from ML Kit, then render the image and the overlay in a single step. That way, you render to the display surface only once for each processed input frame. For an example, see the previewOverlayView and MLKDetectionOverlayView classes in the showcase sample app.
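
The guidelines above could be wired together roughly as follows. This is a sketch rather than the sample app's implementation: the `PoseFrameHandler` class, its queue label, and the fixed `.right` orientation (which assumes a back camera held in portrait) are illustrative assumptions, and the `import MLKit` line assumes the GoogleMLKit pods listed earlier.

```swift
import AVFoundation
import MLKit
import UIKit

// Hypothetical delegate class; the name and structure are illustrative only.
final class PoseFrameHandler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
  let videoOutput = AVCaptureVideoDataOutput()
  private let poseDetector: PoseDetector
  private let frameQueue = DispatchQueue(label: "pose.frames")  // assumed label

  override init() {
    // Base detector in stream mode, as suggested for real-time use.
    let options = PoseDetectorOptions()
    options.detectorMode = .stream
    poseDetector = PoseDetector.poseDetector(options: options)
    super.init()

    // Drop frames that arrive while the detector is still busy.
    videoOutput.alwaysDiscardsLateVideoFrames = true
    videoOutput.setSampleBufferDelegate(self, queue: frameQueue)
  }

  func captureOutput(
    _ output: AVCaptureOutput,
    didOutput sampleBuffer: CMSampleBuffer,
    from connection: AVCaptureConnection
  ) {
    let image = VisionImage(buffer: sampleBuffer)
    // Assumes a back camera held in portrait; derive the real value from the
    // device orientation and camera position as shown in step 2.
    image.orientation = .right

    // Synchronous detection on the capture queue.
    guard let poses = try? poseDetector.results(in: image), !poses.isEmpty else {
      return  // Detection failed or no person was found in this frame.
    }
    // Hand `poses` to the rendering code here.
    _ = poses
  }
}
```

The handler's `videoOutput` would still need to be added to a configured AVCaptureSession, which is omitted here.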

Next steps

To learn how to use pose landmarks to classify poses, see Pose Classification Tips.
For an example of this API in use, see the ML Kit quickstart sample on GitHub.
