Pose Estimation on Android

Developers using Fritz AI can now track the body positions of people in images and video with Pose Estimation.

What is pose estimation?

Human pose estimation is the computer vision task of determining a person’s body position in a video or image. A picture goes in, and a machine learning model outputs the coordinates of detected body parts (e.g., shoulders, elbows, and wrists), along with a confidence score indicating how certain the model is of each estimate.
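To make that output concrete, here is a minimal sketch of what a single prediction might look like as data. The `Keypoint` class and the `meanScore` helper are hypothetical names used only for illustration; they are not part of the Fritz AI SDK, and averaging the keypoint scores is just one assumed way to summarize how confident a pose is overall.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical container for one detected body part; not a Fritz AI class.
final class Keypoint {
    final String part;  // e.g. "leftShoulder"
    final float x;      // horizontal position in image pixels
    final float y;      // vertical position in image pixels
    final float score;  // confidence in [0, 1]

    Keypoint(String part, float x, float y, float score) {
        this.part = part;
        this.x = x;
        this.y = y;
        this.score = score;
    }
}

final class PoseOutputSketch {
    // Assumed summary: the average keypoint confidence, or 0 for an empty pose.
    static float meanScore(List<Keypoint> keypoints) {
        if (keypoints.isEmpty()) {
            return 0f;
        }
        float sum = 0f;
        for (Keypoint k : keypoints) {
            sum += k.score;
        }
        return sum / keypoints.size();
    }

    public static void main(String[] args) {
        List<Keypoint> pose = Arrays.asList(
                new Keypoint("leftShoulder", 120f, 80f, 0.9f),
                new Keypoint("leftElbow", 110f, 140f, 0.7f),
                new Keypoint("leftWrist", 105f, 200f, 0.5f));
        for (Keypoint k : pose) {
            System.out.println(k.part + " at (" + k.x + ", " + k.y + ") score=" + k.score);
        }
        System.out.println("mean score: " + meanScore(pose));
    }
}
```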

App developers can use pose estimation to build immersive AR experiences, AI-powered sports and fitness coaches, and gesture-based user interfaces.

With Fritz AI, you can now add pose estimation to your own app. Keep reading to see how.

Adding Pose Estimation to your app

You’ll need a Fritz AI account if you don’t already have one. Sign up here. You’ll then need to follow these instructions to initialize and configure Fritz AI for your project.

Add the dependency to your project

First, add our repository in order to download the Vision API:

Now include the dependencies in your app/build.gradle file

This bundles the pose estimation model with the app. Under the hood, we use TensorFlow Lite as our mobile machine learning framework. To make sure the model isn’t compressed when the APK is built, you’ll need to add the following to the same build file, inside the android block.
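For TensorFlow Lite models, the usual way to prevent compression is an `aaptOptions` entry like the sketch below. It assumes the bundled model file uses the `.tflite` extension; check the Fritz AI setup instructions for the exact setting your SDK version expects.

```groovy
android {
    aaptOptions {
        // Keep TensorFlow Lite model files uncompressed so they can be memory-mapped at runtime.
        noCompress "tflite"
    }
}
```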

Create a new pose estimation model

Initialize the model and set some sensitivity parameters. A full list of options can be found here.

import android.app.Activity;
import android.content.Intent;
import android.media.ImageReader;
import android.os.Bundle;

import ai.fritz.core.FritzOnDeviceModel;
import ai.fritz.poseestimationmodel.PoseEstimationOnDeviceModel;
import ai.fritz.vision.FritzVision;
import ai.fritz.vision.poseestimation.FritzVisionPosePredictor;
// Assumed to live alongside FritzVisionPosePredictor; adjust if your SDK version differs.
import ai.fritz.vision.poseestimation.FritzVisionPosePredictorOptions;
import ai.fritz.vision.poseestimation.FritzVisionPoseResult;
// ...

public class CameraActivity extends Activity implements ImageReader.OnImageAvailableListener {

    private FritzVisionPosePredictor posePredictor;
    private FritzVisionPoseResult poseResult;
    // A sensitivity parameter we'll use later.
    private float minPoseThreshold = 0.6f;

    @Override
    public void onCreate(final Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        Intent callingIntent = getIntent();
        // Set some sensitivity options (add setters from the options list before build()).
        FritzVisionPosePredictorOptions options = new FritzVisionPosePredictorOptions.Builder()
                .build();

        // Initialize the model and predictor.
        FritzOnDeviceModel onDeviceModel = new PoseEstimationOnDeviceModel();
        posePredictor = FritzVision.PoseEstimation.getPredictor(onDeviceModel, options);
        // The rest of your onCreate function...
    }
    // The rest of your Activity...
}

Connect the model to the camera

The pose estimation model takes a single image as input. Images can come from a camera, a photo roll, or live video. For video, each frame is passed to the model individually, and the prediction result is turned into a pose object. We’ve added a convenient drawPoses function to overlay skeletons on top of the original image or video.

public class CameraActivity extends Activity implements ImageReader.OnImageAvailableListener {
    // The rest of your Activity...

    @Override
    public void onImageAvailable(final ImageReader reader) {
        Image image = reader.acquireLatestImage();
        if (image == null) {
            return; // No new frame is available yet.
        }
        // The FritzVisionImage class makes it easy to manipulate images used as model inputs.
        final FritzVisionImage fritzVisionImage = FritzVisionImage.fromMediaImage(image, imageRotation);
        // Run the model to find poses in the image.
        poseResult = posePredictor.predict(fritzVisionImage);
        // Release the frame so the ImageReader can deliver the next one.
        image.close();
        // Draw the result (canvas and cameraSize come from the rest of your Activity).
        poseResult.drawPoses(canvas, cameraSize);
    }
}
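
Overlaying a skeleton on a preview generally means mapping keypoint coordinates from the image the model saw to the size of the view being drawn on. The sketch below shows one simple way to do that mapping, assuming independent horizontal and vertical scaling with no cropping or rotation. `OverlayScaler` and `toView` are hypothetical names; the SDK's own drawing code may work differently.

```java
// Hypothetical helper for illustration; not part of the Fritz AI SDK.
final class OverlayScaler {
    // Maps a point from image coordinates to view coordinates by scaling
    // x and y independently (assumes no cropping or rotation).
    static float[] toView(float x, float y, int imageW, int imageH, int viewW, int viewH) {
        float scaleX = (float) viewW / imageW;
        float scaleY = (float) viewH / imageH;
        return new float[] { x * scaleX, y * scaleY };
    }

    public static void main(String[] args) {
        // A keypoint at the center of a 640x480 frame, drawn on a 1280x960 view.
        float[] p = toView(320f, 240f, 640, 480, 1280, 960);
        System.out.println("view position: (" + p[0] + ", " + p[1] + ")");
    }
}
```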


The pose object also makes it easy for developers to access individual body joints, their positions, and how confident the model is in each estimate. For example, you can find the arms in a pose by looping through its detected keypoints. Keypoints will often be detected even if they’re obscured or occluded by another object in the image. If the model doesn’t find a body part, either because it is out of frame or because its confidence score is too low, that part is excluded from keypoints.

guard let pose = poseResult.decodePose() else { return }

let leftArmParts: [PosePart] = [.leftWrist, .leftElbow, .leftShoulder]
let rightArmParts: [PosePart] = [.rightWrist, .rightElbow, .rightShoulder]

var foundLeftArm: [Keypoint] = []
var foundRightArm: [Keypoint] = []

// Sort each detected keypoint into the arm it belongs to.
for keypoint in pose.keypoints {
    if leftArmParts.contains(keypoint.part) {
        foundLeftArm.append(keypoint)
    } else if rightArmParts.contains(keypoint.part) {
        foundRightArm.append(keypoint)
    }
}
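
The loop above is written in Swift. A Java sketch of the same idea follows, with an added confidence cutoff. The `PosePart`, `Keypoint`, and `ArmFinder` types here are hypothetical stand-ins that mirror the names in the snippet; they are not a confirmed Android API, so treat this as an illustration of the filtering logic rather than SDK code.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for SDK types; not a confirmed Fritz AI Android API.
enum PosePart {
    LEFT_WRIST, LEFT_ELBOW, LEFT_SHOULDER,
    RIGHT_WRIST, RIGHT_ELBOW, RIGHT_SHOULDER,
    NOSE
}

final class Keypoint {
    final PosePart part;
    final float score; // confidence in [0, 1]

    Keypoint(PosePart part, float score) {
        this.part = part;
        this.score = score;
    }
}

final class ArmFinder {
    static final Set<PosePart> LEFT_ARM =
            EnumSet.of(PosePart.LEFT_WRIST, PosePart.LEFT_ELBOW, PosePart.LEFT_SHOULDER);
    static final Set<PosePart> RIGHT_ARM =
            EnumSet.of(PosePart.RIGHT_WRIST, PosePart.RIGHT_ELBOW, PosePart.RIGHT_SHOULDER);

    // Returns the keypoints that belong to the wanted parts and meet the score cutoff.
    static List<Keypoint> select(List<Keypoint> keypoints, Set<PosePart> wanted, float minScore) {
        List<Keypoint> found = new ArrayList<>();
        for (Keypoint k : keypoints) {
            if (wanted.contains(k.part) && k.score >= minScore) {
                found.add(k);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Keypoint> keypoints = new ArrayList<>();
        keypoints.add(new Keypoint(PosePart.LEFT_WRIST, 0.9f));
        keypoints.add(new Keypoint(PosePart.LEFT_ELBOW, 0.3f));
        keypoints.add(new Keypoint(PosePart.RIGHT_SHOULDER, 0.8f));
        System.out.println("left arm joints kept: " + select(keypoints, LEFT_ARM, 0.6f).size());
        System.out.println("right arm joints kept: " + select(keypoints, RIGHT_ARM, 0.6f).size());
    }
}
```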

That’s all it takes to add pose estimation to your app with Fritz AI!

If you need some extra inspiration, this technology was featured in Apple’s keynote at the 2018 iPhone event, where HomeCourt demonstrated how basketball players could see advanced analytics during practice sessions and get coaching feedback right from their phones.
