In this tutorial, we’ll continue learning the various use-cases of the TensorFlow.js library. Our previous tutorial used this library for real-time human pose estimation. Here, we’re going to detect hand gestures using the library.
This time, we’re going to focus our pose algorithm on a smaller area: human hands. We are going to detect hand poses and gestures using the TensorFlow.js library. Like in the previous tutorial, we are going to make use of a webcam for gesture detection and a canvas for drawing and displaying the result of the detection.
What we’ll cover in this tutorial
- Creating a canvas to stream video from a webcam.
- Detecting hand poses using a pre-trained hand pose model.
- Determining a thumbs-up gesture.
Let’s get started!
Setting up dependencies
First, we need to install the necessary dependencies in our project: the handpose model, the core TensorFlow.js library, and react-webcam. We can use either npm or yarn to install the dependencies by running the following commands in our project terminal:
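The install commands might look like the following (package names are taken from the list below):

```shell
# Using npm
npm install @tensorflow/tfjs @tensorflow-models/handpose react-webcam

# Or using yarn
yarn add @tensorflow/tfjs @tensorflow-models/handpose react-webcam
```

The fingerpose package, which we use later for gesture recognition, can be added the same way.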
- @tensorflow/tfjs: This is the core TensorFlow.js library, which the hand pose model runs on.
- @tensorflow-models/handpose: This package delivers the hand pose TensorFlow model.
- react-webcam: This library component enables access to your machine’s webcam in the React project.
Next, we need to import all the installed dependencies into our App.js file, as directed in the code snippet below:
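The imports might look like the following sketch (useState and useEffect are included here because they are used in later steps; the stylesheet name is an assumption):

```javascript
// App.js
import React, { useRef, useState, useEffect } from "react";
import * as tf from "@tensorflow/tfjs"; // registers the TensorFlow.js backend
import * as handpose from "@tensorflow-models/handpose";
import Webcam from "react-webcam";
import "./App.css"; // assumed stylesheet from create-react-app
```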
Set up webcam and canvas
Next, we’re going to set up our webcam and canvas to view the webcam stream in the web display. For that, we’re going to make use of the Webcam component that we installed and imported earlier. First, we need to create reference variables using the useRef hook, as shown in the code snippet below:
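A minimal sketch of the two refs, declared inside the App component:

```javascript
// Refs to address the webcam and canvas DOM elements later
const webcamRef = useRef(null);
const canvasRef = useRef(null);
```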
Next, we need to initialize the Webcam component in our render method. This lets us stream the webcam feed to the canvas, passing the ref we created as a prop. The implementation is provided in the code snippet below:
Now, we need to add the canvas component just below the Webcam component. The canvas component enables us to draw anything that we want to display in the webcam feed. The canvas component, with its prop configurations, is provided below:
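A sketch of the render markup, assuming both components are returned from the App component; `videoStyle` is a hypothetical shared style object that overlays the canvas on the webcam feed:

```javascript
// Inside the App component's return statement.
// videoStyle is a hypothetical shared style object (see the styling step).
return (
  <div className="App">
    <Webcam ref={webcamRef} style={videoStyle} />
    <canvas ref={canvasRef} style={videoStyle} />
  </div>
);
```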
The style for both the Webcam and canvas components is provided in the code snippet below:
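One possible style object, assuming both elements are absolutely positioned and centered so the canvas overlays the webcam feed (the 640×480 size is an assumption):

```javascript
// Hypothetical shared style: centers both elements and stacks
// the canvas on top of the webcam video.
const videoStyle = {
  position: "absolute",
  marginLeft: "auto",
  marginRight: "auto",
  left: 0,
  right: 0,
  textAlign: "center",
  zIndex: 9,
  width: 640,
  height: 480,
};
```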
Loading the Hand Pose model
In this step, we’re going to create a function called runHandpose, which initializes the hand pose model using the load method from the handpose module. The overall code for this function is provided in the code snippet below:
In order to load the hand pose model upon starting the app, we’re going to call it inside the useEffect hook, as shown in the code snippet below:
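A minimal sketch of the loader, assuming the handpose import shown earlier; `handpose.load()` returns a promise that resolves to the model:

```javascript
// Load the handpose model once when the component mounts
const runHandpose = async () => {
  const net = await handpose.load();
  console.log("Handpose model loaded.");
};

useEffect(() => {
  runHandpose();
}, []);
```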
Detect Hand Pose
Here, we’re going to create a function called detect, which will handle the hand pose detection. First, we check that the webcam feed is available and grab the video properties to handle the video adjustments, as directed in the code snippets below:
We then get and set the video properties using webcamRef that we defined earlier:
Then, we need to set the canvas width and height based on the dimensions of the video:
Then, we start estimating the hand pose using the estimateHands method provided by the handpose model, which takes the video as a parameter, as shown in the code snippet below:
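Putting the steps above together, the detect function might look like this sketch (the readyState check guards against frames arriving before the webcam is ready):

```javascript
const detect = async (net) => {
  // Make sure the webcam feed is available and has data
  if (
    typeof webcamRef.current !== "undefined" &&
    webcamRef.current !== null &&
    webcamRef.current.video.readyState === 4
  ) {
    // Get the video properties
    const video = webcamRef.current.video;
    const videoWidth = webcamRef.current.video.videoWidth;
    const videoHeight = webcamRef.current.video.videoHeight;

    // Set the video width and height
    webcamRef.current.video.width = videoWidth;
    webcamRef.current.video.height = videoHeight;

    // Match the canvas dimensions to the video
    canvasRef.current.width = videoWidth;
    canvasRef.current.height = videoHeight;

    // Run hand pose estimation on the current video frame
    const hand = await net.estimateHands(video);
    console.log(hand);
  }
};
```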
Next, we need to call this detect function inside the runHandpose method, wrapped in a setInterval call. This makes the detect function run every 10 milliseconds. The implementation is provided in the code snippet below:
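The updated runHandpose might look like the following, with detect called on a 10 ms interval:

```javascript
const runHandpose = async () => {
  const net = await handpose.load();
  console.log("Handpose model loaded.");
  // Run detection repeatedly on the loaded model
  setInterval(() => {
    detect(net);
  }, 10);
};
```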
Now, it’s time to detect the thumbs-up hand gesture. For that, we’re going to make use of the fingerpose library.
First, we’re going to define a state using the useState hook, which will enable us to handle the thumbs up image status as directed in the code snippet below:
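A minimal sketch of the state hook; the state name `emoji` is an assumption:

```javascript
// Holds the detected gesture name; null when no gesture is detected
const [emoji, setEmoji] = useState(null);
```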
Import emoji image and fingerpose library
Then, we need to import the thumbs-up emoji image and the fingerpose library as fp as shown in the code snippet below:
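The imports might look like the following; the image path is a hypothetical placeholder for wherever you store the emoji asset:

```javascript
import * as fp from "fingerpose";
import thumbsUpImage from "./thumbs_up.png"; // hypothetical path to the emoji image
```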
Update the detect function
Next, we need to update our detect function with a gesture detecting function. We’re going to make use of the GestureEstimator method from the fingerpose package in order to detect hand gestures.
We apply the gesture mapping as well as the confidence index to detect the accurate gesture and set the emoji. The overall coding implementation is provided in the code snippet below:
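A sketch of the gesture step, wrapped in a hypothetical `estimateGesture` helper for clarity; fingerpose ships a prebuilt `ThumbsUpGesture`, and its estimate method takes the hand landmarks plus a minimum match score (8 here is an assumption). Note that in older fingerpose versions the prediction field is named `confidence` rather than `score`:

```javascript
// Helper: pick the prediction with the highest score.
// (Older fingerpose versions name this field `confidence`.)
const pickBestGesture = (gestures) => {
  if (!gestures || gestures.length === 0) return null;
  return gestures.reduce((best, g) => (g.score > best.score ? g : best));
};

// Called inside detect, after `const hand = await net.estimateHands(video);`
const estimateGesture = async (hand) => {
  if (hand.length > 0) {
    const GE = new fp.GestureEstimator([fp.Gestures.ThumbsUpGesture]);
    const gesture = await GE.estimate(hand[0].landmarks, 8);
    const best = pickBestGesture(gesture.gestures);
    // Only keep the result if it is a confident thumbs-up match
    setEmoji(best && best.name === "thumbs_up" ? best.name : null);
  }
};
```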
Add Emoji Display to the Screen
Lastly, we need to add the emoji image to the display to signify a hand pose detection result. For that, we’re going to use conditional rendering, as directed in the code snippet below:
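The conditional rendering might look like this sketch, placed in the JSX; the position values are assumptions you can adjust to fit your layout:

```javascript
// Show the emoji image only when a thumbs-up was detected
{emoji !== null ? (
  <img
    src={thumbsUpImage}
    alt="thumbs-up emoji"
    style={{ position: "absolute", left: 400, bottom: 300, height: 100 }}
  />
) : null}
```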
Now, if we run the project in our browser, we’ll get the result as displayed in the demo below:
Here, we notice the webcam on the right side and the drawing canvas on the left side. When we show the thumbs-up hand gesture, it immediately detects the gesture and displays the thumbs-up emoji.
Hence, we have successfully implemented real-time hand pose estimation using the TensorFlow model and webcam feed in our React project.
Gesture detection of human body parts is widely used in the AI, robotics, and gaming industries. Many pose estimation TensorFlow models are available—in this tutorial, we learned how to use the handpose and fingerpose libraries to detect the thumbs-up gesture and display the thumbs-up emoji as the result.
Now, the challenge is to use these packages and configurations to detect other hand poses, such as rock, paper, and scissors. You can also explore other advanced gesture-control techniques and libraries. For a demo of the entire project, you can check out the CodeSandbox.