Using Google Cloud AutoML Multi-Label Image Classification Models in Python

Running ML inference on the edge

This is the ninth post in my series on training and running Cloud AutoML models on the edge. It follows up on the earlier post about training a multi-label image classification model and covers how to run the trained model in a Python environment.

Step 1: Exporting the trained model

Once a model has finished training, you can head over to Google Cloud and export it for local use. To do so, navigate to the “Test and Use” section of the dataset and choose the option to export to a Docker container:

Doing so will result in a .pb file that you can download and use locally.

Rename the downloaded file to saved_model.pb and move it into a folder. We’ll need this folder later on.
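
If you prefer to script this step, here is a minimal sketch using only the standard library. The function name prepare_model_dir and the paths you pass to it are hypothetical; the only thing taken from this post is the target file name saved_model.pb.

```python
import pathlib
import shutil


def prepare_model_dir(downloaded_pb, model_dir):
    """Move an exported .pb file into model_dir and name it saved_model.pb."""
    model_dir = pathlib.Path(model_dir)
    # create the folder (and any missing parents) if it does not exist yet
    model_dir.mkdir(parents=True, exist_ok=True)
    target = model_dir / "saved_model.pb"
    shutil.move(str(downloaded_pb), str(target))
    return target
```

The returned path points at the renamed file; its parent folder is what the loading code later in this post expects as model_path.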

Step 2: Installing the required dependencies

Before we go ahead and write any code, it’s important that we first have all the required dependencies installed on our development machine.

For the current example, these are the dependencies we’ll need:

We can use pip to install these dependencies with the following command:
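
The snippets in this post import tensorflow and cv2, so the matching pip packages are presumably tensorflow and opencv-python (an assumption based on those imports, i.e. something like pip install tensorflow opencv-python). Before moving on, a small check like the following sketch can confirm that both modules are importable in the current environment:

```python
import importlib.util

# Module names taken from the imports used in this post's snippets.
REQUIRED_MODULES = ["tensorflow", "cv2"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: {}".format(", ".join(missing)))
    else:
        print("All required modules are installed.")
```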

Step 3: Setting up the Python code for using the model

Now that we have the model and our development environment ready, the next step is to create a Python snippet that allows us to load this model and perform inference with it.

Here’s what such a snippet might look like:

import pathlib
import cv2
import tensorflow.compat.v1 as tf

# path to the folder containing our downloaded .pb file
model_path = '/Users/harshithdwivedi/Downloads/downloaded_model'

# creating a tensorflow session (we will be using this to make our predictions later)
session = tf.Session(graph=tf.Graph())

# loading the model into our session
tf.saved_model.loader.load(session, ['serve'], model_path)

Save the script above as main.py and try running it with the command python main.py. If you don’t get any errors, you’re good to go!

Up next, we’ll be modifying the code above to read images from the local disk and load them into our ML model.

Step 4: Reading and providing input to the ML model

In this example, I’ll be using a model that I trained earlier, which tells me what items are present in the provided image. I trained it to detect four labels in an image: wedding dress, cake, ring, and flowers.

To test if the model works as expected, I’ve placed a few images in a folder, and I’ll be passing the path of this folder and reading files from it in my code. Here’s how it can be done:

import pathlib
import cv2
import tensorflow.compat.v1 as tf

# path to the folder containing our downloaded .pb file
model_path = '/Users/harshithdwivedi/Downloads/downloaded_model'

# creating a tensorflow session (we will be using this to make our predictions later)
session = tf.Session(graph=tf.Graph())

# loading the model into our session
tf.saved_model.loader.load(session, ['serve'], model_path)

# folder containing the images
source_dir = pathlib.Path("/Users/harshithdwivedi/Downloads/Dave and Amanda")

for file in source_dir.iterdir():
    try:
        # get the current image path
        img_path = str(file.resolve())
        
        # image bytes since this is what the ML model needs as its input
        binary_img = pathlib.Path(img_path).read_bytes()
        
        # pass the image as input to the ML model and get the result
        result = session.run('scores:0', feed_dict={'Placeholder:0': [binary_img]})[0]
        
        print("File {} has result {}".format(file.stem, result))
    except Exception as err:
        print("Could not process {}: {}".format(file.name, err))

The code might look complex, but it’s actually not! Let’s break it down to see what’s happening here:

Lines 1–15: Initialization, discussed earlier.

Line 17: Here, we’re iterating through the directory in which the images are placed using pathlib.

Lines 18–23: For each image, we read the file’s raw bytes, which is the input format the model expects.

Lines 25–28: We pass the image bytes to the session and get the output. scores:0 is the node in the model that holds the prediction scores, i.e. the output, whereas Placeholder:0 is the node that receives the input.
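
If your exported model uses different node names, one way to find them is to inspect the signatures of the loaded model. The sketch below assumes the object returned by tf.saved_model.loader.load (a MetaGraphDef), whose signature_def entries list input and output tensors by name. The helper name signature_tensor_names is hypothetical, and the signature keys you will see depend on your export.

```python
def signature_tensor_names(meta_graph):
    """Map each signature key to the tensor names of its inputs and outputs."""
    names = {}
    for key, signature in meta_graph.signature_def.items():
        names[key] = {
            "inputs": {k: info.name for k, info in signature.inputs.items()},
            "outputs": {k: info.name for k, info in signature.outputs.items()},
        }
    return names


# Usage with the session from the snippet above (load() returns the MetaGraphDef):
# meta_graph = tf.saved_model.loader.load(session, ['serve'], model_path)
# print(signature_tensor_names(meta_graph))
```

The tensor names printed this way are the strings to pass to session.run and feed_dict.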

Running the code above, we get the following results:

Each of these results is an array with four elements, each of which is the probability that a particular label exists in the image.

If we look closely at our dict.txt that we get from exporting the model, this is what its contents look like:

This means the first element in the result array is the probability of the image containing a cake, the second is for dress, the third for flower, and so on.
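
Rather than remembering this order by hand, the labels can be read from dict.txt and paired with the scores. This is a minimal sketch that assumes dict.txt holds one label per line in the same order as the scores array; the helper names load_labels and pair_labels_with_scores are hypothetical.

```python
import pathlib


def load_labels(dict_path):
    """Read one label per line from dict.txt, skipping blank lines."""
    text = pathlib.Path(dict_path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def pair_labels_with_scores(labels, scores):
    """Return {label: score}; assumes scores follow the order of dict.txt."""
    if len(labels) != len(scores):
        raise ValueError(
            "expected {} scores, got {}".format(len(labels), len(scores))
        )
    return {label: float(score) for label, score in zip(labels, scores)}
```

Inside the loop, pair_labels_with_scores(labels, result) would then give a dictionary keyed by label name instead of a bare array.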

Step 5: Formatting the results obtained with TensorFlow

Now that we know what the result looks like and what it means, we can tweak the output accordingly. For the use case of this blog, I’ll simply print out the items detected in the image along with the name of the image.

Note: I’ll only consider an item to be present if its probability is greater than 0.6.

Modifying the code above, this is what the resulting Python code looks like:

import pathlib
import cv2
import tensorflow.compat.v1 as tf

# path to the folder containing our downloaded .pb file
model_path = '/Users/harshithdwivedi/Downloads/downloaded_model'

# creating a tensorflow session (we will be using this to make our predictions later)
session = tf.Session(graph=tf.Graph())

# loading the model into our session
tf.saved_model.loader.load(session, ['serve'], model_path)

# folder containing the images
source_dir = pathlib.Path("/Users/harshithdwivedi/Downloads/Dave and Amanda")

for file in source_dir.iterdir():
    try:
        # get the current image path
        img_path = str(file.resolve())
        
        # image bytes since this is what the ML model needs as its input
        binary_img = pathlib.Path(img_path).read_bytes()
        
        # pass the image as input to the ML model and get the result
        result = session.run('scores:0', feed_dict={'Placeholder:0': [binary_img]})[0]
        
        prob_cake = result[0]
        prob_dress = result[1]
        prob_flower = result[2]
        prob_ring = result[3]
        
        items = []
        
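        if prob_cake > 0.6:
            items.append("cake")
        if prob_dress > 0.6:
            items.append("dress")
        if prob_flower > 0.6:
            items.append("flower")
        if prob_ring > 0.6:
            items.append("ring")

        print("File {} has result {}".format(file.stem, items))
    except Exception as err:
        # report the failure instead of silently skipping the file
        print("Could not process {}: {}".format(file.name, err))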

Running the file above, this is what we see:

As we can see above, some files have more than one item, some have just one, and some have no items at all!
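
The four if statements can also be collapsed into one helper that works for any number of labels. This is a sketch, not the code used for the results above: the function name detected_items is hypothetical, and the example scores are made up for illustration rather than taken from the model.

```python
def detected_items(labels, scores, threshold=0.6):
    """Return the labels whose score is strictly greater than threshold."""
    return [label for label, score in zip(labels, scores) if score > threshold]


labels = ["cake", "dress", "flower", "ring"]  # order described in this post

# Example scores are made up for illustration, not real model output.
print(detected_items(labels, [0.91, 0.12, 0.75, 0.03]))  # ['cake', 'flower']
print(detected_items(labels, [0.10, 0.20, 0.30, 0.40]))  # []
```

Keeping the threshold as a parameter makes it easy to experiment with stricter or looser cutoffs than 0.6.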

And that’s it! We can further extend this code snippet to display the images the model is processing using OpenCV, and then move the images to different folders based on their labels.
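
As a starting point for the folder idea, here is a minimal sketch. It copies rather than moves, since with a multi-label model one image may belong in several folders. The function name copy_to_label_folders, the output_dir argument, and the "unlabeled" fallback folder are all assumptions for illustration.

```python
import pathlib
import shutil


def copy_to_label_folders(image_path, items, output_dir, fallback="unlabeled"):
    """Copy image_path into one subfolder of output_dir per detected item."""
    image_path = pathlib.Path(image_path)
    copies = []
    # Multi-label: an image may belong in several folders, or in none.
    for item in (items or [fallback]):
        target_dir = pathlib.Path(output_dir) / item
        target_dir.mkdir(parents=True, exist_ok=True)
        # shutil.copy2 returns the path of the newly created file
        copies.append(pathlib.Path(shutil.copy2(str(image_path), str(target_dir))))
    return copies
```

Calling it with the items list built in the loop above would place each image in a folder per detected label, and images with no detections in the fallback folder.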
