AutoML Vision Edge: Exporting and Loading TensorFlow SavedModels with Python

This post is part 2 of the series on Google Cloud AutoML Vision Edge. In the previous post, we saw how to train an edge-ready TensorFlow Lite model with AutoML from scratch. TFLite models take up less storage space but are also a bit less accurate than models in the TF SavedModel format.

This post will help you export and load the TensorFlow SavedModel format provided by AutoML using Python.

1. What is a TensorFlow SavedModel?

A TensorFlow SavedModel is a complete, self-contained TensorFlow program that doesn’t require the original model-building code to run. It includes both the trained weights and the computation, and it can be deployed anywhere, including, in our case, edge devices.

1.1 Exporting TF SavedModel format

This step is the simplest one in the process. Once you have trained the model as described in the previous post, you can choose the Container card from the Test tab as shown here:

After choosing Container, create a folder in your bucket and click the Export button.

1.2 Loading TF SavedModel format

Loading saved AutoML TensorFlow models isn’t a piece of cake with the provided documentation. The documentation suggests installing and running a Docker container and making a POST request to the container to get predictions for the image.

While Docker makes the setup pretty easy, it has its own limitations, like speed, extra dependencies, security, etc. You can learn more about these issues here.

Loading the model using Python code gives us more flexibility over the model. To load the model, we just need to download it from the Google Cloud Storage (GCS) bucket to our local system. After downloading it from the GCS bucket, you will see a saved_model.pb file.
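If you would rather script the download than click through the console, a minimal sketch could look like the one below. It assumes the google-cloud-storage client library is installed and that your default Google Cloud credentials can read the bucket. The helper names (local_path_for, download_export) and the bucket, folder, and directory names in the usage note are hypothetical.

```python
import os


def local_path_for(blob_name, prefix, destination_dir):
    """Map an object name in the bucket to a file path under destination_dir."""
    relative = blob_name[len(prefix):].lstrip("/")
    return os.path.join(destination_dir, *relative.split("/"))


def download_export(bucket_name, prefix, destination_dir):
    """Download every object stored under prefix into destination_dir."""
    # Imported lazily so the path helper above works without the client library.
    from google.cloud import storage

    client = storage.Client()  # picks up your default Google Cloud credentials
    downloaded = []
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        if blob.name.endswith("/"):
            continue  # skip "folder" placeholder objects
        target = local_path_for(blob.name, prefix, destination_dir)
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        blob.download_to_filename(target)
        downloaded.append(target)
    return downloaded
```

For example, download_export("my-bucket", "exports/model/", "local_model") would copy the exported folder, including saved_model.pb, into a local_model directory.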

The popular saved_model.pb file stores the actual TensorFlow program, or model, and a set of named signatures, each identifying a function that accepts tensor inputs and produces tensor outputs.

While working on a project last week, I searched the entire internet to try to figure out how to simply (or not so simply, as it turned out) load the .pb file using Python to perform inference. All blogs, all documentation, and a long search process on StackOverflow :/

This twenty-minute snippet of my search history says it all about this frustration:

This post will help you with step-by-step instructions for loading .pb machine-learning model files in Python.

1.3 Why is it difficult to load?

This is where things get tricky. After working for hours with the same issue, here’s my conclusion:

1. The TensorFlow documentation is helpful if you want to serve the .pb model using TensorFlow Serving, which gets predictions by hosting the model on a server running on your local system. But it’s not helpful if you don’t want the extra overhead (i.e., TensorFlow Serving).

2. While the TensorFlow documentation makes it tough, the model or the file that we get by exporting it from AutoML doesn’t provide any information about the input and output nodes of the model.

1.4 How to load .pb files with Python to predict?

This should be relatively simple if you do it step-by-step.

Step 1: Use Netron to get input and output nodes

Netron is a website where you can upload any TensorFlow model to view its architecture. Once we have the architecture, it’s easier to find the input and output nodes.

Once you get your .pb file from AutoML, upload it to Netron to get the model’s visualization.

Once you upload your model file, you’ll get the architecture, which should look something like this:

Clicking on any node shows the type of input it takes, along with the input node name.

Step 2: Installing packages

You have to install the tensorflow package and import it in your Python script.
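As a quick sanity check before going further, a small sketch like this one can confirm that the package is importable in your environment. The is_installed helper is a hypothetical name, and pip is assumed as the installer.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("tensorflow"):
    import tensorflow as tf

    print("TensorFlow version:", tf.__version__)
else:
    print("TensorFlow is missing; install it with: pip install tensorflow")
```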

Step 3: Loading Model

From Netron, we obtained the following details about the model:

INPUT_NODE : normalized_input_image_tensor



The [?,] shape indicates a one-dimensional array of unknown length. Here, the raw binary content of the image is passed inside an array as a single element.

The following code contains the export_path variable, which is the path of the folder containing the .pb file. The Session is the class that encapsulates the environment in which TensorFlow Graph Operation objects are executed and Tensor objects are evaluated.

We pass the graph in the constructor tf.Session and then load the model using the .load method and passing our export_path.
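A minimal sketch of this loading step might look like the following. It assumes the 1.x-style API, reached through tf.compat.v1 so that it also runs on TensorFlow 2, and that the model was exported with the standard "serve" tag. The function names check_export_path and load_saved_model are hypothetical.

```python
import os


def check_export_path(export_path):
    """Return export_path if it contains saved_model.pb, else raise an error."""
    pb_file = os.path.join(export_path, "saved_model.pb")
    if not os.path.isfile(pb_file):
        raise FileNotFoundError("No saved_model.pb found in " + export_path)
    return export_path


def load_saved_model(export_path):
    """Load the SavedModel into a fresh graph and return the open session."""
    import tensorflow as tf  # imported lazily; requires TensorFlow to be installed

    tf1 = tf.compat.v1  # 1.x-style API, also available in TensorFlow 2
    graph = tf.Graph()
    sess = tf1.Session(graph=graph)  # the graph goes into the Session constructor
    # Assumption: the model was exported with the standard "serve" tag.
    tf1.saved_model.loader.load(
        sess,
        [tf1.saved_model.tag_constants.SERVING],
        check_export_path(export_path),
    )
    return sess
```

The returned session stays open so that it can be reused for predictions; call sess.close() when you are done with it.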

To provide the input and predict the output, we have to replace OUTPUT_NODE, INPUT_NODE, and IMAGE_SHAPE with their values. Finally, the y_pred value will give you the prediction for the image.

The following code will predict the image from the TensorFlow SavedModel using Python.
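As a rough sketch of the prediction step, the helper below takes an already-loaded session and the node names found in Netron. It assumes the input node has the [?,] shape described above and accepts the raw image bytes wrapped in a one-element list. If your input node expects a decoded image array instead, you would need to decode the image and reshape it to IMAGE_SHAPE before feeding it. The function names are hypothetical.

```python
def tensor_name(node_name):
    """Turn a node name such as "scores" into the tensor name "scores:0"."""
    return node_name if ":" in node_name else node_name + ":0"


def read_image_bytes(image_path):
    """Read the image file as raw bytes, without decoding it."""
    with open(image_path, "rb") as image_file:
        return image_file.read()


def predict(sess, image_path, input_node, output_node):
    """Run one image through an already-loaded model and return y_pred."""
    graph = sess.graph
    input_tensor = graph.get_tensor_by_name(tensor_name(input_node))
    output_tensor = graph.get_tensor_by_name(tensor_name(output_node))
    # Assumption: the input has shape [?,] and takes raw image bytes,
    # so the bytes are wrapped in a list with a single element.
    image_batch = [read_image_bytes(image_path)]
    y_pred = sess.run(output_tensor, feed_dict={input_tensor: image_batch})
    return y_pred
```

Here, input_node and output_node stand for the INPUT_NODE and OUTPUT_NODE values read from Netron, and sess is the session into which the SavedModel was loaded.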

Step 4: Debugging

In case Netron fails to give you the output nodes, you can print all the node names with the function provided below and try them one by one to see whether you get the desired output.
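One way to write such a function is sketched below. It assumes you already have a loaded graph (for example, sess.graph from a session into which the SavedModel was loaded); the function names and the optional keyword filter are hypothetical.

```python
def list_node_names(graph):
    """Return the name of every operation (node) in the graph."""
    return [operation.name for operation in graph.get_operations()]


def print_node_names(graph, keyword=""):
    """Print node names, optionally keeping only those that contain keyword."""
    names = [name for name in list_node_names(graph) if keyword in name]
    for name in names:
        print(name)
    return names
```

Calling print_node_names(sess.graph) prints every node. Passing a keyword narrows the list when the graph is large, which makes it easier to spot candidate output nodes.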


Though there are a lot of resources on the web to solve the above problem, none of them seems complete, and it can take a significant amount of time to figure out the solution.

This series focuses on AutoML for Edge devices, covering all the available model formats. In the last post, we covered the TF Lite format. This post covered exporting the TF SavedModel format, available as a .pb file, and loading it using Python.

In the upcoming post in this series, we’ll use the TF.js format provided by AutoML Vision Edge to run our models inside the browser and on the server.


Want to know more about me? Please check out my website. If you’d like to get updates, follow me on Twitter and Medium. If anything isn’t clear or you want to point out something, please comment down below.
