Machine Learning models on the edge: mobile and IoT

The wave of AI and machine learning is happening just as the dominance of mobile is becoming set in stone. As mobile devices become more ubiquitous and powerful, a lot of the machine learning tasks we think of as requiring months of high-powered compute time will be able to happen right on your phone.

This post will outline why edge devices are increasingly important, and how machine learning works with them.

Mobile and IoT are taking over the world
Training on mobile and edge devices
Inference on mobile and edge devices
Solutions for deploying machine learning models on mobile
Further Reading
Tutorials

Mobile and IoT are taking over the world

Of all of the major technology shifts over the past decade, the migration to mobile and edge devices is one of the most prominent:

Mobile accounted for 63% of traffic in 2017, having already overtaken desktop in 2016
Digital advertisers spent more on mobile than desktop in 2017
There are already billions of IoT devices, and that number should skyrocket within the next decade

Unsurprisingly, this trend is starting to impact machine learning as well. A lot of the cool new features that we love on iPhone, like face recognition in photos and Face ID, rely on machine learning, some of which takes place on device. But running machine learning models on mobile and edge devices isn’t exactly a piece of cake.

Training on mobile and edge devices

There are two major parts of the machine learning modeling process — training and inference. Training is usually the bulk of the work: you need to teach your model how to interpret data.

The problem for mobile is that training is often computationally heavy. Most deep learning models today train on specialized hardware like GPUs, and training can take days or even months when the model or the dataset is large. That’s why training usually happens in the cloud on powerful servers.

That being said, our mobile devices are getting more and more powerful. It’s not hard to imagine a future where our phones and IoT devices pack the computational punch to train (at least basic) machine learning models. There are a few really compelling benefits to training on device, too:

Models can be tailored to individual user data from an individual device
Data never needs to leave the device, which makes networking and compliance much easier (and no server costs for training!)
It’s easy to continuously update your model with new training data

Even though our phones aren’t quite at the point where we can train deep learning models fully, some applications do at least some sort of training on-device.

The easiest one to understand is Touch ID / Face ID, since Apple’s onboarding process actually walks you through the training. The predictive keyboard feature also uses on-device training, but the model is on the simpler side.
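As a rough illustration of what a simple, continuously updated on-device model could look like, here is a sketch of a next-word suggester in Python. This is not how Apple’s keyboard works; the NextWordModel class and the example sentences are hypothetical. It only shows that a small model can learn from text that never leaves the device.

```python
from collections import Counter, defaultdict

class NextWordModel:
    """Hypothetical bigram model that learns from text typed on the device."""

    def __init__(self):
        # Maps a word to a count of the words that followed it.
        self.counts = defaultdict(Counter)

    def update(self, sentence):
        # "Training" is just counting adjacent word pairs.
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            self.counts[prev][nxt] += 1

    def suggest(self, word, k=3):
        # Return up to k of the most frequent followers of `word`.
        return [w for w, _ in self.counts[word.lower()].most_common(k)]

model = NextWordModel()
model.update("see you at the gym")
model.update("meet me at the office")
model.update("running late to the gym")

suggestions = model.suggest("the")
print(suggestions)
```

Each call to update folds new user data into the model right away, which is the kind of per-user tailoring described in the list of benefits above.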

Inference on mobile and edge devices

Inference, the second part of the ML modeling process, is where mobile devices are really starting to shine. Inference typically requires much less compute power than training (for a neural network, it’s mostly a series of matrix multiplications), which makes it a much more realistic task for edge devices. In fact, most of the popular mobile deployment frameworks only support inference (more on those later).
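To make the matrix multiplication point concrete, here is a minimal sketch of inference for a tiny two-layer neural network in plain Python. The weights are made-up numbers standing in for a trained model, and a real framework would use optimized numeric libraries rather than nested lists.

```python
# Inference for a small feed-forward network: one matrix-vector
# product per layer, with a ReLU between layers.
# The weights below are illustrative stand-ins for a trained model.

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def relu(vector):
    return [max(0.0, v) for v in vector]

def predict(x, layers):
    # `layers` is a list of (weights, biases) pairs.
    for i, (weights, biases) in enumerate(layers):
        x = [a + b for a, b in zip(matvec(weights, x), biases)]
        if i < len(layers) - 1:  # no activation after the last layer
            x = relu(x)
    return x

layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # 2 inputs -> 2 hidden units
    ([[1.0, 2.0]], [0.5]),                    # 2 hidden units -> 1 output
]

output = predict([2.0, 1.0], layers)
print(output)
```

There is no loop over a dataset and no gradient computation here, just a single pass through the layers, which is why inference is so much cheaper than training.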

Carrying out inference on device can be very effective:

Your model doesn’t need internet access to work (!)
Moving your model to the data instead of your data to the model can greatly reduce latency, since predictions don’t need a network round trip
No need to worry about scaling or distributed computing

On-device inference in 2018 is getting fairly popular, too. Most of the face detection that Apple deploys in the camera and photos apps happens on-device. Listening for “Hey Siri” and that handwriting recognition feature for Chinese characters are also models that are doing inference locally.

Inference on your phone is not without tradeoffs, though. Storing all of those parameters in your app — especially as deep learning models seem to be getting bigger and bigger — currently takes up a bunch of space (although Apple announced some new Core ML functionality that might help with this).
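A quick back-of-the-envelope calculation shows how fast parameters add up. The sketch below assumes a hypothetical model with 25 million parameters and compares storing each weight as a 32-bit float against an 8-bit value. Storing weights at lower precision is one common way to shrink a model, though it is used here only as an illustration, not as a description of any particular framework’s feature.

```python
# Rough storage estimate for a model's weights.
# The parameter count is a hypothetical example, not a real model.

def model_size_mb(num_parameters, bits_per_weight):
    # bits -> bytes -> megabytes (using 1 MB = 1,000,000 bytes)
    return num_parameters * bits_per_weight / 8 / 1_000_000

params = 25_000_000

full_precision = model_size_mb(params, 32)  # 32-bit floats
reduced = model_size_mb(params, 8)          # 8-bit weights

print(f"32-bit: {full_precision:.0f} MB, 8-bit: {reduced:.0f} MB")
```

The estimate ignores file format overhead and anything other than the weights, but it gives a sense of how much of an app’s download size a single model can claim.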

And if you want to retrain your models and update them, you’ll need to update the whole application. Finally, you’ll also need to create model versions that work for multiple devices: Apple, Android, Microsoft (they still make phones?), and others.

Navigating these tradeoffs is complex and totally dependent on what your goals are.

Solutions for deploying machine learning models on mobile

The standard frameworks that developers use for machine learning (and deep learning in particular) usually aren’t designed with size constraints in mind, so they’re not great for deploying models on mobile.

But over the past few years, the open source community has worked on some solutions that port models into a more realistic form factor. Combine that with strong commercial solutions, and you should be able to find something that fits your needs.

1. TensorFlow Lite (Google)

An extension of the ubiquitous TensorFlow, TensorFlow Lite is a framework for translating your models into more mobile-friendly versions. It focuses on low latency, small model size, and fast execution. It’s still very early in the development cycle though, so you might see mixed results.

2. Caffe2Go (Facebook)

Well, this isn’t exactly a solution yet, since it hasn’t been open-sourced (and so isn’t available), but according to the original post, that’s on the roadmap. Caffe2Go is based on the popular Caffe2 framework for developing deep learning models. It stems from Facebook’s experience deploying machine learning models on mobile devices, and looks to be a promising solution whenever it’s released.

3. Core ML (Apple)

Core ML is Apple’s solution for deploying machine learning models on Apple devices: it lets you take trained machine learning models, integrate them into apps for Apple platforms, and package them into the app bundle. Converters exist for many of the popular frameworks, including TensorFlow and Caffe, and Core ML got a major upgrade at Apple’s 2018 WWDC.

4. ML Kit (Google)

TensorFlow Lite is Google’s framework for lightweight model deployment, but ML Kit is their hosted service for getting it deployed. ML Kit offers a few different APIs for popular use cases like image recognition and natural language processing, and is integrated with Google’s Firebase development platform. It works on both iOS and Android, an advantage over Core ML, which only targets Apple devices.

Tutorials

Intro to Machine Learning on Android — How to convert a custom model to TensorFlow Lite (Heartbeat) — “Fast, responsive apps can now run complex machine learning models. This technological shift will usher in a new wave of app development by empowering product owners and engineers to think outside the box.”

Getting started with neural networks in iOS 10 (Prolific Interactive) — “With machine learning — specifically machine learning powered by neural networks — increasingly becoming a bigger part of many apps and iOS itself, it’s great to see Apple opening up APIs for running neural networks to third-party developers. I was thrilled to hear this news and could not wait to use iOS as an entry point to jump in and learn more about machine learning.”

Getting started with TensorFlow on iOS (Google) — “In this blog post I’ll explain the ideas behind TensorFlow, how to use it to train a simple classifier, and how to put this classifier in your iOS apps.”

