We’ve seen a bunch of great use cases for on-device ML, but sometimes it’s easy to get a bit lost in the weeds. For many mobile developers, the idea of machine learning can sound scary: there’s the math, a steep learning curve, and the question of whether an app needs machine learning features in the first place.
Of course, not every use case is suitable for in-app deployment, but we wanted to take a step back and look at how on-device ML can benefit developers and their apps, examine relevant use cases, and explore what the developer environment actually looks like for mobile engineers embarking on their machine learning journey.
Why would developers choose to put ML in their apps?
Without a fuller understanding of the potential benefits of on-device ML, it could seem like leaving all the ML heavy lifting to the backend makes the most sense. But there are a few distinct advantages to choosing on-device inference.
Low Latency
If you’re doing anything in real time, like manipulating live video, the network won’t be fast enough to send each frame from your device up to the cloud and back again before the next frame arrives.
Snapchat is probably the most obvious example here, where they use neural networks for face landmark detection to anchor AR masks. It would be a pretty terrible experience if you needed to wait a few minutes for each filter to process before you could see the result.
Another great example of this is u/hwoolery’s app InstaSaber. He trained a custom ML model to detect and estimate the position of a rolled-up piece of paper and turn it into a lightsaber. These types of AR interfaces are only possible if the processing is done on-device.
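To put rough numbers on the latency argument, here is a small Python sketch of the per-frame time budget. The function names and the example timings (network round trip, upload time, inference time) are illustrative assumptions, not measurements from any of the apps mentioned.

```python
def frame_budget_ms(fps: float) -> float:
    """Milliseconds available to process one frame at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def cloud_latency_ms(round_trip_ms: float, upload_ms: float, inference_ms: float) -> float:
    """Total time for one frame to reach a server, be processed, and come back."""
    return round_trip_ms + upload_ms + inference_ms


def fits_in_frame(processing_ms: float, fps: float) -> bool:
    """True if the work finishes before the next frame is due."""
    return processing_ms <= frame_budget_ms(fps)


# Hypothetical timings, chosen only to illustrate the comparison.
cloud_ms = cloud_latency_ms(round_trip_ms=100.0, upload_ms=50.0, inference_ms=20.0)
on_device_ms = 25.0

print(f"budget at 30 fps: {frame_budget_ms(30):.1f} ms")
print("cloud path fits:", fits_in_frame(cloud_ms, 30))
print("on-device path fits:", fits_in_frame(on_device_ms, 30))
```

At 30 frames per second the budget is roughly 33 ms per frame, so under these assumed timings the cloud path misses it and the on-device path does not.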
Low / No Connectivity
If your app depends on the cloud for inference and the user doesn’t have access to the Internet, your app isn’t going to work. This isn’t too big of a problem in many major cities, but what about in rural settings? The team at PlantVillage trained a neural network that can detect different types of crop diseases affecting farms in East Africa. Farmers there don’t have great connectivity, though, and data is very expensive. The solution was to put the model in the app and run it on-device.
Privacy
Machine learning can do amazing things, but it needs data. What if that data is sensitive? Would you feel comfortable sending a 3D scan of your face to Apple? They don’t think you should, which is why FaceID works entirely on-device. A neural network takes the 3D scan, embeds it into a vector, and stores that in the secure enclave; future scans are then compared to it. No facial data ever gets sent to Apple. Similar issues come up all the time in health care. For example, MDAcne is an app that tracks skin health over time with computer vision algorithms running right on the phone.
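The "embed, store, compare" flow described above can be sketched in a few lines of Python. This is only an illustration of comparing two embedding vectors, not Apple’s actual method: the use of cosine similarity, the 0.9 threshold, the function names, and the tiny four-dimensional vectors are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def matches_enrolled(enrolled, candidate, threshold=0.9):
    """Accept the candidate scan if its embedding is close to the stored one.

    The threshold is a made-up value for illustration.
    """
    return cosine_similarity(enrolled, candidate) >= threshold


# Made-up embeddings; a real model would produce much longer vectors.
enrolled = [0.12, 0.80, 0.35, 0.45]       # stored on the device at enrollment
same_person = [0.10, 0.82, 0.33, 0.47]    # a later scan of the same face
someone_else = [0.90, 0.05, 0.40, 0.10]   # a scan of a different face

print("same person accepted:", matches_enrolled(enrolled, same_person))
print("someone else accepted:", matches_enrolled(enrolled, someone_else))
```

The point of the sketch is that only the stored vector and the new vector are needed for the comparison, so nothing has to leave the device.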
Cost
Services like AWS and GCP are pretty cheap, provided you’re using the lower-end CPU and memory configurations. But ML workloads often demand GPUs and lots of memory, even just for inference. This can get expensive really fast. By offloading processing to a user’s device, you also offload the compute and bandwidth costs you’d incur if you were to maintain that backend.
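For a feel of how server-side inference costs scale, here is a back-of-the-envelope Python sketch. The formula is a simplification, and every number in the example (request volume, seconds per request, hourly price, utilization) is a hypothetical placeholder rather than real AWS or GCP pricing.

```python
def monthly_inference_cost(requests_per_month: float,
                           seconds_per_request: float,
                           instance_cost_per_hour: float,
                           utilization: float = 1.0) -> float:
    """Rough monthly cost of serving inference from rented instances.

    utilization is the fraction of paid instance time spent doing useful
    work (0 < utilization <= 1); idle capacity still costs money.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = requests_per_month * seconds_per_request / 3600.0
    billed_hours = busy_hours / utilization
    return billed_hours * instance_cost_per_hour


# Hypothetical workload: 3.6M requests a month, 0.1 s of GPU time each,
# on an instance assumed to cost 1.00 per hour, kept 50% busy.
cost = monthly_inference_cost(3_600_000, 0.1, 1.00, utilization=0.5)
print(f"estimated monthly cost: {cost:.2f}")
```

When the same inference runs on users’ devices, this server-side term drops out entirely, which is the cost argument made above.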
Mobile ML Use Cases
In general, I think you can break down use cases into the following categories:
Creativity Tools
These are ML-powered features in your app that allow users to create great content (photos, video, etc.). I already talked about apps like Snapchat that use on-device ML to power AR effects. A few other examples:
- Momento Gifs: Uses real-time image segmentation to blend AR effects around people in scenes. Running on-device makes the real-time part viable.
- Prisma: Artistic style transfer (there are a bunch of apps like this).
- Octi: Image segmentation and pose estimation to track people in videos for fun effects.
- Panda: AR masks like Snapchat, but with the added ability to hear words via speech recognition. Effects are triggered when words are spoken.
- Meitu: Uses on-device ML to power a number of real-time beautification features.
Core UI / UX
- Superimpose X: Also uses image segmentation to power an auto-masking tool, so users don’t have to cut out people by hand.
- Subreddit Suggester: An example that uses Create ML to train a model that automatically picks a subreddit from your post title, so you don’t have to click so many buttons to submit a new post.
- Homecourt: Featured on stage at the latest Apple iPhone event. Uses pose detection and some other ML models to automatically track analytics for basketball. I think we’ll see apps like this pop up for many sports.
- PlantVillage, as mentioned above.
- Facebook has said they ship their own runtime in every FB app bundle. They use models to do everything from fine-tuning newsfeed rankings on-device to analyzing photos.
- Polarr: Uses on-device ML to help with photo composition, editing, and organization in real time.
Privacy
- FaceID
- MDAcne, as mentioned above.
What’s it like to develop with on-device ML?
It’s still a bit tricky, but it’s getting easier! Here’s a look at developer environments for both iOS and Android.
iOS
On iOS, you’re going to have a much better time because there are higher-level tools, and the performance of models on-device is WAY better thanks to Apple’s chip advantage. To get started, take a look at:
- Create ML: Apple introduced a high-level Core ML training tool in Swift Playgrounds this year. It’s essentially drag-and-drop ML. There’s not as much flexibility, but if you want to do something like text classification or image classification, it’s very easy to get started.
- Turi Create: A Python-based, high-level training framework maintained by Apple. It’s more flexible than Create ML, but you need to know Python.
Android
- ML Kit: Part of Google’s Firebase product, ML Kit offers pre-trained models for things like face detection and OCR. It also offers the ability to create hybrid systems where server-side and on-device models are usable via the same API. Technically ML Kit is cross-platform, but performance on iOS is poor at best.
- TensorFlow Lite / Mobile: If you’re feeling up to it, you can convert raw TensorFlow models for use on-device. TensorFlow Lite is a pared-down, mobile-optimized runtime for executing models in iOS and Android apps. For now, though, support for many operations is limited, and performance is severely lacking. Many Android deployments still use the deprecated TensorFlow Mobile runtime.
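The hybrid idea mentioned for ML Kit, where server-side and on-device models sit behind the same API, can be sketched generically. The Python below is not ML Kit’s API; the `classify` function, the stub models, and the fallback policy are all hypothetical and only show the shape of the pattern.

```python
from typing import Callable, Optional

# A classifier takes raw image bytes and returns a label.
Classifier = Callable[[bytes], str]


def classify(image: bytes,
             on_device: Classifier,
             cloud: Optional[Classifier] = None,
             online: bool = False) -> str:
    """Use the cloud model when it is reachable, otherwise run on-device."""
    if online and cloud is not None:
        try:
            return cloud(image)
        except ConnectionError:
            # The network dropped mid-request: fall back to the local model.
            pass
    return on_device(image)


# Stub models standing in for real ones (hypothetical).
def small_local_model(image: bytes) -> str:
    return "leaf (on-device)"


def large_cloud_model(image: bytes) -> str:
    return "cassava leaf (cloud)"


def unreachable_cloud_model(image: bytes) -> str:
    raise ConnectionError("no route to server")


photo = b"\x00\x01\x02"
print(classify(photo, small_local_model, large_cloud_model, online=True))
print(classify(photo, small_local_model, large_cloud_model, online=False))
print(classify(photo, small_local_model, unreachable_cloud_model, online=True))
```

The caller sees one function either way, and the app keeps working when connectivity is poor because the on-device model is always available as a fallback.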
Other
- Keras: A popular Python-based framework for deep learning, most often used with a TensorFlow backend. I recommend Keras over using TensorFlow directly because it’s simpler, and coremltools has the easiest time converting Keras models to Core ML. The underlying TensorFlow models can also be extracted for conversion to TFLite and TFMobile.
- QNNPACK: A mobile-optimized runtime developed and used by Facebook. Quantized operations written in low-level languages provide fast, CPU-only performance. QNNPACK is compatible with Caffe2 and PyTorch models and ships to every Facebook user via their mobile app.
- ONNX: ONNX began as an open-source, standardized protocol for defining the structure and parameters of neural networks. Microsoft has now added an inference runtime as well.
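Since quantized operations come up in the QNNPACK entry above, here is a minimal Python sketch of what quantizing floats to 8-bit integers means. It shows a generic affine (scale and zero-point) scheme, not QNNPACK’s implementation, and the function names and sample weights are made up for illustration.

```python
def quantize_params(min_val: float, max_val: float, num_bits: int = 8):
    """Pick a scale and zero point that map [min_val, max_val] onto integers."""
    qmax = 2 ** num_bits - 1
    # Make sure 0.0 is inside the range so it maps to an exact integer.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / qmax
    if scale == 0:
        scale = 1.0  # all values are zero; any scale works
    zero_point = int(round(-min_val / scale))
    zero_point = max(0, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale: float, zero_point: int, num_bits: int = 8):
    """Convert floats to unsigned integers, clamping to the valid range."""
    qmax = 2 ** num_bits - 1
    return [max(0, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(quantized, scale: float, zero_point: int):
    """Recover approximate floats from the integer representation."""
    return [(q - zero_point) * scale for q in quantized]


# Made-up weights for illustration.
weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
scale, zero_point = quantize_params(min(weights), max(weights))
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)

print("quantized:", q_weights)
print("max round-trip error:", max(abs(w - r) for w, r in zip(weights, restored)))
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error, which is the trade that makes fast CPU-only inference on phones practical.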
I hope you find this overview helpful! We’d love to hear your thoughts about on-device ML: the good, the bad, the skeptical, etc.