The Benefits of On-Device ML

Mobile machine learning delivers much-needed benefits

If you’ve been following the progress of artificial intelligence and machine learning in recent years, you’ve almost assuredly learned about its groundbreaking potential, its rapid adoption across industries, and the sweeping investments being made in both research and application.

But there’s one particular area of this burgeoning industry that, despite surges of research and focused investment from major players like Apple and Google, continues to fly somewhat under the radar: mobile machine learning. Or, more specifically—mobile applications that leverage machine learning models (computer vision, natural language processing) to facilitate immersive and powerful features and user experiences.

If we zoom out a bit, we can see this industry shift with more clarity. AI tasks that were once only practical in the cloud are now being pushed to the edge as mobile devices become more powerful, with specialized AI hardware and capabilities.

This raises an important question, however: why would we actually want to perform machine learning tasks on these devices when cloud servers are already so capable? In this guide, we’ll cover four primary benefits:

  • Lower latency
  • Increased security and privacy
  • Increased reliability
  • Reduced costs

These benefits get at the core of what both consumers and businesses desire and need from machine learning:

  • Consumers want and expect seamless, personalized experiences that work without interruption, often in real-time.
  • Businesses want to streamline processes across departments, reduce costs, and transform their operations for increased efficiency.

Mobile machine learning is uniquely positioned to deliver these benefits to both groups. This is especially true given that the number of edge and mobile devices running AI inference on-device is expected to reach 3.7 billion by the end of 2020.

Mobile developers have a lot to gain from the changes that on-device machine learning can offer. The technology can bolster mobile applications, allowing for smoother user experiences that leverage powerful features, such as providing instant skin care diagnoses or detecting plant diseases on the spot.

This rapid development of mobile machine learning has come about as a response to a number of common issues that classical machine learning has struggled with. In truth, the writing is on the wall: future mobile apps will require faster processing speeds and lower latency.

You might be wondering why AI-first mobile applications can’t simply run inference in the cloud. For one, cloud technologies rely on central nodes (imagine a massive data center with large quantities of storage space and computing power), and such a centralized approach struggles to deliver the response times needed for smooth, ML-powered mobile experiences. Data must be sent to this centralized data center, processed, and then sent back down to the device. This takes time and money, and it makes data privacy hard to guarantee.

Having outlined these core benefits of mobile machine learning, let’s explore in more detail why, as a mobile app developer, ML engineer, or product manager, you’ll want to keep your eyes peeled for the incoming on-device ML revolution.

Part 1: Lower Latency

Mobile app developers know that high latency can be the death knell for any app, regardless of how strong its features are or how reputable the brand is. Lagging video, slow image processing speeds, and otherwise sputtering UX elements can turn an incredible idea for a mobile app into a burden for the end user.

The need to keep latency low becomes evident as soon as you consider implementing machine learning features on-device. Consider an app that tracks human movement in real time, or one that needs to separate the foreground from its background in live video. These are app features that need to be delivered instantaneously and without lag.

In cloud-based systems, data transfer becomes a bottleneck. On-device machine learning removes the need to stream data to the cloud and back, paving the way for near-zero latency, especially as AI-accelerated hardware becomes more baked into the newest generations of smartphones.
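
To make the bottleneck concrete, here is a minimal back-of-the-envelope sketch in Python. It compares a cloud round trip with on-device inference against the time budget of a single video frame. The function names and every timing below are illustrative assumptions, not measurements from any real network or device.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame at the given frame rate."""
    return 1000.0 / fps


def cloud_latency_ms(upload_ms: float, server_inference_ms: float, download_ms: float) -> float:
    """Total round-trip time when inference runs on a remote server."""
    return upload_ms + server_inference_ms + download_ms


def fits_budget(latency_ms: float, fps: float) -> bool:
    """True if a result arrives before the next frame is due."""
    return latency_ms <= frame_budget_ms(fps)


# Hypothetical numbers, chosen only to illustrate the comparison.
cloud_ms = cloud_latency_ms(upload_ms=40.0, server_inference_ms=10.0, download_ms=40.0)
on_device_ms = 25.0

print(f"Frame budget at 30 fps: {frame_budget_ms(30):.1f} ms")
print(f"Cloud round trip: {cloud_ms:.1f} ms, fits: {fits_budget(cloud_ms, 30)}")
print(f"On-device: {on_device_ms:.1f} ms, fits: {fits_budget(on_device_ms, 30)}")
```

Under these assumed numbers, even a fast server cannot make up for the time spent moving data over the network, while the slower on-device model still returns its result within the frame budget.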

Apple has been leading on this front, developing more advanced smartphone chips in its Bionic series, which include an integrated Neural Engine that helps neural networks run directly on-device at high speed. Other chip manufacturers like Samsung and Huawei have followed suit, and Android devices like Google’s Pixel 4 now also come with dedicated neural cores. As the need for on-device processing increases, we should see these major players continue to improve their chipsets and AI-focused hardware.

Apple also continues to iterate on Core ML, its machine learning platform for mobile developers; TensorFlow Lite has added support for GPUs; and PyTorch has entered the on-device fray with PyTorch Mobile, its cross-platform mobile ML solution. These technologies are among the ones mobile developers can use to build applications capable of processing data at lightning speeds, reducing latency and errors.

Part 2: Increased Security and Privacy

Another huge benefit of edge computing is how it can increase data security and privacy for users. Ensuring the protection and privacy of an app’s data is an integral part of a mobile developer’s job, especially given the need to comply with the General Data Protection Regulation (GDPR) and other new privacy laws that affect mobile development practices.

Because data doesn’t need to be sent to or stored on a server or the cloud for processing, cybercriminals have fewer opportunities to exploit vulnerabilities in data transfer and storage, which helps preserve the integrity of the data. This allows mobile developers to meet GDPR requirements on data security more easily.

In recent months, we’ve seen the pitfalls of processing sensitive mobile user data in the cloud. In the summer of 2019, FaceApp’s incredibly accurate and convincing old-age filters, which gave users a high-fidelity preview of what they might look like in 50 years, came under intense scrutiny related to user data privacy.

Despite impressive results, this powerful ML feature works by sending personally-identifying data to cloud servers for some intensive processing. Once the image leaves a user’s device, they no longer have control over sensitive biometric data that can be repurposed without knowledge or consent.

While there’s no evidence FaceApp was actively engaging in this practice, the result was a round of negative press and suspicion about the intentions behind the app. Whether these concerns were ultimately warranted or not, the fallout was swift, and the implications for data privacy clear.

On the other hand, a few months later, Snapchat released a similar gender-bending experience using their powerful on-device models. Snapchat’s on-device version of this experience didn’t expose private user data without user consent. All of the processing happened on the device itself, so images and videos that users chose not to share remained private.

Apple’s Face ID is another example of the privacy on-device AI can offer. This iPhone feature relies on an on-device neural net that gathers data on all the different ways its user’s face may look, serving as a more accurate and secure identification method.

In addition to current applications that benefit from the privacy afforded by on-device ML, there are also new techniques that are helping make on-device model training (and not just inference) possible. Federated learning is one such technique. Ground-truth data that lives locally on a user’s device is used to train a local model. These local updates are then sent to the cloud and aggregated to update a global model, which can then be passed back to users. Users get all of the benefits of ML-powered experiences while keeping data in their hands.
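
The round trip described above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the model is a single weight fit to y = w * x, each client's data is made up, and the server combines updates with a sample-weighted average (the federated averaging idea). The names local_update and federated_round are hypothetical, and production systems add steps such as secure aggregation and client sampling that are omitted here.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair that never leaves the device


def local_update(weight: float, data: List[Sample], lr: float = 0.01, epochs: int = 5) -> float:
    """Train on one device: a few passes of gradient descent on squared error."""
    w = weight
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w


def federated_round(global_weight: float, client_datasets: List[List[Sample]]) -> float:
    """One round: every client trains locally, then the server averages the weights."""
    updates = [(local_update(global_weight, data), len(data)) for data in client_datasets]
    total_samples = sum(n for _, n in updates)
    # Only the trained weights are aggregated, weighted by local sample count.
    return sum(w * n for w, n in updates) / total_samples


# Made-up local datasets, all generated from y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)

print(f"Global weight after 20 rounds: {w:.4f}")
```

The point to notice is what crosses the network: each client sends back a single trained weight, never its (x, y) samples, and the global weight still moves toward the value that fits everyone's data.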

This focus on data security and privacy should only grow in coming years, and embedding neural networks on-device will pave the way for more secure smartphone experiences for users, offering mobile developers an additional layer of protection for users’ data.

Part 3: Increased Reliability

Beyond issues with latency and privacy, sending data to the cloud to be processed for inference requires a fast and active internet connection. Oftentimes, this works just fine in more developed parts of the world with high connectivity. But what about in areas of low connectivity?

With on-device machine learning, neural networks live on the phones themselves, ensuring they are always available regardless of connectivity. This promises to democratize ML features: users won’t need an internet connection to use them, and the phones that run them are becoming more widely accessible all the time.

This benefit becomes especially clear when considering applications intended to be used in rural or other remote areas with poor connectivity. Consider a rural health clinic with spotty internet access. If this clinic is working with AI-powered medical devices, which are often cheaper in the long run and help provide more efficient care, then the clinicians need to trust that these devices won’t go offline and malfunction in the middle of a procedure or diagnostic test. Running AI models on the devices themselves eliminates this concern.

On-device machine learning will ultimately democratize the technology, giving mobile developers the tools to create applications that can benefit users from all areas of the world, regardless of what their connectivity situation looks like. And even without an internet connection, given that newer smartphones are as powerful as they are, users won’t be plagued with latency issues when using an application in an offline environment.

Part 4: Reduced Costs

On-device machine learning can also save you money, as you won’t have to pay external providers to run these solutions or maintain them in-house. The expensive expertise, time, and patience needed to implement and maintain end-to-end ML systems put them out of reach for all but the largest companies and organizations.

It should be noted that, for all intents and purposes, training neural networks for mobile still needs to happen in the cloud, given the processing and compute resources required to train state-of-the-art architectures. Specifically, GPUs and AI-accelerated hardware are among the most expensive cloud services you can purchase.

However, running model inference on-device means you don’t need to pay for these clusters when the model is running in the real world, thanks to the increasingly sophisticated Neural Processing Units (NPUs) smartphones have these days.

Avoiding the heavy, data-processing nightmare between mobile and the cloud during inference can be a huge cost-saver for businesses that choose on-device machine learning solutions, especially as they scale and need to serve hundreds of thousands or millions of active users. These bandwidth costs, saved by implementing ML on-device, can add up quickly.
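
To see how these bandwidth costs scale, here is a rough estimate sketched in Python. Everything in it is an assumption made for illustration: the function names are hypothetical, and the user count, request rate, payload size, and per-GB price are made-up inputs rather than figures from any provider.

```python
def monthly_transfer_gb(active_users: int, requests_per_user_per_day: float,
                        payload_mb: float, days: int = 30) -> float:
    """Data sent to the cloud for inference in one month, in decimal gigabytes."""
    total_mb = active_users * requests_per_user_per_day * payload_mb * days
    return total_mb / 1000.0  # 1 GB = 1000 MB


def monthly_bandwidth_cost(transfer_gb: float, price_per_gb: float) -> float:
    """Cost of moving that data at a flat (hypothetical) per-GB price."""
    return transfer_gb * price_per_gb


# Hypothetical app: 1M active users, 10 images a day, 0.5 MB per image.
transfer = monthly_transfer_gb(active_users=1_000_000,
                               requests_per_user_per_day=10,
                               payload_mb=0.5)
cost = monthly_bandwidth_cost(transfer, price_per_gb=0.05)  # assumed price

print(f"Monthly transfer: {transfer:,.0f} GB")
print(f"Monthly bandwidth cost: {cost:,.2f}")
```

With these made-up inputs, the app would move about 150,000 GB per month for inference alone. Running the model on-device removes that traffic, and the server compute behind it, from the bill.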

Mobile developers also save greatly on the development process, as they won’t have to build and maintain additional cloud infrastructure. Instead, they can achieve more with a smaller engineering team, thus allowing them to scale their development teams more efficiently.

Additional Resources

There’s no question that the cloud has been a boon to data and computing in the 2010s, but the tech industry evolves at an exponential rate, and on-device machine learning may soon be the standard in mobile application and IoT development.

Thanks to its reduced latency, enhanced security, offline capabilities, and reduction in costs, it’s no wonder that all the major players in the industry are betting big on the technology, which will define how mobile developers approach app creation moving forward.

If you’re interested in learning more about mobile machine learning, how it works, and why it matters in the overall mobile development landscape, here are a few additional resources to help get you started: