The TensorFlow Dev Summit 2018 was full of exciting announcements and great tech talks. There’s a lot to digest and you can see the full live stream below.
For this post though, I’ve pulled out just the bits I think are most relevant to edge computing and mobile development using TensorFlow.
1. Swift for TensorFlow
In a few years, I believe we’ll look back at Chris Lattner’s talk on Swift for TensorFlow as a major inflection point not just for mobile machine learning, but for machine learning and data science in general.
It’s tempting to interpret this project as just another language binding: a wrapper around the same core C++ libraries that makes it easier to access TF operations from a new language. It’s so much more. In the announcement, Chris and the Swift for TensorFlow team made it clear that they see Swift as the future of data science and machine learning development…on every platform. They are rebuilding TensorFlow in Swift from the ground up, taking advantage of Swift’s static typing and compiler to provide a better imperative programming experience and performance across platforms.
Yes, this will make it easier to put TensorFlow in your iOS apps, but this project aims to take on Python as the main scientific language for building and training models. More to come on this.
2. TensorFlow.js
With all the focus on mobile phones and edge devices like the Raspberry Pi, it’s easy to forget that web apps still have a massive user base. TensorFlow.js uses WebGL to make hardware-accelerated machine learning possible directly in your browser. The project builds on the foundation of deeplearn.js and will make machine learning and AI accessible to a huge segment of frontend developers who may have been held back by the need to learn a new language.
3. TensorFlow Lite
The first talk on the TensorFlow Dev Summit live stream was from the TensorFlow Lite team. They announced a few new features and gave some insight into their roadmap going forward. The first is support for quantization during training. Quantizing the weights in a TensorFlow model (changing floating point parameters to integer representations) both compresses models and speeds up runtime. However, it can reduce the accuracy of models in certain cases because the model used for inference doesn’t match the one you trained with. Now, you’ll be able to simulate quantization during training, which the TF Lite team claims brings accuracy back to non-quantized levels in some cases.
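To make the idea concrete, here is a minimal sketch of weight quantization in plain Python (no TensorFlow): floats are mapped onto a uniform 8-bit integer grid and then recovered approximately. This is an illustration of the general technique, not TF Lite’s actual implementation, and the function names are my own.

```python
def quantize(weights, num_bits=8):
    """Map floats onto a uniform integer grid spanning [min, max]."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** num_bits - 1  # 255 levels for int8-style storage
    scale = (hi - lo) / levels if hi != lo else 1.0
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float weights from the integer representation."""
    return [qi * scale + lo for qi in q]

weights = [0.10, -0.42, 0.87, 0.031, -0.99]
q, scale, lo = quantize(weights)
recovered = dequantize(q, scale, lo)
# Each recovered weight is within half a quantization step of the original;
# that rounding error is exactly the accuracy loss the text describes.
```

Simulating quantization during training means applying a round trip like this in the forward pass, so the network learns weights that survive the rounding.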
The second major piece of the TensorFlow Lite talk was their roadmap for hardware acceleration. On Android, TF Lite takes advantage of the Neural Networks API, which will run ops on GPUs or other AI-specific SoCs when available. On iOS, though, you’re currently limited to the CPU through C++ routines. You could convert your models to Core ML, but in the future, the TF Lite team will be implementing ops in Metal, Apple’s hardware acceleration framework.
The team is also working on on-device training for future releases, which is particularly relevant given the public’s renewed attention to privacy. Lastly, TensorFlow Mobile is officially dead, and all future efforts are being focused on TensorFlow Lite.
4. TensorFlow Hub
AI and machine learning are changing the way software is developed. Instead of writing thousands of lines of procedural code, a developer can train a neural network with tens of millions of parameters. This has implications for the way software engineers do their jobs. You can’t find a neural network on Stack Overflow, tweak a few parameters, and repurpose it for your own custom use case. So much of software engineering is composing and remixing existing code, so how does this work when logic is stored in huge matrices of network weights?
TensorFlow Hub is an interesting glimpse at what the process of software development might look like in the future. It is a giant database of frozen TensorFlow graphs that anyone can download and insert into their own code. For example, consider trying to create a feature like Apple’s Face ID. A system like this uses a neural network to embed an image of a face into a high-dimensional space, where it can be compared to other images of faces to measure similarity. The embedding is learned by training a network on a huge dataset of faces.
With a service like TensorFlow Hub, you can simply download the piece of the trained TensorFlow graph that does the embedding and build your feature around it. Developers can browse for bits and pieces of machine learning pipelines to mix, match, and recombine. In the same way we’re moving to a serverless architecture for many cloud-based applications, this might be a new programming paradigm where bits and pieces of graphs are downloaded and run rather than building and training everything from scratch.
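The “download an embedding, build your feature around it” workflow can be sketched in a few lines of plain Python. Here the embeddings are assumed to come from a pre-trained module of the kind TensorFlow Hub hosts; everything below (the function names, the similarity threshold) is illustrative, not Hub’s actual API.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def same_person(emb_a, emb_b, threshold=0.8):
    # In a Face ID-style system, two face embeddings whose similarity
    # clears a tuned threshold are treated as the same identity.
    # The threshold value here is an arbitrary placeholder.
    return cosine_similarity(emb_a, emb_b) >= threshold
```

The downloaded graph does the hard part (mapping a face image to a vector); your application code is reduced to comparisons like these.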
It’s an incredibly interesting time to be a software developer!