What’s new in Core ML 3

Though it didn’t get a ton of stage time during the keynote at this year’s WWDC, there’s a lot to be excited about in the latest iteration of Apple’s machine learning framework, Core ML 3.

In this post, I’ll highlight the biggest changes to the software and discuss their implications for developers and machine learning engineers.

On-device training is here

The biggest addition to Core ML is the introduction of on-device training. Prior to this release, Core ML supported inference only.

Training was done server-side with a framework like TensorFlow or PyTorch, and models were converted to Core ML in order to make predictions in an app. Core ML 3 changes that, becoming the first widely available machine learning framework to support both inference and training directly on-device.

Training is accomplished with a combination of mlmodel attributes and API calls. An isUpdatable flag can be added to models and individual layers to denote where parameter updates can be made.

Definitions for training inputs, outputs, and loss functions, as well as hyperparameters like learning rate, can be set in the mlmodel’s proto file.

Developers can use MLUpdateTask to perform the training itself. Apple provides MLUpdateProgressHandlers, which act like Keras callbacks: they allow arbitrary code to run at various points during training, such as when an epoch ends. Because some updates may take longer than a few seconds, Core ML tasks can now run in the background for a short period of time.
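To make the callback idea concrete, here’s a minimal sketch in plain Python. It is not the Core ML API: the event names, the `run_update` function, and the toy one-weight model are all hypothetical, chosen only to show how handlers registered for an event get called as training progresses.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical event names for this sketch; they are not Core ML identifiers.
EPOCH_END = "epoch_end"
TRAINING_END = "training_end"

Handler = Callable[[Dict[str, float]], None]


def run_update(
    samples: List[Tuple[float, float]],
    weight: float,
    learning_rate: float,
    epochs: int,
    handlers: Dict[str, List[Handler]],
) -> float:
    """Fit y = weight * x with per-sample gradient descent, firing handlers."""
    context: Dict[str, float] = {"epoch": -1, "loss": float("nan"), "weight": weight}
    for epoch in range(epochs):
        squared_error = 0.0
        for x, y in samples:
            error = weight * x - y
            squared_error += error * error
            weight -= learning_rate * 2.0 * error * x  # gradient of error**2
        context = {"epoch": epoch, "loss": squared_error / len(samples), "weight": weight}
        # Run every handler registered for the end of an epoch.
        for handler in handlers.get(EPOCH_END, []):
            handler(context)
    # Run handlers registered for the end of the whole update.
    for handler in handlers.get(TRAINING_END, []):
        handler(context)
    return weight


# Usage: record the loss after each epoch, much like a logging callback.
loss_log: List[float] = []
final_weight = run_update(
    samples=[(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    weight=0.0,
    learning_rate=0.05,
    epochs=50,
    handlers={EPOCH_END: [lambda ctx: loss_log.append(ctx["loss"])]},
)
```

The training loop knows nothing about what the handlers do, which is what makes the pattern handy for logging, early stopping, or UI updates.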

Apple telegraphed their intended use case for on-device training by talking almost exclusively about “model personalization”. Rather than using a one-size-fits-all model for everyone, developers are encouraged to tailor models and experiences to individual users.

In a demo during the Core ML breakout session, an Apple engineer created a shortcut feature for teachers using a paper grading app. A teacher marking an assignment with an Apple Pencil could define custom sketches that would be recognized by a model and transformed into emoji stickers.

New shortcuts were made by drawing a few example sketches, which were used to train a small k-nearest neighbors model to predict the desired emoji to output.
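A k-nearest neighbors model suits this kind of personalization because “training” amounts to storing a few labeled examples. The sketch below is a generic kNN classifier in plain Python, not Apple’s implementation. The class name, the two-number feature vectors, and the labels are made up for illustration; in the demo, the features would presumably come from a feature extractor applied to each sketch.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


class NearestNeighborsClassifier:
    """Toy k-nearest neighbors classifier using Euclidean distance."""

    def __init__(self, k: int = 3) -> None:
        self.k = k
        self.examples: List[Tuple[List[float], str]] = []

    def add_example(self, features: Sequence[float], label: str) -> None:
        # "Training" is just remembering the example.
        self.examples.append((list(features), label))

    def predict(self, features: Sequence[float]) -> str:
        if not self.examples:
            raise ValueError("no examples have been added yet")
        # Sort stored examples by distance to the query and keep the k closest.
        nearest = sorted(
            self.examples, key=lambda example: math.dist(example[0], features)
        )[: self.k]
        # Majority vote among the k closest labels.
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Usage: three example drawings per shortcut (hypothetical feature vectors).
classifier = NearestNeighborsClassifier(k=3)
for point in [(0.9, 0.1), (1.0, 0.2), (0.8, 0.0)]:
    classifier.add_example(point, "star")
for point in [(0.1, 0.9), (0.0, 1.0), (0.2, 0.8)]:
    classifier.add_example(point, "heart")
prediction = classifier.predict((0.85, 0.1))
```

Adding a new shortcut is one more `add_example` call per drawing, with no gradient descent involved.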

While model personalization is one use case for on-device training, it’s not the only one. Recent research into distributed training techniques like federated learning has demonstrated the potential to train accurate models from scratch using small updates from many devices. Core ML 3 opens the door to implementing these approaches on Apple devices and their powerful hardware.
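For readers new to federated learning, here’s a rough sketch of its best-known recipe, federated averaging: each device improves a copy of the shared model on its own data, and a server averages the results, weighted by how much data each device had. This is a plain-Python toy with a linear model. The function names and the datasets are assumptions for illustration, and nothing here is part of Core ML.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (features, target)


def local_update(
    global_weights: Sequence[float], samples: List[Sample], learning_rate: float = 0.05
) -> List[float]:
    """One pass of gradient descent on a single device's private data."""
    weights = list(global_weights)
    for features, target in samples:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        for i, x in enumerate(features):
            weights[i] -= learning_rate * 2.0 * error * x
    return weights


def federated_average(updates: List[Tuple[List[float], int]]) -> List[float]:
    """Average device weights, weighted by each device's sample count."""
    total_samples = sum(count for _, count in updates)
    dimension = len(updates[0][0])
    return [
        sum(weights[i] * count for weights, count in updates) / total_samples
        for i in range(dimension)
    ]


def run_rounds(
    global_weights: List[float], device_datasets: List[List[Sample]], rounds: int
) -> List[float]:
    """Repeat: send weights out, train locally, average what comes back."""
    for _ in range(rounds):
        updates = [
            (local_update(global_weights, data), len(data)) for data in device_datasets
        ]
        global_weights = federated_average(updates)
    return global_weights


# Usage: two devices whose data both follow y = 3 * x.
device_a = [([1.0], 3.0), ([2.0], 6.0)]
device_b = [([3.0], 9.0)]
trained = run_rounds([0.0], [device_a, device_b], rounds=30)
```

Only weights ever leave a device in this scheme; the raw samples stay where they were collected.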

Training models on-device has major benefits for both users and developers. Users get more personal experiences that improve over time. And because everything runs on-device, it works with or without internet connectivity.

Most importantly, though, user data stays safely on-device and is never transferred to a third party. Developers don’t need to manage large analytics clusters for training models or deal with transferring and securing training data.

Realizing the benefits of on-device training also requires some planning and technical choices. Mobile devices won’t have the same compute or memory resources as large cloud clusters, so it’s important to think carefully about what to update and when.

Rather than updating weights for every layer in a large model, Core ML 3 allows specific layers to be tagged as updatable. On-device training architectures should make use of global, static feature extractors and smaller, updatable blocks at the top of models for personalization.
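Here’s a small sketch of that idea: a frozen feature extractor plus an updatable head, where a gradient step skips any layer that isn’t flagged. The `Layer` class, the `is_updatable` field (named after the isUpdatable flag), and the layer names are hypothetical stand-ins for illustration, not Core ML types.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Layer:
    name: str
    weights: List[float]
    is_updatable: bool = False  # stand-in for the idea behind isUpdatable


def apply_gradients(
    layers: List[Layer], gradients: Dict[str, List[float]], learning_rate: float
) -> None:
    """Take one gradient step, touching only layers flagged as updatable."""
    for layer in layers:
        if not layer.is_updatable:
            continue  # frozen layer: leave its weights alone
        layer.weights = [
            w - learning_rate * g for w, g in zip(layer.weights, gradients[layer.name])
        ]


def count_updatable_weights(layers: List[Layer]) -> int:
    """Number of weights that on-device training would actually change."""
    return sum(len(layer.weights) for layer in layers if layer.is_updatable)


# Usage: a static extractor and a small personalizable head.
model = [
    Layer("feature_extractor", [1.0, 2.0]),
    Layer("classifier_head", [0.5, -0.5], is_updatable=True),
]
apply_gradients(
    model,
    {"feature_extractor": [1.0, 1.0], "classifier_head": [1.0, -1.0]},
    learning_rate=0.5,
)
```

Keeping the updatable portion small limits both the compute per update and the amount of per-user state an app has to store.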

New layers, more possibilities

In addition to on-device training, Core ML 3 also brings support for a number of new architectures, layer types, and operations that open the door for complex models and use cases. These updates aren’t always flashy, but they make a huge difference in the framework’s utility.

New models introduced in Core ML 3 include:

NearestNeighbors.proto — Nearest neighbors classifiers (kNN) are simple, efficient models for labeling and clustering that work great for model personalization.
ItemSimilarityRecommender.proto — Tree-based models that take a list of items and scores to predict similarity scores between all items that can be used to make recommendations.
SoundAnalysisPreprocessing.proto — Built-in operations to perform common pre-processing tasks on audio. For example, transforming waveforms into the frequency domain.
LinkedModels.proto — Shared backbones and feature extractors that can be reused across multiple models.

These new models enable use cases beyond computer vision and provide additional benefits for developers using multiple models in the same application.
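As an illustration of the audio example above, transforming a waveform into the frequency domain, here’s a naive discrete Fourier transform in plain Python. It only shows what such a transformation computes. I’m not claiming this is how SoundAnalysisPreprocessing works internally, and the function name is my own.

```python
import cmath
import math
from typing import List, Sequence


def magnitude_spectrum(waveform: Sequence[float]) -> List[float]:
    """Naive O(n^2) DFT returning magnitudes for bins 0 through n // 2."""
    n = len(waveform)
    magnitudes = []
    for k in range(n // 2 + 1):
        # Correlate the waveform with a complex sinusoid at frequency bin k.
        total = sum(
            sample * cmath.exp(-2j * math.pi * k * t / n)
            for t, sample in enumerate(waveform)
        )
        magnitudes.append(abs(total))
    return magnitudes


# Usage: a sine wave that completes exactly 2 cycles over 8 samples.
wave = [math.sin(2 * math.pi * 2 * t / 8) for t in range(8)]
spectrum = magnitude_spectrum(wave)
loudest_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
```

For this input the energy concentrates in bin 2, matching the two cycles in the waveform. Production code would use a fast Fourier transform rather than this quadratic loop.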

Core ML 3 officially supports over 100 neural network layer types. As an appendix to this article, I’ve included a comprehensive list of new layers added to the NeuralNetwork.proto file. You can guess what most of them do from their names, but for detailed descriptions, check out this fantastic deep dive by Matthijs Hollemans.

The most exciting change is support for many more NumPy-like operations for manipulating MLMultiArrays.

This will make it much easier to port complex pre- and post-processing into mobile apps. Core ML also now supports dynamic graphs, including loops and branches.
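Many of the new layers carry “Broadcastable” in their names, so it’s worth recalling how NumPy-style broadcasting decides whether two shapes are compatible. The sketch below implements NumPy’s rule in plain Python. I’m assuming the Core ML layers follow the same convention, which the “NumPy-like” description suggests but which you should confirm against the layer documentation.

```python
from itertools import zip_longest
from typing import Sequence, Tuple


def broadcast_shape(a: Sequence[int], b: Sequence[int]) -> Tuple[int, ...]:
    """Return the broadcast result shape under NumPy's rule, or raise ValueError."""
    result = []
    # Compare dimensions from the trailing end; missing dimensions count as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"shapes {tuple(a)} and {tuple(b)} do not broadcast")
    return tuple(reversed(result))


# Usage: a column and a row combine into a full grid,
# and a per-channel vector stretches across a batch of images.
grid = broadcast_shape((3, 1), (1, 4))
batch = broadcast_shape((5, 3, 4), (4,))
```

Two dimensions are compatible when they are equal or when one of them is 1, which is why a bias vector can be added to a whole batch without copying it first.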

The upshot of all this hard work by the Core ML team is that most of the state-of-the-art models making headlines over the past year are now fully compatible with Core ML.

Many of these architectures still need to be shrunk and optimized for mobile use (the weights for a fully trained BERT model can be over 1GB in size), but it’s exciting to see so many possibilities available.
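The arithmetic behind that size is simple: parameter count times bytes per weight. The sketch below uses roughly 340 million parameters, the approximate figure reported for the larger BERT configuration; treat it as a ballpark. The helper name is my own.

```python
def model_size_mb(num_parameters: int, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in megabytes (1 MB = 10**6 bytes)."""
    return num_parameters * bits_per_weight / 8 / 1_000_000


# Approximate parameter count for the larger BERT configuration (an assumption).
BERT_LARGE_PARAMETERS = 340_000_000

full_precision = model_size_mb(BERT_LARGE_PARAMETERS, 32)  # 32-bit floats
half_precision = model_size_mb(BERT_LARGE_PARAMETERS, 16)  # 16-bit floats
eight_bit = model_size_mb(BERT_LARGE_PARAMETERS, 8)        # 8-bit quantized
```

At 32 bits per weight this works out to about 1,360 MB, and each halving of precision halves the size, which is why quantization is usually the first step when shrinking a model for mobile.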

Finally, the addition of new layers means that conversion tools have also gotten more robust. Converting models from Keras, TensorFlow, and PyTorch should be a much smoother process with fewer custom workarounds.

Implications

This release marks a major step forward for machine learning in the Apple ecosystem, and there are a number of implications for developers.

Core ML is ready to move beyond computer vision. Image-related tasks have dominated deep learning, and specifically mobile deep learning, for a few years now. Support for audio preprocessing, generic recommender models, and the complex operations required for this year’s crop of NLP models promises to change that. Developers should start thinking about ML-powered experiences for users that go beyond the camera.
On-device training will demand new UX and design patterns. How much data is needed to personalize a model with sufficient accuracy? What’s the best way to solicit training data from users? How often should this be done? As ML moves closer and closer to core application logic, developers need to think about how these features are communicated to users.
Personalized models will need persistence and syncing. Training data for personalized models will remain on-device, but the model itself may need to be stored elsewhere. If a user deletes then re-installs an app or wants to use the same app on multiple devices, their personalization should go with them. Developers will need systems to back up and sync models.
It’s now possible to do end-to-end machine learning and skip Python. Python has been the preferred programming language of ML engineers for nearly a decade now. With the ability to train models, Core ML + Swift is now a viable alternative for some projects. Will mobile developers skip Python entirely and opt for a language they already know? Time will tell.

Resources

For more information on Core ML 3, check out the following resources.

Appendix — New Core ML 3 Layers

Control Flow Layers:

CopyLayer
BranchLayer
LoopLayer
LoopBreakLayer
LoopContinueLayer
RangeStaticLayer
RangeDynamicLayer

Elementwise Unary Layers:

ClipLayer
CeilLayer
FloorLayer
SignLayer
RoundLayer
Exp2Layer
SinLayer
CosLayer
TanLayer
AsinLayer
AcosLayer
AtanLayer
SinhLayer
CoshLayer
TanhLayer
AsinhLayer
AcoshLayer
AtanhLayer
ErfLayer
GeluLayer

Elementwise Binary with Broadcasting Support

EqualLayer
NotEqualLayer
LessThanLayer
LessEqualLayer
GreaterThanLayer
GreaterEqualLayer
LogicalOrLayer
LogicalXorLayer
LogicalNotLayer
LogicalAndLayer
ModBroadcastableLayer
MinBroadcastableLayer
MaxBroadcastableLayer
AddBroadcastableLayer
PowBroadcastableLayer
DivideBroadcastableLayer
FloorDivBroadcastableLayer
MultiplyBroadcastableLayer
SubtractBroadcastableLayer

Tensor Manipulations

TileLayer
StackLayer
GatherLayer
ScatterLayer
GatherNDLayer
ScatterNDLayer
SoftmaxNDLayer
GatherAlongAxisLayer
ScatterAlongAxisLayer
ReverseLayer
ReverseSeqLayer
SplitNDLayer
ConcatNDLayer
TransposeLayer
SliceStaticLayer
SliceDynamicLayer
SlidingWindowsLayer
TopKLayer
ArgMinLayer
ArgMaxLayer
EmbeddingNDLayer
BatchedMatMulLayer

Tensor Allocation / Reshape Operations

GetShapeLayer
LoadConstantNDLayer
FillLikeLayer
FillStaticLayer
FillDynamicLayer
BroadcastToLikeLayer
BroadcastToStaticLayer
BroadcastToDynamicLayer
SqueezeLayer
ExpandDimsLayer
FlattenTo2DLayer
ReshapeLikeLayer
ReshapeStaticLayer
ReshapeDynamicLayer
RankPreservingReshapeLayer

Random Distributions

RandomNormalLikeLayer
RandomNormalStaticLayer
RandomNormalDynamicLayer
RandomUniformLikeLayer
RandomUniformStaticLayer
RandomUniformDynamicLayer
RandomBernoulliLikeLayer
RandomBernoulliStaticLayer
RandomBernoulliDynamicLayer
CategoricalDistributionLayer

Reduction related Layers:

ReduceL1Layer
ReduceL2Layer
ReduceMaxLayer
ReduceMinLayer
ReduceSumLayer
ReduceProdLayer
ReduceMeanLayer
ReduceLogSumLayer
ReduceSumSquareLayer
ReduceLogSumExpLayer

Masking / Selection Layers

WhereNonZeroLayer
MatrixBandPartLayer
LowerTriangularLayer
UpperTriangularLayer
WhereBroadcastableLayer

Normalization Layers

LayerNormalizationLayer


Fritz
