Federated learning, transfer learning, and model personalization
For a healthcare research project I’m working on, I’m looking for a federated learning platform that supports mobile and wearable devices, in particular in the Apple ecosystem.
Federated learning represents a tremendous opportunity for the adoption of machine learning in many use cases, especially where efficiency and privacy concerns require us to distribute the training process instead of centrally collecting data in the cloud and applying traditional ML pipelines.
There are some fantastic toolkits already available (TF-Federated, for example, is emerging to address this scenario), but all of these existing solutions are based on Python, and there still remains the issue of delivering real, workable integrations with mobile and wearable devices.
I love Python, don’t get me wrong. And of course, Python is great for cloud AI, but it isn’t an option on these devices in scenarios where distributed learning is needed—like mine.
Without going too deep into how federated learning works (distributing training and validation across devices and centrally converging on model weights instead of on data), I began by simplifying the scenario considerably, starting from a more generic transfer learning need.
I simply started with the trivial need to pre-train a model centrally, with some shared data, and let devices continue the training with local, private data, directly on-device. Apple calls this model personalization, a new and amazing feature of the Core ML 3 framework, available on iOS 13.
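For context, the "converging on model weights instead of on data" step mentioned above is commonly done with federated averaging. The following is only a minimal sketch in plain Swift, not the platform I am looking for: the ClientUpdate type and the federatedAverage function are hypothetical names introduced for illustration, and a real system would add secure transport, validation, and many rounds of training.

```swift
// Hypothetical type for illustration; not part of any real federated learning SDK.
struct ClientUpdate {
    let weights: [Float]   // locally trained parameters, flattened
    let sampleCount: Int   // number of local training examples
}

/// Federated averaging: weight each client's parameters by its share of the data.
func federatedAverage(_ updates: [ClientUpdate]) -> [Float] {
    guard let first = updates.first else { return [] }
    let total = Float(updates.reduce(0) { $0 + $1.sampleCount })
    guard total > 0 else { return first.weights }

    var averaged = [Float](repeating: 0, count: first.weights.count)
    for update in updates {
        precondition(update.weights.count == averaged.count,
                     "all clients must share one model shape")
        let share = Float(update.sampleCount) / total
        for (i, w) in update.weights.enumerated() {
            averaged[i] += share * w
        }
    }
    return averaged
}

// Two simulated devices; only their weights and sample counts reach the server.
let global = federatedAverage([
    ClientUpdate(weights: [2.0, 1.0], sampleCount: 100),
    ClientUpdate(weights: [4.0, 3.0], sampleCount: 300),
])
print(global) // prints [3.5, 2.5]
```

The private data never leaves the devices here; the server only sees the parameters and how many samples produced them.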
Having a single language to manage training pipelines both in the cloud and on-device is really important for sharing a common codebase. This is true not just for the model definition itself, but also for all the pre- and post-processing data transformation and featurization code.
In this case, Swift brings a tremendous opportunity to share most of the code between the cloud and devices. It also helps in simulating some of the basic requirements for this federated learning platform, with cloud training specialized on the Swift for TensorFlow toolchain and on-device training optimized with Core ML.
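As an illustration of what sharing pre- and post-processing code can look like, here is a small sketch of a featurization type written in plain Swift, with no dependency on either TensorFlow or Core ML, so the same source file could in principle be compiled by both toolchains. The Standardizer name and its API are assumptions made up for this example.

```swift
/// Hypothetical shared preprocessing type; the name and fields are illustrative.
struct Standardizer {
    let mean: Float
    let std: Float

    /// Fit the statistics once, for example on the shared central data.
    init(fitting values: [Float]) {
        precondition(!values.isEmpty, "need at least one value to fit")
        let n = Float(values.count)
        let m = values.reduce(Float(0), +) / n
        let variance = values.reduce(Float(0)) { $0 + ($1 - m) * ($1 - m) } / n
        mean = m
        // Fall back to 1 so constant inputs do not divide by zero.
        std = variance > 0 ? variance.squareRoot() : 1
    }

    /// Pre-processing: map a raw value to zero mean and unit variance.
    func transform(_ value: Float) -> Float {
        return (value - mean) / std
    }

    /// Post-processing: map a standardized value back to the raw scale.
    func inverse(_ value: Float) -> Float {
        return value * std + mean
    }
}

let scaler = Standardizer(fitting: [1, 2, 3, 4, 5])
let z = scaler.transform(3)
let roundTrip = scaler.inverse(scaler.transform(5))
print(z, roundTrip)
```

Because the type only uses the Swift standard library, the cloud training code and the on-device code can apply exactly the same transformation to their inputs.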
Hacking a trivial model with distributed training in Swift
In my healthcare scenario, I’m dealing with an auto-encoder model architecture to manage anomaly detection on biometric data. To simplify the technical investigation, I started with a very trivial neural network with a single layer and no activation function. Basically, it’s a super simple linear regression, trained on some noisy linear data both with the Swift for TensorFlow toolchain and with Core ML on the regular Apple Swift 5.1 toolchain.
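To make explicit why a single dense layer without activation is just a linear regression, here is a minimal sketch in plain Swift of the same computation: the model is y = w * x + b and training is gradient descent on the mean squared error. It is not the S4TF or Core ML code used in this article; the data is noise-free and the learning rate and iteration count are arbitrary choices for the illustration.

```swift
// Noise-free samples of the line y = 2x + 1.5 on [0, 1).
let xs: [Float] = (0..<100).map { Float($0) / 100 }
let ys: [Float] = xs.map { 2.0 * $0 + 1.5 }

// The whole "network": one weight and one bias.
var w: Float = 0
var b: Float = 0
let learningRate: Float = 0.1

func meanSquaredError(_ w: Float, _ b: Float) -> Float {
    var sum: Float = 0
    for (x, y) in zip(xs, ys) {
        let e = (w * x + b) - y
        sum += e * e
    }
    return sum / Float(xs.count)
}

let initialLoss = meanSquaredError(w, b)

for _ in 0..<2000 {
    var gradW: Float = 0
    var gradB: Float = 0
    for (x, y) in zip(xs, ys) {
        let error = (w * x + b) - y
        gradW += 2 * error * x   // d(error^2)/dw
        gradB += 2 * error       // d(error^2)/db
    }
    // Full-batch gradient descent step on the averaged gradients.
    w -= learningRate * gradW / Float(xs.count)
    b -= learningRate * gradB / Float(xs.count)
}

let finalLoss = meanSquaredError(w, b)
print("w: \(w), b: \(b), loss: \(finalLoss)")
```

With enough iterations, w and b approach the slope and intercept used to generate the data, which is what the single Dense layer learns as its weight and bias.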
In my test, I started defining and training the model using the Swift for TensorFlow (S4TF) layer API, and then I extracted the weights and bias of the single layer of my model. Then, using a Core ML protobuf data structure, I recreated and exported the model in the Core ML format with the weights, bias, and all parameters needed for Core ML re-training / personalization.
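With a single 1x1 layer, the weight and bias can be copied over as they are. For larger layers the memory layout matters. The sketch below assumes, as I understand the two formats, that an S4TF Dense layer stores its weight as [inputSize, outputSize] while a Core ML inner product layer expects a flat array in [outputChannels, inputChannels] order; please verify this against NeuralNetwork.proto before relying on it. The flattenedOutputMajor helper is a hypothetical name and works on plain Swift arrays rather than tensors.

```swift
/// Flattens a weight matrix stored as [inputSize][outputSize]
/// into a flat array ordered as [outputChannels][inputChannels].
func flattenedOutputMajor(_ weights: [[Float]]) -> [Float] {
    guard let outputSize = weights.first?.count else { return [] }
    var flat = [Float]()
    flat.reserveCapacity(weights.count * outputSize)
    for o in 0..<outputSize {
        for row in weights {
            precondition(row.count == outputSize, "ragged weight matrix")
            flat.append(row[o])
        }
    }
    return flat
}

// 2 inputs, 3 outputs.
let denseWeights: [[Float]] = [[1, 2, 3],
                               [4, 5, 6]]
let flat = flattenedOutputMajor(denseWeights)
print(flat) // prints [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
```

For the trivial model in this article the transposition is a no-op, since a 1x1 matrix flattens to the same single value either way.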
S4TF and Core ML background
It’s out of the scope of this article to describe the status of the S4TF project or the specification of the Apple Core ML API and file format. There are already lots of great tutorials available, and the official documentation on both Apple’s and TensorFlow’s developer and GitHub websites is a good resource.
Still, in my opinion, a couple of caveats are worth understanding in this context.
First of all, S4TF is still under development and in an experimental phase. It can be used very successfully on Linux and macOS to train models using the new, amazing Swift automatic differentiation and TF backend operators, but it’s missing some very important functionality at the moment, for example the ability to export a model in any private or common format.
Second, it’s important to remember that Core ML is both an abstract file format, based on Google Protocol Buffers (protobuf) binary serialization, and a powerful ML runtime. It’s available on all Apple OS platforms and optimized for CPUs, GPUs, and NPUs where available, for inference and now for re-training (or personalizing) partial or entire models.
Of course, there are some technical limitations on what Core ML supports, especially when it comes to training, considering the limits both in terms of performance and power consumption of mobile devices versus a cloud platform. But wow, this runtime is really powerful and it can be used successfully in lots of scenarios.
Core ML and TensorFlow are already very good friends, and there are Python packages, like Apple’s CoreMLTools or Google’s TFCoreML, that can be used to export (in Python) regular TF 1.x or 2.x models.
Once you’ve installed the protoc compiler from protocol buffers (Google Developers website or GitHub), I strongly suggest examining the Core ML file format by reverse engineering from an exported Core ML model using the following command:
This generated text file will reference all the nested Core ML protobuf data structures used for describing the specific models, referencing their numerical ids and the correlated values.
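To give a feeling for where those numerical ids come from, here is a small sketch in plain Swift of how the protobuf wire format encodes a field: a varint key holding the field number and wire type, followed by the value. The four bytes are hand-made for the example, not taken from a real .mlmodel file, and only the varint wire type is handled. If I read Model.proto correctly, fields 1 and 10 of the top-level message are specificationVersion and isUpdatable, but treat that mapping as an assumption.

```swift
/// Reads a base-128 varint starting at `index`, advancing `index` past it.
func readVarint(_ bytes: [UInt8], _ index: inout Int) -> UInt64? {
    var result: UInt64 = 0
    var shift: UInt64 = 0
    while index < bytes.count, shift < 64 {
        let byte = bytes[index]
        index += 1
        result |= UInt64(byte & 0x7F) << shift
        if byte & 0x80 == 0 { return result }
        shift += 7
    }
    return nil // truncated or overlong input
}

/// Splits a field key into its field number and wire type.
func decodeKey(_ key: UInt64) -> (fieldNumber: UInt64, wireType: UInt64) {
    return (key >> 3, key & 0x7)
}

// Hand-made bytes: field 1 = 4, field 10 = 1 (both varints).
let bytes: [UInt8] = [0x08, 0x04, 0x50, 0x01]
var offset = 0
var fields = [(UInt64, UInt64)]()

while let key = readVarint(bytes, &offset) {
    let (number, wireType) = decodeKey(key)
    // This sketch stops at anything that is not a varint (wire type 0).
    guard wireType == 0, let value = readVarint(bytes, &offset) else { break }
    fields.append((number, value))
    print("\(number): \(value)")
}
// prints "1: 4" then "10: 1"
```

The raw decoded text has the same flavor: field numbers and values without names, which is why the .proto definitions are needed to interpret it.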
The corresponding definitions of the Core ML protobuf data structures (i.e. messages or enums) can be found in the official Apple CoreMLTools project on GitHub, under the mlmodel/format folder:
First step: Generate Swift Core ML classes and data structures
Now that we have some confidence with S4TF and the Core ML exporting process and file format, the first thing to do to generate a Core ML model directly from Swift is to install the official Apple Swift Protobuf plugin.
By doing this, we can generate Swift Core ML protobuf classes and data structures from the original .proto files from the above Apple CoreMLTools GitHub project. We can then save these generated Swift files in a folder that we’ll use later.
The Apple Swift Protobuf plugin’s install instructions can be found on the official Swift Protobuf GitHub repo:
Once the plugin is installed, we can use the following command to compile the Core ML protobuf files (.proto) in Swift:
Second step: Generate and train the S4TF model, get the weights and bias of its layers, and export to Core ML format
We do this using the previously-generated Swift Core ML protobuf data structures, specifying the trainable parameters.
The following Swift Jupyter Notebook uses Swift for TensorFlow to train a very trivial linear regression model, and then exports the model with all required trainable parameters using the previously-generated Swift Core ML protobuf data structures.
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Install Swift Protocol Buffer library"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Installing packages:\n",
"\t.package(url: \"https://github.com/apple/swift-protobuf.git\", from: \"1.6.0\")\n",
"\t\tSwiftProtobuf\n",
"With SwiftPM flags: ['-c', 'release']\n",
"Working in: /tmp/tmprjkpbl0q/swift-install\n",
"Fetching https://github.com/apple/swift-protobuf.git\n",
"Cloning https://github.com/apple/swift-protobuf.git\n",
"Resolving https://github.com/apple/swift-protobuf.git at 1.7.0\n",
"[1/2] Compiling SwiftProtobuf AnyMessageStorage.swift\n",
"[2/3] Compiling jupyterInstalledPackages jupyterInstalledPackages.swift\n",
"[3/3] Linking libjupyterInstalledPackages.so\n",
"Initializing Swift...\n",
"Installation complete!\n"
]
}
],
"source": [
"%install-swiftpm-flags -c release\n",
"%install '.package(url: \"https://github.com/apple/swift-protobuf.git\", from: \"1.6.0\")' SwiftProtobuf\n",
"import SwiftProtobuf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Include Python, TensorFlow and enable Matplotlib plot"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('inline', 'module://ipykernel.pylab.backend_inline')\n"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%include \"EnableIPythonDisplay.swift\"\n",
"IPythonDisplay.shell.enable_matplotlib(\"inline\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import Python\n",
"import TensorFlow"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"let plt = Python.import(\"matplotlib.pyplot\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Generate noised linear data"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"let SAMPLE_SIZE = 100\n",
"\n",
"let a: Float = 2.0\n",
"let b: Float = 1.5\n",
"let x = Tensor<Float>(rangeFrom: 0, to: 1, stride: 1.0 / Float(SAMPLE_SIZE))\n",
"let noise = (Tensor<Float>(randomNormal: [SAMPLE_SIZE]) - 0.5) * 0.1\n",
"let y = (a * x + b) + noise"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"None\n"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"plt.clf()\n",
"plt.plot(x.makeNumpyArray(), y.makeNumpyArray(), marker: \"x\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[100]\r\n",
"[100]\r\n"
]
}
],
"source": [
"print(x.shape)\n",
"print(y.shape)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[100, 1]\r\n",
"[100, 1]\r\n"
]
}
],
"source": [
"let X = x.reshaped(toShape: [100, 1]) //SAMPLE_SIZE\n",
"let Y = y.reshaped(toShape: [100, 1]) //SAMPLE_SIZE\n",
"print(X.shape)\n",
"print(Y.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Define Swift for TensorFlow trivial model"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"struct LinearRegression: Layer {\n",
"    var layer1 = Dense<Float>(inputSize: 1, outputSize: 1, activation: identity)\n",
"    \n",
"    @differentiable\n",
"    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {\n",
"        return layer1(input)\n",
"    }\n",
"}\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Define Optimizer, Loss Function and Train model"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"var regression = LinearRegression()\n",
"let optimizer = SGD(for: regression, learningRate: 0.03)\n",
"Context.local.learningPhase = .training"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Loss: 9.029919\n",
"Loss: 7.74655\n",
"Loss: 6.649976\n",
"Loss: 5.7129736\n",
"Loss: 4.9122906\n",
"Loss: 4.22806\n",
"Loss: 3.6433098\n",
"Loss: 3.1435456\n",
"Loss: 2.7163815\n",
"Loss: 2.3512392\n",
"Loss: 2.0390809\n",
"Loss: 1.7721862\n",
"Loss: 1.54396\n",
"Loss: 1.3487687\n",
"Loss: 1.1817993\n",
"Loss: 1.0389403\n",
"Loss: 0.91667974\n",
"Loss: 0.81201714\n",
"Loss: 0.7223894\n",
"Loss: 0.645607\n",
"Loss: 0.5797997\n",
"Loss: 0.5233695\n",
"Loss: 0.47495145\n",
"Loss: 0.43337953\n",
"Loss: 0.39765763\n",
"Loss: 0.36693475\n",
"Loss: 0.34048393\n",
"Loss: 0.31768426\n",
"Loss: 0.2980051\n",
"Loss: 0.28099334\n",
"Loss: 0.26626182\n",
"Loss: 0.25347972\n",
"Loss: 0.24236454\n",
"Loss: 0.23267482\n",
"Loss: 0.22420436\n",
"Loss: 0.21677704\n",
"Loss: 0.21024229\n",
"Loss: 0.20447157\n",
"Loss: 0.19935498\n",
"Loss: 0.19479868\n",
"Loss: 0.19072248\n",
"Loss: 0.18705784\n",
"Loss: 0.18374622\n",
"Loss: 0.18073764\n",
"Loss: 0.17798929\n",
"Loss: 0.1754647\n",
"Loss: 0.17313263\n",
"Loss: 0.17096642\n",
"Loss: 0.16894329\n",
"Loss: 0.16704375\n",
"Loss: 0.16525108\n",
"Loss: 0.16355114\n",
"Loss: 0.1619317\n",
"Loss: 0.16038233\n",
"Loss: 0.15889415\n",
"Loss: 0.15745956\n",
"Loss: 0.15607202\n",
"Loss: 0.15472597\n",
"Loss: 0.15341662\n",
"Loss: 0.15213987\n",
"Loss: 0.15089223\n",
"Loss: 0.14967072\n",
"Loss: 0.14847274\n",
"Loss: 0.14729609\n",
"Loss: 0.14613886\n",
"Loss: 0.14499943\n",
"Loss: 0.1438764\n",
"Loss: 0.14276858\n",
"Loss: 0.14167488\n",
"Loss: 0.14059444\n",
"Loss: 0.1395265\n",
"Loss: 0.13847035\n",
"Loss: 0.13742542\n",
"Loss: 0.13639122\n",
"Loss: 0.13536726\n",
"Loss: 0.13435322\n",
"Loss: 0.1333487\n",
"Loss: 0.13235345\n",
"Loss: 0.13136718\n",
"Loss: 0.13038966\n",
"Loss: 0.12942067\n",
"Loss: 0.12846005\n",
"Loss: 0.1275076\n",
"Loss: 0.1265632\n",
"Loss: 0.12562671\n",
"Loss: 0.12469797\n",
"Loss: 0.123776905\n",
"Loss: 0.12286339\n",
"Loss: 0.12195734\n",
"Loss: 0.12105861\n",
"Loss: 0.12016721\n",
"Loss: 0.11928297\n",
"Loss: 0.118405856\n",
"Loss: 0.11753576\n",
"Loss: 0.11667266\n",
"Loss: 0.11581642\n",
"Loss: 0.11496706\n",
"Loss: 0.11412445\n",
"Loss: 0.11328854\n",
"Loss: 0.11245928\n"
]
}
],
"source": [
"for _ in 0..<100 { //1000\n",
"    let 𝛁model = regression.gradient { r -> Tensor<Float> in\n",
"        let ŷ = r(X)\n",
"        let loss = meanSquaredError(predicted: ŷ, expected: Y)\n",
"        print(\"Loss: \\(loss)\")\n",
"        return loss\n",
"    }\n",
"    optimizer.update(&regression, along: 𝛁model)\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Get Weights and Bias of the model"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.8645877 2.0458913\r\n"
]
}
],
"source": [
"let weight = Float(regression.layer1.weight[0][0])!\n",
"let bias = Float(regression.layer1.bias[0])!\n",
"print(weight, bias)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Test the model"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"Context.local.learningPhase = .inference\n",
"let score = regression(X)\n",
"let y2 = score.reshaped(toShape: [100])"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"None\n"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"plt.clf()\n",
"plt.plot(x.makeNumpyArray(), y2.makeNumpyArray(), marker: \"x\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Import CoreML ProtoBuf Swift data structures"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/ArrayFeatureExtractor.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/BayesianProbitRegressor.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/DataStructures.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/CategoricalMapping.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/CustomModel.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/DictVectorizer.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/FeatureTypes.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/FeatureVectorizer.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/GLMClassifier.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/GLMRegressor.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/Gazetteer.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/Identity.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/Imputer.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/Scaler.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/ItemSimilarityRecommender.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/Parameters.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/Normalizer.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/LinkedModel.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/NearestNeighbors.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/NonMaximumSuppression.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/OneHotEncoder.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/SVM.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/SoundAnalysisPreprocessing.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/TextClassifier.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/TreeEnsemble.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/VisionFeaturePrint.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/WordEmbedding.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/WordTagger.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/NeuralNetwork.pb.swift\""
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"%include \"./CoreMLProto/Model.pb.swift\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Export CoreML Model with Weights, Bias and Trainable Parameters"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [],
"source": [
"let coreModel = CoreML_Specification_Model.with {\n",
"    $0.specificationVersion = 4\n",
"    $0.description_p = CoreML_Specification_ModelDescription.with {\n",
"        $0.input = [CoreML_Specification_FeatureDescription.with {\n",
"            $0.name = \"dense_input\"\n",
"            $0.type = CoreML_Specification_FeatureType.with {\n",
"                $0.multiArrayType = CoreML_Specification_ArrayFeatureType.with {\n",
"                    $0.shape = [1]\n",
"                    $0.dataType = CoreML_Specification_ArrayFeatureType.ArrayDataType.double\n",
"                }\n",
"            }\n",
"        }]\n",
"        $0.output = [CoreML_Specification_FeatureDescription.with {\n",
"            $0.name = \"output\"\n",
"            $0.type = CoreML_Specification_FeatureType.with {\n",
"                $0.multiArrayType = CoreML_Specification_ArrayFeatureType.with {\n",
"                    $0.shape = [1]\n",
"                    $0.dataType = CoreML_Specification_ArrayFeatureType.ArrayDataType.double\n",
"                }\n",
"            }\n",
"        }]\n",
"        $0.trainingInput = [CoreML_Specification_FeatureDescription.with {\n",
"            $0.name = \"dense_input\"\n",
"            $0.type = CoreML_Specification_FeatureType.with {\n",
"                $0.multiArrayType = CoreML_Specification_ArrayFeatureType.with {\n",
"                    $0.shape = [1]\n",
"                    $0.dataType = CoreML_Specification_ArrayFeatureType.ArrayDataType.double\n",
"                }\n",
"            }\n",
"        }, CoreML_Specification_FeatureDescription.with {\n",
"            $0.name = \"output_true\"\n",
"            $0.type = CoreML_Specification_FeatureType.with {\n",
"                $0.multiArrayType = CoreML_Specification_ArrayFeatureType.with {\n",
"                    $0.shape = [1]\n",
"                    $0.dataType = CoreML_Specification_ArrayFeatureType.ArrayDataType.double\n",
"                }\n",
"            }\n",
"\n",
"        }]\n",
"        $0.metadata = CoreML_Specification_Metadata.with {\n",
"            $0.shortDescription = \"Trivial linear classifier\"\n",
"            $0.author = \"Jacopo Mangiavacchi\"\n",
"            $0.license = \"MIT\"\n",
"            $0.userDefined = [\"coremltoolsVersion\" : \"3.1\"]\n",
"        }\n",
"    }\n",
"    $0.isUpdatable = true\n",
"    $0.neuralNetwork = CoreML_Specification_NeuralNetwork.with {\n",
"        $0.layers = [CoreML_Specification_NeuralNetworkLayer.with {\n",
"            $0.name = \"dense_1\"\n",
"            $0.input = [\"dense_input\"]\n",
"            $0.output = [\"output\"]\n",
"            $0.isUpdatable = true\n",
"            $0.innerProduct = CoreML_Specification_InnerProductLayerParams.with {\n",
"                $0.inputChannels = 1\n",
"                $0.outputChannels = 1\n",
"                $0.hasBias_p = true\n",
"                $0.weights = CoreML_Specification_WeightParams.with {\n",
"                    $0.floatValue = [weight]\n",
"                    $0.isUpdatable = true\n",
"                }\n",
"                $0.bias = CoreML_Specification_WeightParams.with {\n",
"                    $0.floatValue = [bias]\n",
"                    $0.isUpdatable = true\n",
"                }\n",
"            }\n",
"        }]\n",
"        $0.updateParams = CoreML_Specification_NetworkUpdateParameters.with {\n",
"            $0.lossLayers = [CoreML_Specification_LossLayer.with {\n",
"                $0.name = \"lossLayer\"\n",
"                $0.meanSquaredErrorLossLayer = CoreML_Specification_MeanSquaredErrorLossLayer.with {\n",
"                    $0.input = \"output\"\n",
"                    $0.target = \"output_true\"\n",
"                }\n",
"            }]\n",
"            $0.optimizer = CoreML_Specification_Optimizer.with {\n",
"                $0.sgdOptimizer = CoreML_Specification_SGDOptimizer.with {\n",
"                    $0.learningRate = CoreML_Specification_DoubleParameter.with {\n",
"                        $0.defaultValue = 0.03\n",
"                        $0.range = CoreML_Specification_DoubleRange.with {\n",
"                            $0.maxValue = 1.0\n",
"                        }\n",
"                    }\n",
"                    $0.miniBatchSize = CoreML_Specification_Int64Parameter.with {\n",
"                        $0.defaultValue = 1\n",
"                        $0.set = CoreML_Specification_Int64Set.with {\n",
"                            $0.values = [1]\n",
"                        }\n",
"                    }\n",
"                    $0.momentum = CoreML_Specification_DoubleParameter.with {\n",
"                        $0.defaultValue = 0\n",
"                        $0.range = CoreML_Specification_DoubleRange.with {\n",
"                            $0.maxValue = 1.0\n",
"                        }\n",
"                    }\n",
"                }\n",
"            }\n",
"            $0.epochs = CoreML_Specification_Int64Parameter.with {\n",
"                $0.defaultValue = 100\n",
"                $0.set = CoreML_Specification_Int64Set.with {\n",
"                    $0.values = [100]\n",
"                }\n",
"            }\n",
"            $0.shuffle = CoreML_Specification_BoolParameter.with {\n",
"                $0.defaultValue = true\n",
"            }\n",
"        }\n",
"    }\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [],
"source": [
"let binaryModelData: Data = try coreModel.serializedData()"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [],
"source": [
"try binaryModelData.write(to: URL(fileURLWithPath: \"./s4tf_model_personalization.mlmodel\"))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Swift",
"language": "swift",
"name": "swift"
},
"language_info": {
"file_extension": ".swift",
"mimetype": "text/x-swift",
"name": "swift",
"version": ""
}
},
"nbformat": 4,
"nbformat_minor": 2
}
You can run this notebook on Google Colab or install the S4TF toolchain and the Google Swift Jupyter Kernel on Linux or a Docker container from the following Google repo:
Third step: Compile Core ML model on demand and test / inference the model
Xcode provides a very nice native integration with Core ML, where you can easily drag and drop a Core ML file into the project to automatically generate Swift wrapper code for the model and input/output features.
The Core ML API allows you to dynamically load a model from the file system or even from the network, compile the model on demand, and dynamically create features and convert from basic datatypes.
The following code snippet shows how to dynamically load, compile, and run inference on a Core ML model in any Swift project:
import Foundation
import CoreML

// Compile a .mlmodel file on demand and load the compiled model.
func compileCoreML(path: String) -> (MLModel, URL) {
    let modelUrl = URL(fileURLWithPath: path)
    let compiledUrl = try! MLModel.compileModel(at: modelUrl)

    print("Compiled Model Path: \(compiledUrl)")
    return try! (MLModel(contentsOf: compiledUrl), compiledUrl)
}

// Run one prediction, wrapping the scalar input in a one-element MLMultiArray.
func inferenceCoreML(model: MLModel, x: Float) -> Float {
    let inputName = "dense_input"

    let multiArr = try! MLMultiArray(shape: [1], dataType: .double)
    multiArr[0] = NSNumber(value: x)

    let inputValue = MLFeatureValue(multiArray: multiArr)
    let dataPointFeatures: [String: MLFeatureValue] = [inputName: inputValue]
    let provider = try! MLDictionaryFeatureProvider(dictionary: dataPointFeatures)

    let prediction = try! model.prediction(from: provider)

    return Float(prediction.featureValue(for: "output")!.multiArrayValue![0].doubleValue)
}

// coreMLFilePath is assumed to be defined elsewhere as the path to the exported .mlmodel file.
let (coreModel, compiledModelUrl) = compileCoreML(path: coreMLFilePath)

let prediction = inferenceCoreML(model: coreModel, x: 1.0)
print(prediction)
Fourth step: Local Core ML training / personalization
The final step is all about re-training the previously compiled Core ML model by passing a batch of training data.
The generateData function below builds, with plain Swift types, the same kind of noisy linear data that the Swift notebook above used to train the S4TF model and obtain the initial weights and bias:
import Foundation
import CoreML

// Generate noisy samples of the line y = a * x + b.
func generateData(sampleSize: Int = 100) -> ([Float], [Float]) {
    let a: Float = 2.0
    let b: Float = 1.5

    var X = [Float]()
    var Y = [Float]()

    for i in 0..<sampleSize {
        let x: Float = Float(i) / Float(sampleSize)
        let noise: Float = (Float.random(in: 0..<1) - 0.5) * 0.1
        let y: Float = (a * x + b) + noise
        X.append(x)
        Y.append(y)
    }

    return (X, Y)
}

// Wrap every (x, y) pair using the feature names declared as trainingInput in the exported model.
func prepareTrainingBatch() -> MLBatchProvider {
    var featureProviders = [MLFeatureProvider]()

    let inputName = "dense_input"
    let outputName = "output_true"

    let (X, Y) = generateData()

    for (x, y) in zip(X, Y) {
        // Use a separate MLMultiArray per feature: MLMultiArray is a reference type,
        // so reusing a single array could make the input alias the target value.
        let inputArr = try! MLMultiArray(shape: [1], dataType: .double)
        inputArr[0] = NSNumber(value: x)
        let inputValue = MLFeatureValue(multiArray: inputArr)

        let outputArr = try! MLMultiArray(shape: [1], dataType: .double)
        outputArr[0] = NSNumber(value: y)
        let outputValue = MLFeatureValue(multiArray: outputArr)

        let dataPointFeatures: [String: MLFeatureValue] = [inputName: inputValue,
                                                           outputName: outputValue]

        if let provider = try? MLDictionaryFeatureProvider(dictionary: dataPointFeatures) {
            featureProviders.append(provider)
        }
    }

    return MLArrayBatchProvider(array: featureProviders)
}

// Re-train (personalize) the compiled model at `url` with a locally generated batch.
func train(url: URL) {
    let configuration = MLModelConfiguration()
    configuration.computeUnits = .all
    configuration.parameters = [.epochs : 100]

    let progressHandler = { (context: MLUpdateContext) in
        switch context.event {
        case .trainingBegin:
            print("Training begin")

        case .miniBatchEnd:
            let batchIndex = context.metrics[.miniBatchIndex] as! Int
            let batchLoss = context.metrics[.lossValue] as! Double
            print("Mini batch \(batchIndex), loss: \(batchLoss)")

        case .epochEnd:
            let epochIndex = context.metrics[.epochIndex] as! Int
            let trainLoss = context.metrics[.lossValue] as! Double
            print("Epoch \(epochIndex) end with loss \(trainLoss)")

        default:
            print("Unknown event")
        }
    }

    let completionHandler = { (context: MLUpdateContext) in
        print("Training completed with state \(context.task.state.rawValue)")
        print("CoreML Error: \(context.task.error.debugDescription)")

        if context.task.state != .completed {
            print("Failed")
            return
        }

        let trainLoss = context.metrics[.lossValue] as! Double
        print("Final loss: \(trainLoss)")

        // retrainedCoreMLFilePath is assumed to be defined elsewhere as the output path.
        let updatedModel = context.model
        let updatedModelURL = URL(fileURLWithPath: retrainedCoreMLFilePath)
        try! updatedModel.write(to: updatedModelURL)

        print("Model Trained!")
        print("Press return to continue..")
    }

    let handlers = MLUpdateProgressHandlers(
        forEvents: [.trainingBegin, .miniBatchEnd, .epochEnd],
        progressHandler: progressHandler,
        completionHandler: completionHandler)

    let updateTask = try! MLUpdateTask(forModelAt: url,
                                       trainingData: prepareTrainingBatch(),
                                       configuration: configuration,
                                       progressHandlers: handlers)

    updateTask.resume()
}

train(url: compiledModelUrl)

// Simple way to wait for completion of the asynchronous training task
let _ = readLine()

let retrainedModel = try! MLModel(contentsOf: URL(fileURLWithPath: retrainedCoreMLFilePath))

// Named differently from the earlier `prediction` so both snippets can live in the same file.
let retrainedPrediction = inferenceCoreML(model: retrainedModel, x: 1.0)
print(retrainedPrediction)
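Waiting on readLine() works for a command-line experiment, but it needs someone to press return. As an alternative, a semaphore can block until the completion handler fires. This is only a sketch of the pattern with a stand-in asynchronous job, not Core ML code: runAsyncJob and reportedLoss are hypothetical names, and it assumes the completion handler is not delivered on the same thread that is waiting (otherwise it would deadlock).

```swift
import Foundation
import Dispatch

/// Stand-in for an asynchronous job such as an update task; purely illustrative.
func runAsyncJob(completion: @escaping (Double) -> Void) {
    DispatchQueue.global().async {
        // Pretend to train, then report a final loss.
        completion(0.01)
    }
}

let done = DispatchSemaphore(value: 0)
var reportedLoss = Double.nan

runAsyncJob { loss in
    reportedLoss = loss
    done.signal()   // the completion handler unblocks the waiting thread
}

done.wait()         // blocks here until signal() has been called
print("Finished with loss \(reportedLoss)")
```

Applied to the training code above, the signal() call would go at the end of the completion handler, on both the success and the failure path, and wait() would replace the readLine() call.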
Full source code for this end-to-end test is available in the following repo:
Next steps
The next step will be to automate the S4TF-to-Core ML export process, defining specific Swift for TensorFlow extensions to generalize the process. Stay tuned, and please contact me if you would like to contribute. Thanks!