Advanced Tips for Core ML

Core ML is Apple's framework for adding machine learning and AI to iOS apps. High-level APIs provided by tools like Turi Create and Create ML make it possible to train mobile-friendly models without ML expertise.

But as projects grow in scale and complexity, it’s often necessary to dive deeper into the capabilities of Core ML to deliver the best user experiences possible.

In this post, I’ll discuss some of the more advanced techniques for creating and manipulating Core ML models. Using an artistic style transfer model as an example, I’ll cover four common scenarios:

  1. Converting custom Keras layers to Core ML.
  2. Manually adding layers to an existing Core ML model.
  3. Creating a model with flexible input and output sizes.
  4. Compressing the model with quantization.

Converting custom Keras layers to Core ML.

Sometimes models make use of custom layers or operations that aren’t directly supported by training frameworks like Keras. For example, a key innovation in Ulyanov et al’s improvement on Johnson et al’s fast style transfer model is the substitution of instance normalization for batch normalization in the neural network.

Unfortunately, the official Keras framework doesn’t provide an Instance Normalization layer, so we’re left to use the one included in the keras-contrib library or implement it ourselves. Because Apple’s coremltools library does not support keras-contrib, the converter raises an unsupported-layer error when we try to convert our model.

Though the coremltools converter doesn’t support Instance Normalization, Core ML itself does. It’s an option provided to the batch normalization layer. To fix our conversion process, we’ll perform the following steps:

  1. Tell coremltools to create custom layer placeholders for layers it can’t convert instead of throwing an error.
  2. Create an instance normalization layer specification directly from the Core ML protocol buffers.
  3. Replace the custom layer placeholders with the correct specification.

We can instruct coremltools to create custom layer placeholders for any layers it’s unable to convert by setting add_custom_layers=True and custom_conversion_functions={}. Note that in this case we’ll be using Core ML’s own layers instead of a custom conversion function, but the empty dictionary is still necessary. Here’s an example:

import coremltools
import keras
import keras_contrib.layers

# Build a small Keras model with a single InstanceNormalization layer
inpt = keras.layers.Input(shape=(500, 500, 3))
out = keras_contrib.layers.InstanceNormalization(axis=-1)(inpt)
keras_model = keras.models.Model(inpt, out)

mlmodel = coremltools.converters.keras.convert(
  keras_model,
  add_custom_layers=True,
  custom_conversion_functions={}
)

mlmodel.get_spec()

"""
Output:
specificationVersion: 2
description {
  input {
    name: "input1"
    type {
      multiArrayType {
        shape: 3
        shape: 500
        shape: 500
        dataType: DOUBLE
      }
    }
  }
  output {
    name: "output1"
    type {
      multiArrayType {
        shape: 3
        shape: 500
        shape: 500
        dataType: DOUBLE
      }
    }
  }
}
neuralNetwork {
  layers {
    name: "instance_normalization_2"
    input: "input1"
    output: "output1"
    custom {
    }
  }
}
"""

Inspecting the model specification output, we can see coremltools has included the instance normalization layer in the network with the correct name, input, and output, along with an empty set of custom layer options.

Now, we’ll create the proper instance normalization layer specification for our model. The Core ML specification is implemented in Protocol Buffers, and we can create new layers the same way we’d manipulate any other protobuf object. Below is a function that creates a Core ML instance normalization layer from a Keras layer; inline comments describe the process.

import numpy as np
from coremltools.proto import NeuralNetwork_pb2


def create_instance_normalization_spec(layer):
    """Convert a keras-contrib InstanceNormalization layer to a Core ML spec.

    Args:
        layer (keras.layers.Layer): An InstanceNormalization Keras layer.

    Returns:
        spec (NeuralNetwork_pb2.NeuralNetworkLayer): a Core ML layer spec
    """

    # Extract the layer inputs and outputs from Keras and create 
    # equivalent names for Core ML.
    input_name = layer._inbound_nodes[0].inbound_layers[0].name
    input_name += '_output'
    output_name = layer.name + '_output'

    # Create a new Neural Network Layer object from the
    # Core ML protobuf spec and set properties.
    spec_layer = NeuralNetwork_pb2.NeuralNetworkLayer()
    spec_layer.name = layer.name
    spec_layer.input.append(input_name)
    spec_layer.output.append(output_name)

    # Layer types in Core ML are defined by the parameters
    # provided to the layer. To make this a normalization layer,
    # we create a batchnorm layer param object
    spec_layer_params = spec_layer.batchnorm

    # Extract parameters from Keras layer
    weights = layer.get_weights()
    channels = weights[0].shape[0]

    # Parameter arrangement in keras-contrib: gamma (if scale=True),
    # then beta (if center=True)
    idx = 0
    gamma, beta = None, None
    if layer.scale:
        gamma = weights[idx]
        idx += 1
    if layer.center:
        beta = weights[idx]
        idx += 1

    # Fall back to identity parameters if scaling or centering is disabled
    if gamma is None:
        gamma = np.ones(channels)
    if beta is None:
        beta = np.zeros(channels)

    epsilon = layer.epsilon or 1e-5

    # Set the parameters
    spec_layer_params.channels = channels
    spec_layer_params.gamma.floatValue.extend(map(float, gamma.flatten()))
    spec_layer_params.beta.floatValue.extend(map(float, beta.flatten()))
    spec_layer_params.epsilon = epsilon
    spec_layer_params.computeMeanVar = True
    spec_layer_params.instanceNormalization = True

    return spec_layer

The function above returns an instance of a Core ML layer protocol buffer for performing instance normalization on inputs. The last step is to replace the placeholder layers in our network with this specification.

instance_norm_spec = create_instance_normalization_spec(keras_model.layers[-1])
# Hook the layer up to the global model input and output
instance_norm_spec.input[:] = ["input1"]
instance_norm_spec.output[:] = ["output1"]
# Replace the custom layer placeholder with the new instance norm layer
mlmodel._spec.neuralNetwork.layers[-1].CopyFrom(instance_norm_spec)
mlmodel.get_spec()


"""
Output:

specificationVersion: 2
description {
  input {
    name: "input1"
    type {
      multiArrayType {
        shape: 3
        shape: 500
        shape: 500
        dataType: DOUBLE
      }
    }
  }
  output {
    name: "output1"
    type {
      multiArrayType {
        shape: 3
        shape: 500
        shape: 500
        dataType: DOUBLE
      }
    }
  }
}
neuralNetwork {
  layers {
    name: "instance_normalization_4"
    input: "input1"
    output: "output1"
    batchnorm {
      channels: 3
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 0.0010000000474974513
      gamma {
        floatValue: 1.0
        floatValue: 1.0
        floatValue: 1.0
      }
      beta {
        floatValue: 0.0
        floatValue: 0.0
        floatValue: 0.0
      }
    }
  }
}
"""

Our Core ML model now has the proper instance normalization layer.
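
As a quick sanity check (a minimal sketch, assuming we’re on macOS, where Core ML predictions are available from Python), we can push a random input through the converted model and confirm the output shape:

import numpy as np

# Random input matching the model's (3, 500, 500) multi-array input
x = np.random.rand(3, 500, 500)
result = mlmodel.predict({'input1': x})
print(result['output1'].shape)  # expect (3, 500, 500)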

Adding new layers to an existing Core ML model.

In the previous section, we dealt with converting unsupported custom layers. In this section, we’ll add entirely new layers directly to an existing Core ML model. Style transfer models sometimes suffer from edge artifacts due to padding on convolution layers.

If 'same' padding is used, convolution windows at the edges of an image contain a disproportionate number of zero values, which distorts style transfer results for those pixels.

This problem can be remedied by adding reflection padding to create a visually consistent border around the input image, then cropping at the end to restore the original size.

Unfortunately, reflection padding isn’t supported by Keras, making it necessary to add the layer manually in Core ML. Here’s how:

def add_reflective_padding_and_crop(mlmodel, padding_size=20):
    """Add reflective padding and crop layers to remove edge artifcats.

    Because the convolution layers rely on 'same' padding, stylized images have
    a small ring of distortion around the outer edge. This can be eliminated
    with reflective padding on the input image. This method modifies the
    original MLModel spec to add a padding layer after the input and a crop
    layer before the output to remove the padding at the end.

    Args:
        mlmodel (coremltools.models.MLModel): an MLModel spec.
        padding_size (Optional, int): the number of pixels to pad.
    
    Returns:
        new_mlmodel (coremltools.models.MLModel): a new MLModel spec.
    """
    new_spec = mlmodel.get_spec()
    # Clear all the layers
    while new_spec.neuralNetwork.layers:
        new_spec.neuralNetwork.layers.pop()

    # Add a reflective padding layer first
    spec_layer = new_spec.neuralNetwork.layers.add()
    spec_layer.name = 'padding_1'
    spec_layer.input.append('image')
    spec_layer.output.append('padding_1_output')
    spec_layer_params = spec_layer.padding
    spec_layer_params.reflection.MergeFromString(b'')             
    height_border = spec_layer_params.paddingAmounts.borderAmounts.add()
    height_border.startEdgeSize = padding_size
    height_border.endEdgeSize = padding_size
    width_border = spec_layer_params.paddingAmounts.borderAmounts.add()
    width_border.startEdgeSize = padding_size
    width_border.endEdgeSize = padding_size

    # Add the rest of the layers
    for layer in mlmodel._spec.neuralNetwork.layers:
        spec_layer = new_spec.neuralNetwork.layers.add()
        spec_layer.MergeFrom(layer)

    # Crop the padding as a final layer.
    spec_layer = new_spec.neuralNetwork.layers.add()
    spec_layer.name = 'crop_1'
    spec_layer.input.append(new_spec.neuralNetwork.layers[-2].name + '_output')
    spec_layer.output.append('stylizedImage')
    spec_layer_params = spec_layer.crop          
    height_border = spec_layer_params.cropAmounts.borderAmounts.add()
    height_border.startEdgeSize = padding_size
    height_border.endEdgeSize = padding_size
    width_border = spec_layer_params.cropAmounts.borderAmounts.add()
    width_border.startEdgeSize = padding_size
    width_border.endEdgeSize = padding_size

    # Fix the inputs and outputs for the padding and crop layers
    new_spec.neuralNetwork.layers[1].input.pop()
    new_spec.neuralNetwork.layers[1].input.append('padding_1_output')

    new_spec.neuralNetwork.layers[-2].output.pop()
    new_spec.neuralNetwork.layers[-2].output.append(
        new_spec.neuralNetwork.layers[-2].name + '_output')

    return coremltools.models.MLModel(new_spec)
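
Here’s a usage sketch (padded_model and the file name are illustrative; mlmodel is the style transfer model converted earlier, with an 'image' input and a 'stylizedImage' output):

# Pad the input by 20 pixels on each side and crop back at the end
padded_model = add_reflective_padding_and_crop(mlmodel, padding_size=20)
padded_model.save('StyleTransferPadded.mlmodel')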

Flexible input and output sizes

For simplicity and performance, machine learning models are often trained with fixed input and output dimensions. Input data that doesn’t conform to these dimension requirements is scaled and / or cropped to match. When integrated into an application, though, these fixed dimensions might not be convenient or performant.

In the case of artistic style transfer, models are configured to accept and produce images of a fixed size, like 256 by 256 pixels, to speed up training.

During inference, though, we want to support high resolution images and images of different aspect ratios. We’d also like to avoid creating dozens of different models to support each configuration.

Thankfully, Apple added support for models with flexible input and output shapes in Core ML 2. As long as a model is fully convolutional, images of arbitrary sizes can be fed as input to our style transfer models.

This allows us to support a wider variety of use cases and control performance across devices (e.g. by setting a maximum input image size for users on older, less powerful phones).

from coremltools.models.neural_network import flexible_shape_utils

def make_mlmodel_flexible(spec, size_range=(100, 1920)):
    """Make the input and output sizes of a Core ML model flexible.

    Args:
        spec (Model_pb2.Model): a Core ML model spec.
        size_range (tuple): the (min, max) supported input sizes in pixels.
    """
    size_range_spec = flexible_shape_utils.NeuralNetworkImageSizeRange()
    size_range_spec.add_width_range(size_range)
    size_range_spec.add_height_range(size_range)
    flexible_shape_utils.update_image_size_range(
        spec, feature_name='image', size_range=size_range_spec
    )

    size_range_spec = flexible_shape_utils.NeuralNetworkImageSizeRange()
    size_range_spec.add_width_range(size_range)
    size_range_spec.add_height_range(size_range)
    flexible_shape_utils.update_image_size_range(
        spec, feature_name='stylizedImage', size_range=size_range_spec
    )
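
To apply this, we mutate the spec in place and rebuild an MLModel from it. A sketch, assuming padded_model is the model from the previous section and flexible_model is just an illustrative name:

spec = padded_model.get_spec()
make_mlmodel_flexible(spec, size_range=(100, 1920))
flexible_model = coremltools.models.MLModel(spec)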

In this case, we added a continuous range of supported input sizes. It’s also possible to allow a discrete set of input sizes.
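
For instance, a discrete set of sizes can be declared with the enumerated-size helpers in flexible_shape_utils. A sketch (the specific sizes are arbitrary, and the output feature would need the same treatment):

# Restrict the 'image' input to a few specific resolutions instead of a range
sizes = [
    flexible_shape_utils.NeuralNetworkImageSize(height=256, width=256),
    flexible_shape_utils.NeuralNetworkImageSize(height=512, width=512),
    flexible_shape_utils.NeuralNetworkImageSize(height=1024, width=768),
]
flexible_shape_utils.add_enumerated_image_sizes(
    spec, feature_name='image', sizes=sizes)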

Quantizing a model to shrink its size.

So far we’ve converted a model with custom layers, manually added reflective padding to fix edge artifacts, and made input sizes flexible for users.

As a final step, let’s compress our Core ML model to keep our bundle size low. We can do this with Core ML’s quantization tools. The function below quantizes a Core ML model’s weights and is safe to use on any platform.

import coremltools
from coremltools.models.neural_network import quantization_utils

def quantize_model(mlmodel, nbits, method='linear'):
    """Quantize the weights of an mlmodel to a specific number of bits.

    Args:
        mlmodel (coremltools.models.MLModel): A Core ML model
        nbits (int): the bit depth of the quantized weights.
        method (string): the quantization method.
    
    Returns:
        A quantized Core ML Model.
    """
    quantized_model = quantization_utils.quantize_weights(
        mlmodel, nbits, method)
    # If we are on a Mac OS, quantization will return a Model,
    # otherwise it will return just the model spec and we'll need to create one
    if isinstance(quantized_model, coremltools.models.MLModel):
        return quantized_model
    return coremltools.models.MLModel(quantized_model)
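
Putting it all together, we might quantize the flexible model down to 8-bit weights and save the result (the variable and file names here are illustrative):

quantized_model = quantize_model(flexible_model, nbits=8, method='linear')
quantized_model.save('StyleTransferQuantized.mlmodel')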

In practice, we’ve found little to no decrease in accuracy when using 8-bit quantization to shrink models to 25% of their original size.

Conclusion

This post covered a few advanced techniques for working with Core ML models. We fixed a broken model conversion process using custom layers.

We added layers to an existing Core ML model by hand. We made use of flexible input and output sizes to give users more options and decrease the number of models we need to maintain. Finally, we used quantization to shrink a model’s size to keep our app bundle small.
