Style Transfer iOS Application Using Convolutional Neural Networks

Training a Style Transfer Model using Turi Create to Create Artistic Images

Neural style transfer allows you to extract the “style” of one image and apply it to the content of another. With very little effort, developers can copy the style of a great master and apply it to a picture of their cat, to take just one example.

Neural style transfer, or simply style transfer, has recently become quite popular, especially thanks to the success of applications such as Prisma. It emerged at a time when neural networks were being developed rapidly for all kinds of applications, including art. Shortly before style transfer was introduced, Deep Dream appeared: a program that highlights non-existent patterns in images, creating what could be considered an artistic style in its own right.

This article covers a bit of theory, then walks step by step through training a simple style transfer model and building an iOS application around it.

Why a mobile application?

Let’s imagine you’ve built an application centered on user-generated content (mainly images), and you want to give your users the ability to add some style to their photos. The idea is to always give users an incentive to be creative with their content, which drives more engagement. On-device style transfer makes this kind of continued engagement possible.

I can also picture different styles for every market: for example, “Claude Monet” for the French market, or “Edward Hopper” and “Andy Warhol” for the U.S. market. The use cases are endless. You could also give users the power to create their own style. That won’t be covered in this tutorial, but it’s an enticing possibility.

Why Turi Create?

Throughout this tutorial, I’ll be using Turi Create, a high-level Python library that simplifies the development of custom machine learning models. Most importantly, you can export your model to a .mlmodel file that Xcode can read and that runs on-device. There are, of course, other ways to build mobile-ready models, but most of them are server-side solutions that rely on more advanced techniques.

What is style transfer?

In an article published in September 2015, researchers from Tübingen, Germany, and Houston, TX, introduced an algorithm using deep learning to create “high-quality perceptual art images”. Their article introduced the idea that the representations of style and content can be separated in a certain type of neural network. This opened the way to neural style transfer, which was quickly extended and improved by later work: optimizations, applications to sound, and even applying styles to video.

To achieve this feat, the authors of the original paper built a network that captures the texture information of an image, but not its spatial arrangement. Once this texture information is stored in a network, it’s possible to apply it to a different image.

Style transfer is an optimization problem: we try to apply a pre-calculated model (the style) to an input image. For this, we define an objective function (loss function) that we want to minimize—this is a weighted sum of the error (loss) between the original image and the produced image, and the error between the original style and the applied one. By adjusting the weighting parameters, we can give more importance to the original image (the content image) or the style used (extracted from the style image).
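To make the weighted sum concrete, here is a minimal sketch in plain Python. It assumes, as in the original paper, that style is compared through Gram matrices (channel-to-channel correlations) of feature maps. The function names, the tiny hand-written feature maps, the use of a simple mean squared error, and the weights `alpha` and `beta` are all illustrative assumptions, not Turi Create’s implementation.

```python
# Illustrative sketch only: toy feature maps stand in for real network activations.

def gram_matrix(features):
    """Channel-to-channel correlations. `features` is a list of channels,
    each a flat list of activations over the spatial positions."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]

def flatten(rows):
    return [value for row in rows for value in row]

def mse(xs, ys):
    """Mean squared error between two equally sized flat lists."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def content_loss(content_features, output_features):
    # Compares activations position by position, so layout matters.
    return mse(flatten(content_features), flatten(output_features))

def style_loss(style_features, output_features):
    # Compares Gram matrices, so only texture statistics matter.
    return mse(flatten(gram_matrix(style_features)),
               flatten(gram_matrix(output_features)))

def total_loss(content_features, style_features, output_features,
               alpha=1.0, beta=1e-4):
    """Weighted sum: alpha favors the content image, beta favors the style."""
    return (alpha * content_loss(content_features, output_features)
            + beta * style_loss(style_features, output_features))

# Two channels with four spatial positions each (hypothetical values).
content = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
style = [[2.0, 2.0, 0.0, 0.0], [0.0, 0.0, 2.0, 2.0]]
# The same style activations with their spatial positions shuffled.
shuffled_style = [[2.0, 0.0, 2.0, 0.0], [0.0, 2.0, 0.0, 2.0]]

# An output identical to the content image has no content loss,
# but it still differs from the style.
print(total_loss(content, style, content, alpha=1.0, beta=0.0))
print(total_loss(content, style, content, alpha=0.0, beta=1.0))

# Shuffling positions leaves the Gram matrix unchanged: texture is kept,
# layout is discarded.
print(gram_matrix(style) == gram_matrix(shuffled_style))
```

Raising `beta` relative to `alpha` pushes the result toward the style image, and lowering it keeps the result closer to the content image.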

That’s where deep learning comes in handy. The primary idea that underpins deep learning is to structure tasks in layers that are connected to each other and operate at different levels of abstraction. For example, an image recognition network could consist of a layer working on the pixels, connected to a layer recognizing simple edges, which is itself connected to a layer recognizing patterns, then parts of objects, then objects, etc.

As a multi-layer approach, deep learning allows for a wide variety of network topologies. For style transfer, the main network used is VGG (named after the Visual Geometry Group that created it). It’s a network with 16 layers, known for good results in image recognition. You can check out the architecture of the network used in Turi Create.

Train the model

We’re going to use the built-in method offered by Turi Create.

Here are all the steps:

  1. GPU use: I’m using all the GPUs available on my computer (-1), but you can change this to 0 to use none, or to n for the number of GPUs you want to use.
  2. Load the styles: Loading the folder that contains all the style images.
  3. Load the training images: Loading the folder that contains all the content images that the model will train on to perfect the style transfer.
  4. Create the model: Turi Create does this work for us. We just need the training data, the style images, and the number of iterations (10,000 by default). I chose to try 10,000, 20,000, and 30,000 iterations. You’ll see shortly that this makes a huge difference. I also chose to use data augmentation, which resizes, crops, and rotates images to help diversify the dataset.
  5. Test the model: Loading images to test the model with all the styles.
  6. Save the model: Saving the model so that we can use it later if we want to export it in another format.
  7. Export a .mlmodel file: This is the file format that Xcode can read for our iOS application.

import turicreate as tc

tc.config.set_num_gpus(-1)

# Load the style and content images
styles = tc.load_images('/style/')
content = tc.load_images('/content/')

# Create a StyleTransfer model
model = tc.style_transfer.create(styles,
                                content,
                                max_iterations=30000,
                                _advanced_parameters={"style_loss_mult":[1e-4, 1e-4, 1e-4, 1e-4],
                                                      "use_augmentation": True})

# Load some test images
test_images = tc.load_images('/test/')

# Stylize the test images and explore the results
stylized_images = model.stylize(test_images)
stylized_images.explore()

# Save the model for later use in Turi Create
model.save('mymodel-30000.model')

# Export for use in Core ML
model.export_coreml('MyStyleTransfer-30000.mlmodel')

Evaluate different models

I have used two styles:

  1. Style number 1: It’s a Moroccan Zellige, which is a technique typical of Maghreb architecture that consists of assembling pieces of enameled terracotta tiles of different colors to achieve a geometric decoration. The shards of faience are sometimes so fine that it’s a true ceramic marquetry.
  2. Style number 2: An art piece made by a young Moroccan artist. You can check out his Instagram page.
(Images: stylized results for style number 1 and style number 2.)

We can clearly see that the number of iterations affects how well textures are transferred from the style image to the input image. Some would also argue that increasing the number of content images helps. In my case I used only 19 images, but try to include as many as you can.

Training time

I trained on the free Tesla K80 GPU offered by Google, and it’s still a lot of computation. What’s interesting is that training time increases linearly with the number of iterations, which makes it easy to predict how long a longer run will take.
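Because the relationship is linear, a short trial run is enough to estimate a longer one. The sketch below is plain Python. The function name and the timing figures are hypothetical placeholders, not measurements from the runs in this article.

```python
def estimate_training_minutes(measured_iterations, measured_minutes, target_iterations):
    """Extrapolate training time, assuming it grows linearly with iterations."""
    if measured_iterations <= 0:
        raise ValueError("measured_iterations must be positive")
    return measured_minutes * target_iterations / measured_iterations

# Hypothetical trial run: 1,000 iterations that took 12 minutes.
for target in (10000, 20000, 30000):
    minutes = estimate_training_minutes(1000, 12.0, target)
    print(f"{target} iterations: about {minutes / 60:.1f} hours")
```

Time one short run on your own hardware and plug in the real numbers before committing to a 30,000-iteration session.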

It took some time to get Turi Create to use the GPU on Colab, but it’s working perfectly now.

Build the iOS application

Create a new project

To begin, we need to create an iOS project with a single view app. Make sure to choose Storyboard in the “User Interface” dropdown menu (Xcode 11 only):

Now we have our project ready to go. I don’t like using storyboards myself, so the app in this tutorial is built programmatically, which means no dragging buttons or switches onto a canvas, just pure code 🤗.

To follow this method, you’ll have to delete Main.storyboard and set up your SceneDelegate.swift file (Xcode 11 only) like so:

func scene(_ scene: UIScene, willConnectTo session: UISceneSession, options connectionOptions: UIScene.ConnectionOptions) {
        // Use this method to optionally configure and attach the UIWindow `window` to the provided UIWindowScene `scene`.
        // If using a storyboard, the `window` property will automatically be initialized and attached to the scene.
        // This delegate does not imply the connecting scene or session are new (see `application:configurationForConnectingSceneSession` instead).
        guard let windowScene = (scene as? UIWindowScene) else { return }
        
        window = UIWindow(frame: windowScene.coordinateSpace.bounds)
        window?.windowScene = windowScene
        window?.rootViewController = ViewController()
        window?.makeKeyAndVisible()
    }

With Xcode 11 you’ll have to change the Info.plist file like so:

You need to delete the “Storyboard Name” in the file, and that’s about it.

Main ViewController

Now let’s set up our ViewController with the buttons and a logo. I used some custom buttons in the application, but you can obviously use the system button.

First, you need to subclass UIButton to create your own custom button. We inherit from UIButton because the custom button ‘is’ a UIButton: we want to keep all its behavior and only change its look:

import UIKit

// Base button that sets the shared font.
// Note: awakeFromNib() only runs when a view is loaded from a storyboard or nib,
// so it is not called for buttons created in code.
class Button: UIButton {
    override func awakeFromNib() {
        super.awakeFromNib()
        titleLabel?.font = UIFont(name: "Avenir", size: 12)
    }
}

import UIKit

// Filled button: the look is configured in init(frame:), which is the path
// used when the button is created programmatically.
class BtnPlein: Button {
    override func awakeFromNib() {
        super.awakeFromNib()
    }
    
    var myValue: Int
    
    ///Constructor: - init
    override init(frame: CGRect) {
        // set myValue before super.init is called
        self.myValue = 0
        
        super.init(frame: frame)
        layer.borderWidth = 6/UIScreen.main.nativeScale
        layer.backgroundColor = UIColor(red:0.24, green:0.51, blue:1.00, alpha:1.0).cgColor
        setTitleColor(.white, for: .normal)
        titleLabel?.font = UIFont(name: "Avenir", size: 22)
        layer.borderColor = UIColor(red:0.24, green:0.51, blue:1.00, alpha:1.0).cgColor
        layer.cornerRadius = 5
    }
    
    // Storyboard/nib loading is not supported for this button.
    required init?(coder aDecoder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }
}

import UIKit

// Large variant of the filled button.
class BtnPleinLarge: BtnPlein {
    override func awakeFromNib() {
        super.awakeFromNib()
        contentEdgeInsets = UIEdgeInsets(top: 0, left: 16, bottom: 0, right: 16)
    }
}

BtnPleinLarge is our new button, and we use it to create the two main buttons in ViewController.swift, our main view.

I have two styles in my model, so I’ll make one button for each style.

Now, set the layout and buttons with some logic as well:

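// lazy var (rather than let) so that `self` is the view controller instance when the target is added
lazy var style_1: BtnPleinLarge = {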
        let button = BtnPleinLarge()
        button.translatesAutoresizingMaskIntoConstraints = false
        button.addTarget(self, action: #selector(buttonToUploadStyle1(_:)), for: .touchUpInside)
        button.setTitle("Style 1", for: .normal)
        let icon = UIImage(named: "upload")?.resized(newSize: CGSize(width: 50, height: 50))
        button.addRightImage(image: icon!, offset: 30)
        button.backgroundColor = .systemOrange
        button.layer.borderColor = UIColor.systemOrange.cgColor
        button.layer.shadowOpacity = 0.3
        button.layer.shadowColor = UIColor.systemOrange.cgColor
        button.layer.shadowOffset = CGSize(width: 1, height: 5)
        button.layer.cornerRadius = 10
        button.layer.shadowRadius = 8
        button.layer.masksToBounds = true
        button.clipsToBounds = false
        button.contentHorizontalAlignment = .left
        button.layoutIfNeeded()
        button.contentEdgeInsets = UIEdgeInsets(top: 0, left: 0, bottom: 0, right: 20)
        button.titleEdgeInsets.left = 0
        
        return button
    }()
    
    lazy var style_2: BtnPleinLarge = {
        let button = BtnPleinLarge()
        button.translatesAutoresizingMaskIntoConstraints = false
        button.addTarget(self, action: #selector(buttonToUploadStyle2(_:)), for: .touchUpInside)
        button.setTitle("Style 2", for: .normal)
        let icon = UIImage(named: "upload")?.resized(newSize: CGSize(width: 50, height: 50))
        button.addRightImage(image: icon!, offset: 30)
        button.backgroundColor = .systemRed
        button.layer.borderColor = UIColor.systemRed.cgColor
        button.layer.shadowOpacity = 0.3
        button.layer.shadowColor = UIColor.systemRed.cgColor
        button.layer.shadowOffset = CGSize(width: 1, height: 5)
        button.layer.cornerRadius = 10
        button.layer.shadowRadius = 8
        button.layer.masksToBounds = true
        button.clipsToBounds = false
        button.contentHorizontalAlignment = .left
        button.layoutIfNeeded()
        button.contentEdgeInsets = UIEdgeInsets(top: 0, left: 0, bottom: 0, right: 20)
        button.titleEdgeInsets.left = 0
        
        return button
    }()

We now need to set up some logic. It’s important to edit the Info.plist file and add a property that explains to the user why we need access to the photo library. Add some text to “Privacy - Photo Library Usage Description” (the NSPhotoLibraryUsageDescription key), then add the handlers for the two buttons:

  @objc func buttonToUploadStyle1(_ sender: BtnPleinLarge) {
        self.style = 1
        if UIImagePickerController.isSourceTypeAvailable(.photoLibrary) {
            let imagePicker = UIImagePickerController()
            imagePicker.delegate = self
            imagePicker.sourceType = .photoLibrary
            imagePicker.allowsEditing = false
            self.present(imagePicker, animated: true, completion: nil)
        }
    }
    
    @objc func buttonToUploadStyle2(_ sender: BtnPleinLarge) {
        self.style = 2
        if UIImagePickerController.isSourceTypeAvailable(.photoLibrary) {
            let imagePicker = UIImagePickerController()
            imagePicker.delegate = self
            imagePicker.sourceType = .photoLibrary
            imagePicker.allowsEditing = false
            self.present(imagePicker, animated: true, completion: nil)
        }
    }

Of course, you need to set up the layout and add the subviews to the view, too. I’ve added a logo on top of the view as well:

override func viewDidLoad() {
        super.viewDidLoad()
        view.backgroundColor = #colorLiteral(red: 0.9693209529, green: 0.9324963689, blue: 0.973600328, alpha: 1)
        addSubviews()
        setupLayout()
    }
    
    func addSubviews() {
        view.addSubview(logo)
        view.addSubview(style_1)
        view.addSubview(style_2)
    }
    
    func setupLayout() {
        
        logo.centerXAnchor.constraint(equalTo: self.view.centerXAnchor).isActive = true
        logo.topAnchor.constraint(equalTo: self.view.safeTopAnchor, constant: 20).isActive = true
        
        style_2.centerXAnchor.constraint(equalTo: view.centerXAnchor).isActive = true
        style_2.bottomAnchor.constraint(equalTo: view.bottomAnchor, constant: -120).isActive = true
        style_2.widthAnchor.constraint(equalToConstant: view.frame.width - 40).isActive = true
        style_2.heightAnchor.constraint(equalToConstant: 80).isActive = true
        
        style_1.centerXAnchor.constraint(equalTo: view.centerXAnchor).isActive = true
        style_1.widthAnchor.constraint(equalToConstant: view.frame.width - 40).isActive = true
        style_1.heightAnchor.constraint(equalToConstant: 80).isActive = true
        style_1.bottomAnchor.constraint(equalTo: style_2.topAnchor, constant: -40).isActive = true
    }

Output ViewController: Where We Show Our Result

Here, we need two things:

1. Our transformed image:

    let outputImage: UIImageView = {
        let image = UIImageView(image: UIImage())
        image.translatesAutoresizingMaskIntoConstraints = false
        image.contentMode = .scaleAspectFit
        return image
    }()

2. A button to dismiss the view:

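// lazy var (rather than let) so that `self` is the view controller instance when the target is added
lazy var dissmissButton: BtnPleinLarge = {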
        let button = BtnPleinLarge()
        button.translatesAutoresizingMaskIntoConstraints = false
        button.addTarget(self, action: #selector(buttonToDissmiss(_:)), for: .touchUpInside)
        button.setTitle("Done", for: .normal)
        button.backgroundColor = #colorLiteral(red: 1, green: 0.1491314173, blue: 0, alpha: 1)
        button.layer.borderColor = #colorLiteral(red: 1, green: 0.1491314173, blue: 0, alpha: 1).cgColor
        return button
    }()

Don’t forget to add the subviews to the main view and set up the layout, too.

Set up the delegate

Before we can pass the image through the model, we need to convert the original image to a square image in the format the model expects. The default is 256×256, but I noticed that the model supports sizes up to 1024×1024, so that is the size I chose. This way the output has decent quality, unlike a pixelated 256×256 image.

Here’s our helper function:

    // Draws `image` into a 1024×1024 32-bit ARGB pixel buffer that can be passed to the model.
    func pixelBuffer(from image: UIImage) -> CVPixelBuffer? {
        let side = 1024

        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue, kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer : CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault, side, side, kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
        guard status == kCVReturnSuccess, let buffer = pixelBuffer else {
            return nil
        }

        CVPixelBufferLockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0))
        // Unlock on every exit path, including the early return below.
        defer { CVPixelBufferUnlockBaseAddress(buffer, CVPixelBufferLockFlags(rawValue: 0)) }

        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        guard let context = CGContext(data: CVPixelBufferGetBaseAddress(buffer), width: side, height: side, bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(buffer), space: rgbColorSpace, bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else {
            return nil
        }

        // Core Graphics puts the origin at the bottom left, so flip the context before drawing with UIKit.
        context.translateBy(x: 0, y: CGFloat(side))
        context.scaleBy(x: 1.0, y: -1.0)

        UIGraphicsPushContext(context)
        image.draw(in: CGRect(x: 0, y: 0, width: side, height: side))
        UIGraphicsPopContext()

        return buffer
    }

Now that we have our helper, we can handle the image picked from the library in the UIImagePickerControllerDelegate method:

func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey : Any]) {
        var stylizedImage: UIImage?
        
        if let pickedImage = info[UIImagePickerController.InfoKey.originalImage] as? UIImage,
           let selectedStyle = self.style {
            
            let model = MyStyleTransfer_30000()
            
            // The model picks a style through a one-hot vector with one entry per trained style.
            let numStyles  = 2
            let styleIndex = selectedStyle - 1
            
            if let styleArray = try? MLMultiArray(shape: [numStyles] as [NSNumber], dataType: .double),
               let image = pixelBuffer(from: pickedImage) {
                for i in 0..<styleArray.count {
                    styleArray[i] = 0.0
                }
                styleArray[styleIndex] = 1.0
                
                do {
                    let predictionOutput = try model.prediction(image: image, index: styleArray)

                    // Convert the output pixel buffer back into a UIImage.
                    let ciImage = CIImage(cvPixelBuffer: predictionOutput.stylizedImage)
                    let tempContext = CIContext(options: nil)
                    if let tempImage = tempContext.createCGImage(ciImage, from: ciImage.extent) {
                        stylizedImage = UIImage(cgImage: tempImage)
                    }
                } catch {
                    print("CoreML Model Error: \(error)")
                }
            }
        }
        
        // Dismiss the picker once, then present the result after the dismissal has finished.
        picker.dismiss(animated: true) {
            guard let stylizedImage = stylizedImage else { return }
            let controller = OutputViewController()
            controller.outputImage.image = stylizedImage
            self.present(controller, animated: true, completion: nil)
        }
    }
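The style selection in the delegate boils down to building a one-hot vector: all zeros, with a single 1.0 at the position of the chosen style. As a language-neutral illustration, here is the same logic as a small Python sketch. The `one_hot` helper is hypothetical and only mirrors the loop above, taking the 1-based style number stored by the two buttons.

```python
def one_hot(style_number, num_styles):
    """Return a one-hot list for a 1-based style number."""
    if not 1 <= style_number <= num_styles:
        raise ValueError(f"style_number must be between 1 and {num_styles}")
    vector = [0.0] * num_styles
    vector[style_number - 1] = 1.0  # convert the 1-based number to a 0-based index
    return vector

print(one_hot(1, 2))  # vector for the "Style 1" button
print(one_hot(2, 2))  # vector for the "Style 2" button
```

If you train a model with more styles, only the length of the vector changes. The index of each style follows the order in which the style images were loaded during training.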

Final result

I would say that any application that does any kind of image processing should have some kind of filters or style transfer, because users are steadily coming to expect it right inside the application.

With the help of Turi Create, developers have no excuse not to implement it. One thing to remember is that the artistic images should be interesting enough to transfer compelling textures to the input image.

The biggest problem with style transfer is the amount of processing power needed to perfect the style. You can use the free Tesla K80 GPU offered by Google, but it still comes up short when fine-tuning the model for the best results. I would say that if styling is a core feature of your application, a GPU tower is a worthwhile investment.

If you liked this piece, please clap and share it with your friends. If you have any questions, don’t hesitate to send me an email.

Fritz
