Snapchat Lens Recipe: “Clone”

A quick look at how we built our SnapML-powered Lens that lets you instantly crop, clone, and manipulate objects in both 2D and 3D

Clone is the latest SnapML-powered Lens built by the team at Fritz AI. As the name suggests, the Lens lets users create digital clones of objects in the real world. If you haven’t already, try it out for yourself in Snapchat.

In this post, I want to provide a quick behind-the-scenes look at how the Lens works and how we leveraged state-of-the-art AI models with Snap Lens Studio and SnapML to create the cloning effect.

The AI

The first step of any AI / ML problem is to define the task. In this case, we wanted to look at an image or a section of an image, and separate out the foreground object from the background. This task is generally known as saliency detection and is closely related to image segmentation tasks (with two classes, background and foreground).

Using our own expertise designing small, efficient neural networks, along with some inspiration from the impressive U²-Net model, we created a saliency model that produces high-quality segmentations while fitting comfortably under Lens Studio’s 10 MB asset limit.
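
To make that concrete, here’s a minimal sketch of the kind of encoder-decoder network (a downsampling path, an upsampling path, and skip connections between them) that U²-Net-style saliency models are built from. It’s written with TensorFlow.js purely for illustration; the function name, layer sizes, and 256×256 input below are placeholders rather than our production architecture, which was trained and converted for SnapML separately.

```typescript
// Illustrative only: a tiny U-Net-style saliency network. All sizes are made up.
import * as tf from '@tensorflow/tfjs';

function convBlock(x: tf.SymbolicTensor, filters: number): tf.SymbolicTensor {
  // Two 3x3 convolutions keep the block small while adding nonlinearity.
  const a = tf.layers
    .conv2d({ filters, kernelSize: 3, padding: 'same', activation: 'relu' })
    .apply(x) as tf.SymbolicTensor;
  return tf.layers
    .conv2d({ filters, kernelSize: 3, padding: 'same', activation: 'relu' })
    .apply(a) as tf.SymbolicTensor;
}

export function buildTinySaliencyNet(size = 256): tf.LayersModel {
  const input = tf.input({ shape: [size, size, 3] });

  // Encoder: downsample spatially while growing the channel count.
  const e1 = convBlock(input, 16);
  const p1 = tf.layers.maxPooling2d({ poolSize: 2 }).apply(e1) as tf.SymbolicTensor;
  const e2 = convBlock(p1, 32);
  const p2 = tf.layers.maxPooling2d({ poolSize: 2 }).apply(e2) as tf.SymbolicTensor;

  // Bottleneck.
  const b = convBlock(p2, 64);

  // Decoder: upsample and concatenate encoder features (skip connections).
  const u2 = tf.layers.upSampling2d({ size: [2, 2] }).apply(b) as tf.SymbolicTensor;
  const d2 = convBlock(tf.layers.concatenate().apply([u2, e2]) as tf.SymbolicTensor, 32);
  const u1 = tf.layers.upSampling2d({ size: [2, 2] }).apply(d2) as tf.SymbolicTensor;
  const d1 = convBlock(tf.layers.concatenate().apply([u1, e1]) as tf.SymbolicTensor, 16);

  // One-channel sigmoid output: a per-pixel probability that the pixel belongs to
  // the salient foreground object, which later becomes the cutout's alpha mask.
  const mask = tf.layers
    .conv2d({ filters: 1, kernelSize: 1, padding: 'same', activation: 'sigmoid' })
    .apply(d1) as tf.SymbolicTensor;

  return tf.model({ inputs: input, outputs: mask });
}
```

The interesting trade-off lives in those channel counts: keeping them small is what makes a sub-10 MB export realistic, while the skip connections (and, in U²-Net, its nested U-blocks) help keep the mask sharp at object boundaries.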

The Lens

With a model in hand, we needed to build the rest of the cloning experience around it in Lens Studio. We started with a list of user experience requirements. The Lens user would be able to:

  1. Manipulate a cropping box with just one finger
  2. Tap the screen to clone an object
  3. Move the cloned object in either 2D or 3D space
  4. Delete cloned objects and start over

To achieve this experience, we used a fairly complicated render pipeline in Lens Studio. At a high level, it works like this:

  1. A perspective camera with device tracking builds a 3D map of the world, identifies horizontal planes, and renders what it sees to a render target.
  2. An orthographic camera starts with the image from the world camera and overlays the UI onto it before rendering to a separate render target that’s used for Live views—this way, the UI won’t be seen when watching a recorded Snap.
  3. When the box is tapped, a screen crop texture crops the capture target and copies the frame, saving it to another texture that is used to create our cloned object (see the script sketch after this list).
  4. The freeze frame texture is then fed into our SnapML model to create an output texture, which will become the alpha channel masking out the background around our object.
  5. A custom material graph combines the original freeze frame texture with the alpha mask to produce a texture that contains only pixels belonging to our cloned object.
  6. That material is then applied to a Screen Image (in 2D mode) or a Mesh in the Cutout prefab by Snap (in 3D mode).
  7. Touch and manipulation controls allow a user to then move the cloned sticker around the scene.
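
To give a sense of how steps 3 through 5 fit together at the script level, here’s a deliberately simplified sketch written against Lens Studio’s JavaScript scripting API (in TypeScript-compatible form). The input names and the overall wiring are hypothetical placeholders rather than our actual project setup.

```typescript
// Hypothetical sketch of the tap-to-clone flow; input names below are made up.
// @input Component.MLComponent saliencyModel  // SnapML component wrapping the saliency model
// @input Component.Image cloneImage           // Screen Image that displays the 2D cutout
declare const script: any; // `script` is provided by Lens Studio at runtime

// Step 3: the user taps the screen while the crop box is positioned over an object.
const tapEvent = script.createEvent("TapEvent");
tapEvent.bind(function () {
  // At this point the screen crop texture has already frozen the cropped region of
  // the capture target into the model's input texture (assumed to be wired up in the scene).

  // Step 4: run the SnapML model synchronously on the frozen frame.
  script.saliencyModel.runImmediate(true);

  // Step 5: the model's output texture serves as the alpha mask that the custom
  // material graph combines with the freeze frame; here we simply reveal the result.
  script.cloneImage.enabled = true;
});
```

From there, the 2D/3D toggle decides whether the cutout material lands on a Screen Image or on the mesh in Snap’s Cutout prefab, and the manipulation controls in step 7 take over.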

While things can get complicated at times, Lens Studio and SnapML are extremely powerful tools, and we had a lot of fun creating this experience!

Fritz

Our team has been at the forefront of Artificial Intelligence and Machine Learning research for more than 15 years, and we're using our collective intelligence to help others learn, understand, and grow using these new technologies in ethical and sustainable ways.
