The Key to Finding The One on Tinder? Vectors & Machine Learning

An intro to preference learning and pairing algorithms using techniques from linear algebra, statistics, and machine learning

I recently had a friend text me and say, “Andrew, I’ve been getting a ton of matches on Tinder, but I still haven’t been able to find the one. I think it’s because I’m not using enough linear algebra. Can you help me out?”

And I replied, “Wow, that’s a weirdly specific question. This sounds like a fake situation. But yes, of course, I’ll see what I can do.” In this article, we’ll set out to help my friend find the one. But how?

We’re going to break this question up into a few parts. In part 1, we’ll take a look at building vector representations for human characteristics and calculating alignment between two vectors. In part 2, we’ll use a preference learning algorithm to weight categories and return the most relevant matches. And hopefully, along the way, we can solve my friend’s problem.

Part 1: Vector Alignment

Our first task is to discretize human characteristics into categories. The bark of that sentence is much worse than its bite. To discretize means to represent a set using distinct and individual quantities. For example, we can discretize the human characteristic of introversion into 5 categories: Very Introverted, Somewhat Introverted, Neither Introverted nor Extroverted, Somewhat Extroverted, Very Extroverted. Here, we’re using the Likert Scale.

And we can do this for any human characteristic, like sense of humor or intelligence. We can also do this for any interest. Let’s say I love going camping — one of my characteristics could range from Not at All Interested in Camping to Very Interested in Camping. Building your set of characteristics, however, is up to you. Mathematics can’t help you too much here.

Let’s say I have a set of characteristics C = {C₀, C₁, C₂, C₃}, where C₀ could be Introversion, C₁ could be sense of humor, and so on. Each Cᵢ will be on the Likert Scale, with each response corresponding to a number from 0 to 4. In the case of C₀, 0 = Very Introverted, 2 = Neither Introverted nor Extroverted, and 4 = Very Extroverted. Any human will then have a vector representation for C, where each Cᵢ holds their score on the Likert Scale for that characteristic. We’ll call this vector representation a C-vector.
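To make this concrete, here’s a minimal sketch in Python of how a C-vector might be assembled from Likert Scale responses. The category names and the scoring dictionary are hypothetical, just for illustration:

```python
# Map Likert responses for introversion to scores 0-4 (hypothetical labels)
LIKERT_INTROVERSION = {
    "Very Introverted": 0,
    "Somewhat Introverted": 1,
    "Neither Introverted nor Extroverted": 2,
    "Somewhat Extroverted": 3,
    "Very Extroverted": 4,
}

def c_vector(responses, categories):
    """Build a C-vector: one 0-4 score per category, in a fixed order."""
    return [responses[c] for c in categories]

categories = ["Introversion", "Humor", "Organization"]
me = c_vector({"Introversion": 3, "Humor": 1, "Organization": 4}, categories)
print(me)  # [3, 1, 4]
```

The fixed category order matters: two C-vectors are only comparable if position i means the same characteristic in both.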

Consider the example below. Suppose the only categories I care about in another human are Introversion, Humor, and Organization. And suppose I’m a bit extroverted, I can crack a couple of jokes, and I like to keep my house clean. This results in a C-vector of {3, 1, 4}. Suppose my ideal significant other is gregarious, a comedian, and has a messy room. This results in a vector of {4, 0, 1}. This is shown below.

Now that we have two vectors, we can measure the distance between them using Euclidean Distance. For two vectors p and q, the Euclidean Distance is the square root of the sum of the squared differences between corresponding elements.

We can divide this result by √(4²*N), where N is the number of categories. Since each element ranges from 0 to 4, √(4²*N) is the largest distance two vectors can possibly be apart, so this normalizes our result to a number between 0 and 1. We can interpret this value as percentage alignment: 0 means identical C-vectors, and 1 means total opposites. We’ll call this the normalized Euclidean distance (NED).
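As a sketch, the distance and its normalization fit in a few lines of Python (assuming every category is scored 0 to 4):

```python
import math

def euclidean(p, q):
    """Square root of the sum of squared differences between elements."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def ned(p, q, scale_max=4):
    """Normalized Euclidean distance: divide by sqrt(scale_max^2 * N),
    the largest distance possible, so the result lands in [0, 1]."""
    return euclidean(p, q) / math.sqrt(scale_max ** 2 * len(p))

print(round(ned([3, 1, 4], [4, 0, 1]), 3))  # 0.479
```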

However, as the old adage goes — “Opposites attract”. So, we’ll define a percentage to represent the ideal alignment. If you believe total opposites attract, your ideal alignment will be 1. If you want to date a carbon copy of yourself, it will be 0. Best of luck with that, Narcissus.

So, for the example above with vectors {3, 1, 4} and {4, 0, 1}, the NED is about 47.9%. We can return ideal matches by seeing which results are closest to your ideal alignment. But, going back to our original question, I gave my friend this algorithm, and they still haven’t found the one. So, naturally, we turn to more mathematics and machine learning.
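Putting the two ideas together, candidates could be ranked by how close their NED is to your chosen ideal alignment. A sketch, with invented candidate vectors:

```python
import math

def ned(p, q, scale_max=4):
    """Normalized Euclidean distance between two C-vectors, in [0, 1]."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    return dist / math.sqrt(scale_max ** 2 * len(p))

def rank_matches(me, candidates, ideal_alignment):
    """Sort candidates so those whose NED is nearest the ideal
    alignment (0 = carbon copy, 1 = total opposite) come first."""
    return sorted(candidates, key=lambda c: abs(ned(me, c) - ideal_alignment))

me = [3, 1, 4]
candidates = [[3, 1, 4], [4, 0, 1], [0, 4, 0]]
# With an ideal alignment of 0.5, the ~47.9%-distant candidate ranks first
print(rank_matches(me, candidates, ideal_alignment=0.5))
# [[4, 0, 1], [0, 4, 0], [3, 1, 4]]
```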

Intuitively, we know that not all preferences are created equal. For example, I may care more about humor than about organization. In this case, alignment in humor should be more important while alignment in organization should be less important. We should weight the categories appropriately. But how? Using machine learning!

Part 2: Machine Learning

Suppose we have an existing set of matches. We’ll call this set M = {M₀, M₁, M₂, … , Mᵣ₋₁}, with r matches. We’ll define our set of categories as C = {C₀, C₁, C₂, … , Cᵥ₋₁}, with v categories. We’ll partition our set M into two subsets, M₁ and M₂: M₁ contains our matches with whom we had positive interactions, and M₂ contains our matches with whom we had negative interactions.

The next step is to look for commonalities within the subsets M₁ and M₂. The idea here is simple: we can average each category across a subset to find the ideal person vector.

We’ll continually update this vector as M₁ and M₂ grow. We’ll call this vector Avg-M1. Similarly, we can create an unideal person vector by averaging the observed categories across M₂. We’ll call this Avg-M2.

Then, we can take a new person vector and calculate its distance to both Avg-M1 and Avg-M2. We’ll return results ranked by which vectors minimize the distance to Avg-M1 while maximizing the distance to Avg-M2.
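One way to sketch this in Python: average each subset, then score a candidate as its distance to Avg-M1 minus its distance to Avg-M2, so lower scores are better. The match vectors below are invented for illustration:

```python
import math

def mean_vector(vectors):
    """Average each category across a set of match vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def score(candidate, avg_m1, avg_m2):
    """Lower is better: close to the ideal vector, far from the unideal one."""
    return dist(candidate, avg_m1) - dist(candidate, avg_m2)

m1 = [[4, 0, 2], [4, 1, 2]]   # positive matches
m2 = [[0, 4, 2], [1, 4, 2]]   # negative matches
avg_m1, avg_m2 = mean_vector(m1), mean_vector(m2)

new_matches = [[0, 4, 2], [2, 2, 2], [4, 0, 2]]
ranked = sorted(new_matches, key=lambda c: score(c, avg_m1, avg_m2))
print(ranked)  # [[4, 0, 2], [2, 2, 2], [0, 4, 2]]
```

Note how the third category, identical everywhere, contributes nothing to the ordering — exactly the weighting behavior described below.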

We’ll use the data below to test our methods. Suppose we have two preferences among the 5 categories for both positive and negative matches. This is shown in green for M₁ and red for M₂.

As expected, all of the other categories are close to 2. We’ll simulate 10 new matches below. We expect Match-New-1 and Match-New-2 to perform well, since they are close to Avg-M1 and far from Avg-M2. Similarly, we expect Match-New-3 and Match-New-4 to perform poorly, since they are far from Avg-M1 and close to Avg-M2. The remaining matches are generated randomly and should not fall in any particular order. The results are shown below.

And we get exactly what we expected! We’ve effectively weighted our categories. For a category with an even distribution (e.g., the randomly generated C4), a new match’s value in that category does not have a large effect. However, if a category is important, its average will likely sit high or low — 0 means Very X and 4 means Very Y — so misalignment there is more pronounced. By ranking our new matches this way, we ensure our most relevant categories are weighted most heavily.

This algorithm readily generalizes to any pairing algorithm. Netflix could use it to recommend new movies. Barnes & Noble could use it to suggest similar books. Stockbrokers could use it to invest further into a sector or asset class by suggesting similar assets. And, of course, it works for Tinder.

Our approach here is strong because it only gets better with more data points. The more categories, the better. The more matches, the better. The more precision in your scale, the better. The more nuanced, the better. But, can we build upon this approach? Let’s consider the diagram below.

Suppose we fed this model a set of training data. Let’s define Y = 1 if it is a positive match and Y = 0 if it is a negative match. This is known as binary classification. Recall that M₁ is the set of all matches where Y = 1 and M₂ is the set of all matches where Y = 0. For each element in M₁ and M₂, we know C₁, C₂, and C₃. Therefore, we can train this model to assign weights that generate the correct output.

This is a very basic neural network, known as a single-layer perceptron. Our function f() is known as an activation function. Adding more inputs lets the model weigh more characteristics, capturing more nuance in what makes a match work.
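As a sketch of that idea, here is a single-layer perceptron trained with the classic perceptron learning rule on a tiny, invented set of match vectors — just the mechanics, not production machine learning:

```python
def step(z):
    """Step activation: the f() in the diagram."""
    return 1 if z >= 0 else 0

def train_perceptron(samples, labels, epochs=50, lr=0.1):
    """Learn one weight per category plus a bias, nudging them
    toward correct Y=1 / Y=0 predictions (perceptron learning rule)."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = y - step(sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Invented data: high C1 and low C2 tend to be positive matches
samples = [[4, 0, 2], [3, 1, 2], [0, 4, 2], [1, 3, 2]]
labels = [1, 1, 0, 0]
w, b = train_perceptron(samples, labels)
print(predict(w, b, [4, 1, 2]))  # 1 (classified as a positive match)
```

The learned weights play the same role as our hand-tuned category importances: the uninformative third category ends up with a weight near zero.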

Adding more layers yields a deep neural network that can classify with high accuracy. For those interested in exploring neural networks more, see here. For those interested in exploring neural networks and preference learning, here is a thorough technical paper.