Skip to content
Advertisement

SQL query to match similar customers

I’m trying to find the query in order to match similar customers.

To simplify the situation consider this scenario:

I have a table which contains a customer name and product purchased. customer name can have multiple purchases of same and different products. So firstly I can take distinct customer name and product name, so I see all customers and all products they purchased at least once.

Now I want a query to show me a sort of matching customers, according to the product they both purchased, so I want to count the similar products they purchased.

So I want to see for each pair of customers (pairing all the table) the amount of similar product they purchased.

Lets say the raw data is:

CustomerName | ProductName
A            | 1
A            | 2
A            | 1
A            | 3
B            | 1
B            | 2
B            | 4
C            | 2

Then I want to see the result of:

CustomerName1 | CustomerName2 | CountSimilarity
A             | B             | 2
A             | C             | 1
B             | C             | 1

And so on for all pairs of customers that have at least 1 similar product purchasing

Any suggestions how to approach this query?

The environment is SQL Server.

Thanks

Advertisement

Answer

Here is a self join approach:

SELECT t1.CustomerName, t2.CustomerName, COUNT(*) AS CountSimilarity
FROM yourTable t1
INNER JOIN yourTable t2
    ON t1.ProductName = t2.ProductName
WHERE
    t1.CustomerName < t2.CustomerName
GROUP BY
    t1.CustomerName, t2.CustomerName;

Two records are joined together above if their products match. Note that the inequality in the WHERE clause ensures that customer pairs do not appear in duplicate.

User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement