
MySQL subquery to self join

I’m defining the relationship between two tables using a join table. I want to rank the results by how many rows they have in common. Currently I use a subquery; is there a way to get the same result using a join?

People                     FoodTable                  PeopleFood
ID |  NAME                 ID |  Food                 ID | PeopleId | FoodId  
1     BOB                  1     Hamburger            1     1           1
2     JOHN                 2     Pizza                2     1           2
3     KATY                 3     Chicken              3     1           3
4     MILLER               4     Salad                4     2           1
5     AMANDA               5     Sushi                5     2           2
                                                      6     2           3
                                                      7     3           2
                                                      8     3           3
                                                      9     4           3
                                                      10    4           5
                                                      11    5           5

With the tables defined this way, I want to rank people by how similar their food tastes are to Bob’s. This is what I’m doing now:

SELECT people_id, COUNT(people_id) as count 
FROM peopleFood
WHERE food_id IN 
 (SELECT food_id FROM peopleFood
  WHERE people_id = 1)
AND people_id != 1
GROUP BY people_id
ORDER BY count DESC;


-- Result -------------
People_id | count
2           3
3           2
4           1

Is there a better way to do this, or a way to do it with a join? Thank you!


Answer

Your table and column names are inconsistent:

Tables: PeopleFood in your sample data, but peopleFood in your query.

Columns: PeopleId and FoodId in your sample data, but people_id and food_id in your query.

Choose a naming convention and stick to it. Everyone has their own preference, but the important thing is to be consistent.
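
To make the queries reproducible, here is a minimal setup sketch that sticks to one set of names. It assumes the table name PeopleFood from the sample data and the snake_case columns people_id and food_id from the query. The column types are assumptions, since the question does not show the real table definition.

```sql
-- Assumed definition: types and constraints are guesses, not the asker's real schema.
CREATE TABLE PeopleFood (
    id        INT NOT NULL AUTO_INCREMENT,
    people_id INT NOT NULL,
    food_id   INT NOT NULL,
    PRIMARY KEY (id)
);

-- Sample rows from the question, as (people_id, food_id).
INSERT INTO PeopleFood (people_id, food_id) VALUES
    (1, 1), (1, 2), (1, 3),   -- BOB
    (2, 1), (2, 2), (2, 3),   -- JOHN
    (3, 2), (3, 3),           -- KATY
    (4, 3), (4, 5),           -- MILLER
    (5, 5);                   -- AMANDA
```

With these rows, the query from the question returns counts of 3, 2 and 1 for people 2, 3 and 4, matching the result shown there.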

The equivalent query, using an INNER JOIN (a self join on PeopleFood) instead of your subquery, is:

SELECT
    `pf2`.`people_id`,
    COUNT(`pf2`.`food_id`) as `count`
FROM `PeopleFood` `pf1`
INNER JOIN `PeopleFood` `pf2`
    ON `pf2`.`people_id` <> `pf1`.`people_id`
    AND `pf2`.`food_id` = `pf1`.`food_id`
WHERE `pf1`.`people_id` = 1
GROUP BY `pf2`.`people_id`
ORDER BY `count` DESC;

The performance difference between the two queries is unlikely to be noticeable, and it could be argued that the intent is clearer in your version with the subquery.
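
One way to check this on your own data is to compare the execution plans with EXPLAIN. This is only a sketch: it assumes the table is named PeopleFood with columns people_id and food_id, and the plans you see will depend on your MySQL version, indexes, and table size.

```sql
-- Plan for the subquery version.
EXPLAIN
SELECT people_id, COUNT(people_id) AS `count`
FROM PeopleFood
WHERE food_id IN (SELECT food_id FROM PeopleFood WHERE people_id = 1)
  AND people_id != 1
GROUP BY people_id
ORDER BY `count` DESC;

-- Plan for the self-join version.
EXPLAIN
SELECT pf2.people_id, COUNT(pf2.food_id) AS `count`
FROM PeopleFood pf1
INNER JOIN PeopleFood pf2
    ON pf2.people_id <> pf1.people_id
   AND pf2.food_id = pf1.food_id
WHERE pf1.people_id = 1
GROUP BY pf2.people_id
ORDER BY `count` DESC;
```

Recent MySQL versions can rewrite an IN subquery as a semi-join, so the two plans may turn out to be very similar.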

The surrogate key ID on your PeopleFood table should be dropped in favour of a compound “natural” primary key on (people_id, food_id), which also prevents the same person/food pair from being stored twice.
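
A minimal sketch of that change, assuming the surrogate column is named id, the table contains no duplicate (people_id, food_id) pairs, and nothing else (such as a foreign key) references id. The index name idx_food_people is hypothetical; the extra index is suggested because the self join looks rows up by food_id.

```sql
-- Drop the surrogate key and make the pair itself the primary key.
-- Fails if duplicate (people_id, food_id) pairs already exist.
ALTER TABLE PeopleFood
    DROP COLUMN id,
    ADD PRIMARY KEY (people_id, food_id);

-- Optional: support lookups by food_id (index name is hypothetical).
CREATE INDEX idx_food_people ON PeopleFood (food_id, people_id);
```

Try this on a copy of the table first, since rebuilding the primary key on a large table can take a while.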

Further reading: “The Cost of Useless Surrogate Keys in Relationship Tables”

User contributions licensed under: CC BY-SA