I’m defining the relationship between the two tables using a join table. I want to arrange them in the order of many overlapping things. Currently, we are using subquery, is there a way to get the same result using join?
People                     FoodTable                  PeopleFood
ID |  NAME                 ID |  Food                 ID | PeopleId | FoodId  
1     BOB                  1     Hamberger            1     1           1
2     JOHN                 2     Pizza                2     1           2
3     KATY                 3     Chicken              3     1           3
4     MILLER               4     Salad                4     2           1
5     AMANDA               5     Sushi                5     2           2
                                                      6     2           3
                                                      7     3           2
                                                      8     3           3
                                                      9     4           3
                                                      10    4           5
                                                      11    5           5
When the table is defined in this way, I want to arrange food tastes similar to Bob’s. I’m doing it like this now.
SELECT people_id, COUNT(people_id) as count FROM peopleFood WHERE food_id IN (SELECT food_id FROM peopleFood WHERE people_id = 1) AND people_id != 1 GROUP BY people_id ORDER BY count DESC; -- Result ------------- People_id | count 2 3 3 2 4 1
Is there a better way to change this method or use join? Thank you!!!
Advertisement
Answer
You have been inconsistent in your use of the table and column names –
Tables – PeopleFood in your sample data but you reference peopleFood in your query.
Columns – PeopleId and FoodId in your sample data but you reference people_id and food_id in your query.
Choose a naming convention and stick to it. Everyone has there own preference but the important thing is to be consistent.
The equivalent query with INNER JOIN instead of your sub-query is –
SELECT
    `pf2`.`people_id`,
    COUNT(`pf2`.`food_id`) as `count`
FROM `PeopleFood` `pf1`
INNER JOIN `PeopleFood` `pf2`
    ON `pf2`.`people_id` <> `pf1`.`people_id`
    AND `pf2`.`food_id` = `pf1`.`food_id`
WHERE `pf1`.`people_id` = 1
GROUP BY `pf2`.`people_id`
ORDER BY `count` DESC;
The performance difference between the two queries is unlikely to be noticeable and it might be argued that the intent is clearer in your version with the sub-query.
The surrogate key ID on your PeopleFood table should be dropped in favour of the compound “natural” primary key on people_id and food_id.