I’m defining the relationship between the two tables using a join table. I want to arrange them in the order of many overlapping things. Currently, we are using subquery, is there a way to get the same result using join?
People FoodTable PeopleFood ID | NAME ID | Food ID | PeopleId | FoodId 1 BOB 1 Hamberger 1 1 1 2 JOHN 2 Pizza 2 1 2 3 KATY 3 Chicken 3 1 3 4 MILLER 4 Salad 4 2 1 5 AMANDA 5 Sushi 5 2 2 6 2 3 7 3 2 8 3 3 9 4 3 10 4 5 11 5 5
When the table is defined in this way, I want to arrange food tastes similar to Bob’s. I’m doing it like this now.
SELECT people_id, COUNT(people_id) as count FROM peopleFood WHERE food_id IN (SELECT food_id FROM peopleFood WHERE people_id = 1) AND people_id != 1 GROUP BY people_id ORDER BY count DESC; -- Result ------------- People_id | count 2 3 3 2 4 1
Is there a better way to change this method or use join? Thank you!!!
Advertisement
Answer
You have been inconsistent in your use of the table and column names –
Tables – PeopleFood
in your sample data but you reference peopleFood
in your query.
Columns – PeopleId
and FoodId
in your sample data but you reference people_id
and food_id
in your query.
Choose a naming convention and stick to it. Everyone has there own preference but the important thing is to be consistent.
The equivalent query with INNER JOIN instead of your sub-query is –
SELECT `pf2`.`people_id`, COUNT(`pf2`.`food_id`) as `count` FROM `PeopleFood` `pf1` INNER JOIN `PeopleFood` `pf2` ON `pf2`.`people_id` <> `pf1`.`people_id` AND `pf2`.`food_id` = `pf1`.`food_id` WHERE `pf1`.`people_id` = 1 GROUP BY `pf2`.`people_id` ORDER BY `count` DESC;
The performance difference between the two queries is unlikely to be noticeable and it might be argued that the intent is clearer in your version with the sub-query.
The surrogate key ID
on your PeopleFood
table should be dropped in favour of the compound “natural” primary key on people_id
and food_id
.