I have one table of users.
users | user_id | name | | ------- | ------ | | 1 | Jerry | | 2 | George | | 3 | Elaine | | 4 | Kramer |
I have one table that links roles to users, and roles are assigned at a tree.
user_roles | user_id | role_id | tree_id | | ------- | ------- | ------- | | 1 | 5 | 1 | | 1 | 5 | 2 | | 2 | 6 | 1 | | 3 | 7 | 1 | | 4 | 8 | 1 |
I need to only return results where a user’s role is assigned at a certain tree_id, so I’m checking all the roles and trees. At the end I want it to return one row per user.
I’m using Knex and doing a query that looks like:
knex('users')
.leftJoin('user_roles', {'user.user_id': 'user_roles.user_id'})
.whereIn('user_roles.tree_id', arrayOfTreeIds)
.andWhere(moreFilters)
SELECT * FROM users LEFT JOIN user_roles on users.user_id = user_roles.user_id WHERE user_roles.tree_id in (1, 2, 3)
I’m getting five results back instead of four, though. If I try to SELECT DISTINCT it tells me I need to GROUP BY, but I can’t get that to work. What do I need to do to get only one result per user id?
Advertisement
Answer
You have a user that matches on two different tree_ids, so this multiplies the rows.
In pure SQL, you could use exists instead of a join:
SELECT *
FROM users u
WHERE EXISTS (
SELECT 1
FROM user_roles ur
WHERE ur.user_id = u.user_id AND ur.tree_id in (1, 2, 3)
)
Another option is aggregation:
SELECT u.* FROM users u INNER JOIN user_roles ur on u.user_id = ur.user_id WHERE ur.tree_id in (1, 2, 3) GROUP BY u.user_id
I changed the LEFT JOIN to an INNER JOIN, because that’s, in essence, what you want (and what your original query does).
You can even list the matched roles with string aggregation:
SELECT u.*, STRING_AGG(ur.tree_id::text, ',' ORDER BY ur.tree_id) tree_ids FROM users u INNER JOIN user_roles ur on u.user_id = ur.user_id WHERE ur.tree_id in (1, 2, 3) GROUP BY u.user_id
Disclaimer: I don’t know how to write this in knex!