I have one table of users.
users
| user_id | name |
| ------- | ------ |
| 1 | Jerry |
| 2 | George |
| 3 | Elaine |
| 4 | Kramer |
I have one table that links roles to users, and roles are assigned at a tree.
user_roles
| user_id | role_id | tree_id |
| ------- | ------- | ------- |
| 1 | 5 | 1 |
| 1 | 5 | 2 |
| 2 | 6 | 1 |
| 3 | 7 | 1 |
| 4 | 8 | 1 |
I need to only return results where a user’s role is assigned at a certain tree_id, so I’m checking all the roles and trees. At the end I want it to return one row per user.
I’m using Knex and doing a query that looks like:
knex('users')
.leftJoin('user_roles', {'user.user_id': 'user_roles.user_id'})
.whereIn('user_roles.tree_id', arrayOfTreeIds)
.andWhere(moreFilters)
SELECT *
FROM users
LEFT JOIN user_roles on users.user_id = user_roles.user_id
WHERE user_roles.tree_id in (1, 2, 3)
I’m getting five results back instead of four, though. If I try to SELECT DISTINCT
it tells me I need to GROUP BY
, but I can’t get that to work. What do I need to do to get only one result per user id?
Advertisement
Answer
You have a user that matches on two different tree_id
s, so this multiplies the rows.
In pure SQL, you could use exists
instead of a join
:
SELECT *
FROM users u
WHERE EXISTS (
SELECT 1
FROM user_roles ur
WHERE ur.user_id = u.user_id AND ur.tree_id in (1, 2, 3)
)
Another option is aggregation:
SELECT u.*
FROM users u
INNER JOIN user_roles ur on u.user_id = ur.user_id
WHERE ur.tree_id in (1, 2, 3)
GROUP BY u.user_id
I changed the LEFT JOIN
to an INNER JOIN
, because that’s, in essence, what you want (and what your original query does).
You can even list the matched roles with string aggregation:
SELECT u.*, STRING_AGG(ur.tree_id::text, ',' ORDER BY ur.tree_id) tree_ids
FROM users u
INNER JOIN user_roles ur on u.user_id = ur.user_id
WHERE ur.tree_id in (1, 2, 3)
GROUP BY u.user_id
Disclaimer: I don’t know how to write this in knex!