I have a table like this:
| db | chat_id | Admin | user |
+-------------+-------------------+------------+---------------+
| db_1 | chat_id1 | max | greg |
| db_1 | chat_id2 | max | bob |
| db_1 | chat_id3 | max | greg |
| db_1 | chat_id2 | helen | greg |
| db_2 | chat_id1 | alan | greg |
I would like to retrieve the number of chat performed by users for each database (db) and the last part where I fail, retrieve also a list of all mentors by users.
The final output should be like this for example (notice there is only one time max for greg in the admin column)
| db | user | nb_of_chat | admins |
+-------------+---------------+--------------+---------------+
| db_1 | greg | 3 | max, helen |
| db_1 | bob | 1 | max |
| db_2 | greg | 1 | alan |
I wrote the following query but it doesn’t aggregate the admins and i have separated nb_of chats/mentors.
SELECT db, user, COUNT(chat_id), admins
FROM "chat_db"."chats"
GROUP BY db, user, admins;
As expected I am getting the following result (but I only want it on one line by db/user with grouped admin in the same column):
| db | user | nb_of_chat | admins |
+-------------+---------------+--------------+---------------+
| db_1 | greg | 2 | max |
| db_1 | greg | 1 | helen |
| | | | |
Have you an idea how to perform it ?
Thank you for your time !
Regards.
Advertisement
Answer
Firsly, remove admins
from the group by
clause, since you want to aggregate it. Then, in Presto, you can do string aggregation as follows:
select db,user, count(*) no_of_chats,
array_join(array_agg(admins), ', ') all_admins
from "chat_db"."chats"
group by db, user;
You can add an order by
clause to array_agg()
if needed:
select db,user, count(*) no_of_chats,
array_join(array_agg(admins order by admins), ', ') all_admins
from "chat_db"."chats"
group by db, user;
Note that I changed count(chat_id)
to count(*)
: both are equivalent (since chat_id
probably is a non-null
able column), and the former is (sligthly) more efficient, and makes the intent clearer in my opinion.