Skip to content
Advertisement

Group by multiple columns and limit per group – Postgres

I’m creating a messaging app as a side project and I’m trying to query a user’s conversations efficiently.

The messages table structure is basic right now with some dummy data:

I’d like to be able to query the DB and group by “conversation” – A.K.A any rows that have the same sender_id or receiver_id in either column – rows (1, 2, 3, 7), (4, 6), (5). I’d like to be able to limit each “group” to n rows and order them by the created_at column. It would ideally look like (created_at values are arbitrary numbers to show descending values):

Ideally there would be an additional column added that would number each group (easy to create multi-dimensional array.

So far I’ve been able to “group” by sender/receiver ids, order by created_at, and limit the number per group. However, It’s not quite right. Here’s the query:

My result set looks like this:

You can see that row 3 and 6 should be grouped but aren’t.

Advertisement

Answer

You can use rank(). To limit the number of records per conversation (ie sender/receiver or receiver/sender tuple), you can use a partition like least(sender_id, receiver_id), greatest(sender_id, receiver_id):

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement