Skip to content
Advertisement

Postgresql query to filter latest data based on 2 columns

Table Structure First

users table

id
1
2
3

sites table

id
1
2

site_memberships table

site_id user_id created_on
1 1 1
1 1 2
1 1 3
2 1 1
2 1 2
1 2 2
1 2 3

Assuming higher the created_on number, latest the record

Expected Output

site_id user_id created_on
1 1 3
2 1 2
1 2 3

Expected output: I need latest record for each user for each site membership.

Tried the following query, but this does not seem to work.

select * from users inner join
      (
        SELECT ROW_NUMBER () OVER (
            PARTITION BY sm.user_id,
            sm.created_on
        ), sm.* 
            from site_memberships sm 
            inner join sites s on sm.site_id=s.id
            ) site_memberships
ON site_memberships.user_id  = users.user_id where row_number=1```

Advertisement

Answer

I think you have overcomplicated the problem you want to solve.

You seem to want aggregation:

select site_id, user_id, max(created_on)
from site_memberships sm
group by site_id, user_id;

If you had additional columns that you wanted, you could use distinct on instead:

select distinct on (site_id, user_id) sm.*
from site_memberships sm
order by site_id, user_id, created_on desc;
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement