Say I have a table of people who have used my service on a day N and a table describing what theme the users have changed to. There’s no table available telling me what theme they are currently using. What I would like to do is to see that on a daily basis.
So let’s say the changes table looks like this.
| user_ID | date | theme | |---------|----------|-------| | user1 | 1.1.2021 | Dark | | user1 | 4.1.2021 | Light | | user2 | 2.1.2021 | Dark | | user2 | 6.1.2021 | Light |
The activity table has just the user_ID and a date they accessed the service reported.
| user_ID | date | |---------|----------| | user1 | 1.1.2021 | | user1 | 2.1.2021 | | user1 | 3.1.2021 | | user1 | 4.1.2021 | | user1 | 5.1.2021 | | user1 | 6.1.2021 | | user2 | 2.1.2021 | | user2 | 3.1.2021 | | user2 | 4.1.2021 | | user2 | 5.1.2021 | | user2 | 6.1.2021 |
Now what I would like to do is join the first table to the second one so the theme they are using on an active date would be listed there.
| user_ID | date | theme | |---------|----------|-------| | user1 | 1.1.2021 | Dark | | user1 | 2.1.2021 | Dark | | user1 | 3.1.2021 | Dark | | user1 | 4.1.2021 | Light | | user1 | 5.1.2021 | Light | | user1 | 6.1.2021 | Light | | user2 | 2.1.2021 | Dark | | user2 | 3.1.2021 | Dark | | user2 | 4.1.2021 | Dark | | user2 | 5.1.2021 | Dark | | user2 | 6.1.2021 | Light |
How do I achieve this? Assume there can be an unlimited amount of themes.
Advertisement
Answer
One method is a correlated subquery, but I’m not sure if Presto supports this:
select a.*, (select c.theme from changes c where c.user_id = a.user_id and c.date <= a.date order by c.date desc limit 1 ) as theme from activity a;
A perhaps more efficient method is to use left join
but to calculate the “end” date for each change:
select a.*, c.theme from activity a left join (select c.*, lead(date) over (partition by user_id order by date) as next_date from changes c ) c on a.user_id = c.user_id and a.date >= c.date and (a.date < c.next_date or c.next_date is null);