I need help with a SELECT SQL query that will left join with another table on id column, which will in turn return the latest values based on time, grouped into a single row?
Basically, join the two tables in such a way that for each record in users table that exists in time_series table, return the latest values based on time grouped into a single row.
users table:
id | name ----+-------- 1 | "Joe" 2 | "Ron"
time_series table:
time| id | a | b | c | d ----+-----+-----+-----+----+---- 1 | 1 | a1 | | | 2 | 1 | | b1 | | 3 | 1 | | | c0 | 4 | 1 | a3 | | c3 | 5 | 1 | a0 | | | 6 | 2 | a3 | | c3 | d1 7 | 2 | a2 | | | d3
The result should look like this:
id | a | b | c | d | name ----+-----+-----+----+----+------ 1 | a0 | b1 | c3 | | "Joe" 2 | a2 | | | d3 | "Ron"
Advertisement
Answer
One option uses a couple of subqueries:
select u.*, (select ts.a from time_series ts where ts.id = u.id and ts.a is not null order by ts.time desc limit 1) a, (select ts.b from time_series ts where ts.id = u.id and ts.b is not null order by ts.time desc limit 1) b, (select ts.c from time_series ts where ts.id = u.id and ts.c is not null order by ts.time desc limit 1) c, (select ts.d from time_series ts where ts.id = u.id and ts.d is not null order by ts.time desc limit 1) d from users u
Another solution is to rank rows with row_number()
for each column, giving priority to non-null
values, and thenconditional aggregation:
select u.id, max(case when ts.rn_a = 1 then ts.a end) a, max(case when ts.rn_b = 1 then ts.b end) a, max(case when ts.rn_c = 1 then ts.c end) a, max(case when ts.rn_d = 1 then ts.d end) d from users u inner join ( select ts.*, row_number() over(order by (case when a is null then 1 else 0 end), time desc) rn_a, row_number() over(order by (case when b is null then 1 else 0 end), time desc) rn_b, row_number() over(order by (case when c is null then 1 else 0 end), time desc) rn_c, row_number() over(order by (case when d is null then 1 else 0 end), time desc) rn_d from time_series ts ) ts on ts.id = u.id group by u.id