Skip to content
Advertisement

Need an SQL query that will left join with another table, which will in turn return the latest values based on time, grouped into a single row

I need help with a SELECT SQL query that will left join with another table on id column, which will in turn return the latest values based on time, grouped into a single row?

Basically, join the two tables in such a way that for each record in users table that exists in time_series table, return the latest values based on time grouped into a single row.

users table:

id  |  name
----+--------
 1  | "Joe" 
 2  | "Ron" 

time_series table:

time| id  |  a  |  b  | c  | d
----+-----+-----+-----+----+----
 1  |  1  |  a1 |     |    |
 2  |  1  |     |  b1 |    |
 3  |  1  |     |     | c0 |
 4  |  1  |  a3 |     | c3 |
 5  |  1  |  a0 |     |    |
 6  |  2  |  a3 |     | c3 | d1
 7  |  2  |  a2 |     |    | d3

The result should look like this:

id  |  a  |  b  | c  | d  | name
----+-----+-----+----+----+------
 1  |  a0 |  b1 | c3 |    | "Joe"
 2  |  a2 |     |    | d3 | "Ron"

Advertisement

Answer

One option uses a couple of subqueries:

select u.*,
    (select ts.a from time_series ts where ts.id = u.id and ts.a is not null order by ts.time desc limit 1) a,
    (select ts.b from time_series ts where ts.id = u.id and ts.b is not null order by ts.time desc limit 1) b,
    (select ts.c from time_series ts where ts.id = u.id and ts.c is not null order by ts.time desc limit 1) c,
    (select ts.d from time_series ts where ts.id = u.id and ts.d is not null order by ts.time desc limit 1) d
from users u

Another solution is to rank rows with row_number() for each column, giving priority to non-null values, and thenconditional aggregation:

select u.id,
    max(case when ts.rn_a = 1 then ts.a end) a,
    max(case when ts.rn_b = 1 then ts.b end) a,
    max(case when ts.rn_c = 1 then ts.c end) a,
    max(case when ts.rn_d = 1 then ts.d end) d
from users u
inner join (
    select ts.*,
        row_number() over(order by (case when a is null then 1 else 0 end), time desc) rn_a,
        row_number() over(order by (case when b is null then 1 else 0 end), time desc) rn_b,
        row_number() over(order by (case when c is null then 1 else 0 end), time desc) rn_c,
        row_number() over(order by (case when d is null then 1 else 0 end), time desc) rn_d
    from time_series ts
) ts on ts.id = u.id
group by u.id
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement