Skip to content
Advertisement

Distribute values across subsequent rows in Postgresql

I have two tables. One contains user state transitions.

create table state (
  id serial primary key,
  ctime timestamp with time zone not null,
  state text not null
);

insert into state (ctime, state) values
  ('2019-05-01 03:58:40+00', 'Busy')
, ('2019-05-01 03:58:42+00', 'Ready')
, ('2019-05-01 04:00:14+00', 'Busy')
, ('2019-05-01 04:16:26+00', 'Ready')
, ('2019-05-01 04:16:36+00', 'Busy')
;

And the other one contains user’s actions.

create table action (
  id serial primary key,
  ctime timestamp with time zone not null,
  action text not null
);

insert into action (ctime, action) values
  ('2019-05-01 03:58:42+00', 'vasah')
, ('2019-05-01 03:58:42+00', 'mituh')
, ('2019-05-01 04:00:14+00', 'jumuf')
, ('2019-05-01 04:00:16+00', 'vibaj')
, ('2019-05-01 04:00:16+00', 'sasij')
, ('2019-05-01 04:16:21+00', 'husih')
, ('2019-05-01 04:16:26+00', 'radod')
, ('2019-05-01 04:16:30+00', 'zadub')
, ('2019-05-01 04:16:35+00', 'mimoh')
, ('2019-05-01 04:16:36+00', 'rimoh')
, ('2019-05-01 04:16:37+00', 'zahuf')
, ('2019-05-01 04:16:37+00', 'fisak')
;

It is easy to union these tables and visually see in which state an action was performed.

select *
from (
  select ctime, state, null from state
   union all
     select ctime, null, action from action) x
order by ctime;

Output:

 2019-05-01 06:58:40+03 | Busy  |
 2019-05-01 06:58:42+03 |       | mituh
 2019-05-01 06:58:42+03 |       | vasah
 2019-05-01 06:58:42+03 | Ready |
 2019-05-01 07:00:14+03 |       | jumuf
 2019-05-01 07:00:14+03 | Busy  |
 2019-05-01 07:00:16+03 |       | vibaj
 2019-05-01 07:00:16+03 |       | sasij
 2019-05-01 07:16:21+03 |       | husih
 2019-05-01 07:16:26+03 | Ready |
 2019-05-01 07:16:26+03 |       | radod
 2019-05-01 07:16:30+03 |       | zadub
 2019-05-01 07:16:35+03 |       | mimoh
 2019-05-01 07:16:36+03 | Busy  |
 2019-05-01 07:16:36+03 |       | rimoh
 2019-05-01 07:16:37+03 |       | zahuf
 2019-05-01 07:16:37+03 |       | fisak

How can I “fill the blanks” so the output will be like this?

 2019-05-01 06:58:40+03 | Busy  |
 2019-05-01 06:58:42+03 | Busy  | mituh
 2019-05-01 06:58:42+03 | Busy  | vasah
 2019-05-01 06:58:42+03 | Ready |
 2019-05-01 07:00:14+03 | Ready | jumuf
 2019-05-01 07:00:14+03 | Busy  |
 2019-05-01 07:00:16+03 | Busy  | vibaj
 2019-05-01 07:00:16+03 | Busy  | sasij
 2019-05-01 07:16:21+03 | Busy  | husih
 2019-05-01 07:16:26+03 | Ready |
 2019-05-01 07:16:26+03 | Ready | radod
 2019-05-01 07:16:30+03 | Ready | zadub
 2019-05-01 07:16:35+03 | Ready | mimoh
 2019-05-01 07:16:36+03 | Busy  |
 2019-05-01 07:16:36+03 | Busy  | rimoh
 2019-05-01 07:16:37+03 | Busy  | zahuf
 2019-05-01 07:16:37+03 | Busy  | fisak

Advertisement

Answer

This is a place where lag(ignore nulls) is really useful. But not available in Postgres. So you can do this in two steps. Assign groups based on state. Then spread the value:

select t.*,
       max(state) over (partition by grp) as imputed_state
from (select t.*, count(state) over (order by ctime) as grp
      from t
     ) t;

Note that t is actually your union all query. I’ve abstracted it to a single table alias to illustrate the important part of the logic.

Also, you have ties in your data (rows with the same time). That means that the answer is indeterminate — ties can go either before or after the state with the same time.

Here is a db<>fiddle.

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement