Distribute values across subsequent rows in Postgresql

I have two tables. One contains user state transitions.

create table state (
  id serial primary key,
  ctime timestamp with time zone not null,
  state text not null
);

insert into state (ctime, state) values
  ('2019-05-01 03:58:40+00', 'Busy')
, ('2019-05-01 03:58:42+00', 'Ready')
, ('2019-05-01 04:00:14+00', 'Busy')
, ('2019-05-01 04:16:26+00', 'Ready')
, ('2019-05-01 04:16:36+00', 'Busy')
;

​x
 
create table state (  id serial primary key,  ctime timestamp with time zone not null,  state text not null);​insert into state (ctime, state) values  ('2019-05-01 03:58:40+00', 'Busy'), ('2019-05-01 03:58:42+00', 'Ready'), ('2019-05-01 04:00:14+00', 'Busy'), ('2019-05-01 04:16:26+00', 'Ready'), ('2019-05-01 04:16:36+00', 'Busy');​

And the other one contains user’s actions.

create table action (
  id serial primary key,
  ctime timestamp with time zone not null,
  action text not null
);

insert into action (ctime, action) values
  ('2019-05-01 03:58:42+00', 'vasah')
, ('2019-05-01 03:58:42+00', 'mituh')
, ('2019-05-01 04:00:14+00', 'jumuf')
, ('2019-05-01 04:00:16+00', 'vibaj')
, ('2019-05-01 04:00:16+00', 'sasij')
, ('2019-05-01 04:16:21+00', 'husih')
, ('2019-05-01 04:16:26+00', 'radod')
, ('2019-05-01 04:16:30+00', 'zadub')
, ('2019-05-01 04:16:35+00', 'mimoh')
, ('2019-05-01 04:16:36+00', 'rimoh')
, ('2019-05-01 04:16:37+00', 'zahuf')
, ('2019-05-01 04:16:37+00', 'fisak')
;

 
create table action (  id serial primary key,  ctime timestamp with time zone not null,  action text not null);​insert into action (ctime, action) values  ('2019-05-01 03:58:42+00', 'vasah'), ('2019-05-01 03:58:42+00', 'mituh'), ('2019-05-01 04:00:14+00', 'jumuf'), ('2019-05-01 04:00:16+00', 'vibaj'), ('2019-05-01 04:00:16+00', 'sasij'), ('2019-05-01 04:16:21+00', 'husih'), ('2019-05-01 04:16:26+00', 'radod'), ('2019-05-01 04:16:30+00', 'zadub'), ('2019-05-01 04:16:35+00', 'mimoh'), ('2019-05-01 04:16:36+00', 'rimoh'), ('2019-05-01 04:16:37+00', 'zahuf'), ('2019-05-01 04:16:37+00', 'fisak');​

It is easy to union these tables and visually see in which state an action was performed.

select *
from (
  select ctime, state, null from state
   union all
     select ctime, null, action from action) x
order by ctime;

 
select *from (  select ctime, state, null from state   union all     select ctime, null, action from action) xorder by ctime;​

Output:

 2019-05-01 06:58:40+03 | Busy  |
 2019-05-01 06:58:42+03 |       | mituh
 2019-05-01 06:58:42+03 |       | vasah
 2019-05-01 06:58:42+03 | Ready |
 2019-05-01 07:00:14+03 |       | jumuf
 2019-05-01 07:00:14+03 | Busy  |
 2019-05-01 07:00:16+03 |       | vibaj
 2019-05-01 07:00:16+03 |       | sasij
 2019-05-01 07:16:21+03 |       | husih
 2019-05-01 07:16:26+03 | Ready |
 2019-05-01 07:16:26+03 |       | radod
 2019-05-01 07:16:30+03 |       | zadub
 2019-05-01 07:16:35+03 |       | mimoh
 2019-05-01 07:16:36+03 | Busy  |
 2019-05-01 07:16:36+03 |       | rimoh
 2019-05-01 07:16:37+03 |       | zahuf
 2019-05-01 07:16:37+03 |       | fisak

 
 2019-05-01 06:58:40+03 | Busy  | 2019-05-01 06:58:42+03 |       | mituh 2019-05-01 06:58:42+03 |       | vasah 2019-05-01 06:58:42+03 | Ready | 2019-05-01 07:00:14+03 |       | jumuf 2019-05-01 07:00:14+03 | Busy  | 2019-05-01 07:00:16+03 |       | vibaj 2019-05-01 07:00:16+03 |       | sasij 2019-05-01 07:16:21+03 |       | husih 2019-05-01 07:16:26+03 | Ready | 2019-05-01 07:16:26+03 |       | radod 2019-05-01 07:16:30+03 |       | zadub 2019-05-01 07:16:35+03 |       | mimoh 2019-05-01 07:16:36+03 | Busy  | 2019-05-01 07:16:36+03 |       | rimoh 2019-05-01 07:16:37+03 |       | zahuf 2019-05-01 07:16:37+03 |       | fisak​

How can I “fill the blanks” so the output will be like this?

 2019-05-01 06:58:40+03 | Busy  |
 2019-05-01 06:58:42+03 | Busy  | mituh
 2019-05-01 06:58:42+03 | Busy  | vasah
 2019-05-01 06:58:42+03 | Ready |
 2019-05-01 07:00:14+03 | Ready | jumuf
 2019-05-01 07:00:14+03 | Busy  |
 2019-05-01 07:00:16+03 | Busy  | vibaj
 2019-05-01 07:00:16+03 | Busy  | sasij
 2019-05-01 07:16:21+03 | Busy  | husih
 2019-05-01 07:16:26+03 | Ready |
 2019-05-01 07:16:26+03 | Ready | radod
 2019-05-01 07:16:30+03 | Ready | zadub
 2019-05-01 07:16:35+03 | Ready | mimoh
 2019-05-01 07:16:36+03 | Busy  |
 2019-05-01 07:16:36+03 | Busy  | rimoh
 2019-05-01 07:16:37+03 | Busy  | zahuf
 2019-05-01 07:16:37+03 | Busy  | fisak

 
 2019-05-01 06:58:40+03 | Busy  | 2019-05-01 06:58:42+03 | Busy  | mituh 2019-05-01 06:58:42+03 | Busy  | vasah 2019-05-01 06:58:42+03 | Ready | 2019-05-01 07:00:14+03 | Ready | jumuf 2019-05-01 07:00:14+03 | Busy  | 2019-05-01 07:00:16+03 | Busy  | vibaj 2019-05-01 07:00:16+03 | Busy  | sasij 2019-05-01 07:16:21+03 | Busy  | husih 2019-05-01 07:16:26+03 | Ready | 2019-05-01 07:16:26+03 | Ready | radod 2019-05-01 07:16:30+03 | Ready | zadub 2019-05-01 07:16:35+03 | Ready | mimoh 2019-05-01 07:16:36+03 | Busy  | 2019-05-01 07:16:36+03 | Busy  | rimoh 2019-05-01 07:16:37+03 | Busy  | zahuf 2019-05-01 07:16:37+03 | Busy  | fisak​

Answer

This is a place where lag(ignore nulls) is really useful. But not available in Postgres. So you can do this in two steps. Assign groups based on state. Then spread the value:

select t.*,
       max(state) over (partition by grp) as imputed_state
from (select t.*, count(state) over (order by ctime) as grp
      from t
     ) t;

 
select t.*,       max(state) over (partition by grp) as imputed_statefrom (select t.*, count(state) over (order by ctime) as grp      from t     ) t;​

Note that t is actually your union all query. I’ve abstracted it to a single table alias to illustrate the important part of the logic.

Also, you have ties in your data (rows with the same time). That means that the answer is indeterminate — ties can go either before or after the state with the same time.

Here is a db<>fiddle.

Advertisement

Answer