Missing rows in full outer join

Question

I am trying to count how many users are observed on each of the 3 consecutive days. Each of the 3 intermediate tables (t0, t1, t2) has 2 columns: uid (unique ID) and d0 (or d1 or d2, which is 1 and indicates that the user is observed on that day). The following query: produces this output from spark.sql(q).to…

Accepted Answer

Regarding the original query the last FULL JOIN should take into account t0.uid could be null due to the first FULL JOIN so it must be OR not AND.select d0,d1,d2, count(*) as user_count from (   select uid, 1 as d0   from my_table   where day=5 and uid is not Null   group by uid) as t0 full outer join (   select uid, 1 as d1   from my_table   where day=6 and uid is not Null   group by uid) as t1 on t0.uid = t1.uidfull outer join (   select uid, 1 as d2   from my_table   where day=7 and uid is not Null   group by uid) as t2 on t0.uid = t2.uid or t1.uid = t2.uidgroup by d0,d1,d2 order by d0,d1,d2;SQL Server db<>fiddlePersonally I would stick with the Gordon Linoff&#8217;s solution.

Advertisement

Answer