I have a query as an example below, but its running forever .. If I do union then I have copy entire large query .. Any help on this fix
select columns from table a join (select * from ( select *, row_number() over( partition BY id ORDER BY date ) row_seq from table b ) where row_seq = 1 ) o ON a.id = o.id OR a.id = o.src_id left join table p on a.partner_c = p.id left join table pp on a.parnt_id = pp.id
Advertisement
Answer
The inner part
select * from ( select *, row_number() over( partition BY id ORDER BY date ) row_seq from table b ) where row_seq = 1
can be written using the QUALIFY clause like so:
select *, from table b qualify row_number() over( partition BY id ORDER BY date ) = 1
which can be pushed into a CTE like:
WITH raw_rows AS ( select *, from table b qualify row_number() over( partition BY id ORDER BY date ) = 1 ) select columns from table a join raw_rows o ON (a.id = o.id OR a.id = o.src_id) left join table p on a.partner_c = p.id left join table pp on a.parnt_id = pp.id
which means you can union that whole thing or just the table a
and raw_rows o
WITH raw_rows AS ( select *, from table b qualify row_number() over( partition BY id ORDER BY date ) = 1 ), a_and_o AS ( select a.columns ,o.columns from table a join raw_rows o ON a.id = o.id UNION ALL select a.columns ,o.columns from table a join raw_rows o ON a.id = o.src_id ) select columns from a_and_o a left join table p on a.partner_c = p.id left join table pp on a.parnt_id = pp.id
but this will give you a slightly different result if o.src_id == o.id
for any row, as it will match both verse the OR vesion will have those lines once. You can pay the GROUP BY cost by swapping to UNION to de-dup, but then if you have many lines you want to keep those will be lost. Or you can change second half of the union to not included those line with
join raw_rows o ON a.id = o.src_id AND o.src_id != o.id
depends on your data and needs.