i’m working on bigquery. I have two tables: one for sites and one for site logs. I want to make a query that selects all the sites (without duplicating them) and tells me the last status by date. I also want to know what answer team 1 or team 2 gave for each site. I do left join and duplicate everything, I already tried doing subqueries but it gives me an error. how can i solve it?
Table sites
ID name link 1 ff sa.com 2 rg knakans.com 3 g aklsfndkl.com 4 gg a.com 5 g g.com
table siteLogs
ID IDsite Status timestamp responde submittedBy 1 2 backlog 20/2 denied team1 2 1 pending 20/3 null team2 3 3 pending 20/4 null team2 4 3 approved 20/5 ok team1 5 3 submitted 20/6 ok team2
My query
select id, name, link, FIRST_VALUE(b.status) OVER(PARTITION BY b.IDsite ORDER BY b.timestamp DESC) as lastStatus, case when (submittedBy='team1') then response else "" end as team1_response, case when (submittedBy='team2') then response else "" end as team2_response, from sites a left join siteLogs b where b.IDsite=a.ID
what I hope to get
ID name link lastStatus team1_response team2_response 1 ff sa.com pending null ok 2 rg knakans.com backlog denied null 3 g aklsfndkl.com approved ok ok 4 gg a.com null null null 5 g g.com null null null
Advertisement
Answer
Try below
select a.id, name, link, array_agg(status order by timestamp desc limit 1)[offset(0)] lastStatus, array_agg(if(submittedBy = 'team1', response, null) order by if(submittedBy = 'team1', timestamp, null) desc limit 1)[offset(0)] team1_response, array_agg(if(submittedBy = 'team2', response, null) order by if(submittedBy = 'team2', timestamp, null) desc limit 1)[offset(0)] team2_response from sites a left join siteLogs b on a.id = b.idsite group by a.id, name, link
if applied to sample data in your question – output is