Skip to content
Advertisement

Join the tables one to many but duplicate the records bigquery

i’m working on bigquery. I have two tables: one for sites and one for site logs. I want to make a query that selects all the sites (without duplicating them) and tells me the last status by date. I also want to know what answer team 1 or team 2 gave for each site. I do left join and duplicate everything, I already tried doing subqueries but it gives me an error. how can i solve it?

Table sites

ID  name    link
1   ff     sa.com
2   rg     knakans.com
3   g      aklsfndkl.com
4   gg     a.com
5   g      g.com

table siteLogs

ID  IDsite  Status  timestamp   responde   submittedBy
1   2       backlog     20/2    denied      team1
2   1       pending     20/3    null        team2
3   3       pending     20/4    null        team2
4   3       approved    20/5    ok          team1
5   3       submitted   20/6    ok          team2

My query

select
id, 
name, 
link, 
  FIRST_VALUE(b.status) OVER(PARTITION BY b.IDsite ORDER BY b.timestamp DESC) as lastStatus,
case when (submittedBy='team1')  then response else "" end  as  team1_response,
case when (submittedBy='team2')  then response else "" end  as  team2_response,

from sites a 
left join siteLogs b
where b.IDsite=a.ID

what I hope to get

ID  name    link           lastStatus   team1_response  team2_response
1   ff      sa.com         pending      null             ok
2   rg      knakans.com    backlog      denied           null
3   g       aklsfndkl.com  approved     ok               ok
4   gg      a.com          null         null             null
5   g       g.com          null         null             null 

Advertisement

Answer

Try below

select a.id, name, link, 
  array_agg(status order by timestamp desc limit 1)[offset(0)] lastStatus,
  array_agg(if(submittedBy = 'team1', response, null) order by if(submittedBy = 'team1', timestamp, null) desc limit 1)[offset(0)] team1_response,
  array_agg(if(submittedBy = 'team2', response, null) order by if(submittedBy = 'team2', timestamp, null) desc limit 1)[offset(0)] team2_response
from sites a
left join siteLogs b
on a.id = b.idsite
group by a.id, name, link           

if applied to sample data in your question – output is

enter image description here

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement