Skip to content
Advertisement

How to JOIN two table in BigQuery, but the join parameter are nested

sorry for the newbie questions, just started learning SQL. I have two tables:

  1. sessions
  2. items

sessions table has questions (RECORD, Repeated), and inside questions there’s item_id (String)

items table has topics (RECORD, Repeated), and inside topics there’s prior_difficulty (String). items table also has item_id (String)

My objective is to get a list of sessions and its prior_difficulty, by joining the two tables with their item_id. TIA

Advertisement

Answer

You can first use the unnest() function to retrieve all the item_ids from the sessions table and then join them with the item_id from the items table.
To retrieve the prior_difficulty from your struct column topics, you can also use the unnest() function :

select distinct
  sessions.session_id,
  t.prior_difficulty
from sessions, unnest(questions) q
left join items on q.item_id = items.item_id, unnest(topics) t

or if you want to create a repeated record column to group prior_difficulty values by session_id :

select
  sessions.session_id,
  array_agg(distinct t.prior_difficulty ignore nulls) as prior_difficulties
from sessions, unnest(questions) q
left join items on q.item_id = items.item_id, unnest(topics) t
group by 1
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement