Skip to content
Advertisement

BigQuery – JOIN on two tables using string and array

We have two tables in BigQuery like below:

Table A

 Name | Question  | Answer
 -----+-----------+-------
 Bob  | Interest  | a
 Bob  | Interest  | b
 Sue  | Interest  | a
 Sue  | Interest  | c
 Joe  | Interest  | a
 Joe  | Interest  | b
 Joe  | Interest  | c
 Joe  | Interest  | d

Table B (Static)

           Interests                        |   Segment
--------------------------------------------+------------------
["a"]                                       |   S1
["a","b"]                                   |   S2 
["a", "b", "c", "d"]                        |   S3

Expected table

 User | Question  | Answer
 -----+-----------+-------
 Bob  | Interest  | a
 Bob  | Interest  | b
 Sue  | Interest  | a
 Sue  | Interest  | c
 Joe  | Interest  | a
 Joe  | Interest  | b
 Joe  | Interest  | c
 Joe  | Interest  | d
          (+)
 Bob  | Segment   | S1
 Bob  | Segment   | S2
 Sue  | Segment   | S1
 Joe  | Segment   | S1
 Joe  | Segment   | S2
 Joe  | Segment   | S3 

In the above tables, the Answer field is of string type and Interests is of array type.

Pointers:

  1. One user can have one or more interests.
  2. One or more interests can belong to one segment.
  3. A user will be assigned to a segment only when all of his interests are matched.

Any inputs/thoughts in the right direction would be greatly appreciated.

Advertisement

Answer

Below is for BigQuery Standard SQL

#standardSQL
select name, question, answer from `project.dataset.tableA`
union all
select name, 'Segment', segment
from (
  select 
    name, 'Segment', segment,
    ( select countif(y is null)
      from b.interest x
      left join a.answers y
      on x = y
    ) = 0 qualified
  from (
    select name, array_agg(answer) answers
    from `project.dataset.tableA`
    group by name
  ) a, `project.dataset.tableB` b
)
where qualified    

if to apply to sample data from your question – output is

enter image description here

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement