Skip to content
Advertisement

SQL. Get samples of data in group query

I have a simple table with only 2 columns:

| name | domain |

I need a SQL query to get this data:

| domain | names count | sample name 1 | ...2 | ...3 | ...4 | ...5 |

So, the first part is simple, just grouping by domain and getting names count. But the second part, where I need to take 5 samples (they can be random) of names in the group – I have no idea how to resolve it.

Advertisement

Answer

Below is for BigQuery Standard SQL

#standardSQL
SELECT domain, names_count,
  samples[OFFSET(0)] AS sample_name_1,
  samples[SAFE_OFFSET(1)] AS sample_name_2,
  samples[SAFE_OFFSET(2)] AS sample_name_3,
  samples[SAFE_OFFSET(3)] AS sample_name_4,
  samples[SAFE_OFFSET(4)] AS sample_name_5
FROM (
  SELECT domain, 
    COUNT(name) names_count,
    ARRAY_AGG(name ORDER BY RAND() LIMIT 5) samples
  FROM `project.dataset.table`
  GROUP BY domain
)
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement