Delete repeated data in BigQuery

Question

I am optimizing a BigQuery query that returns non-repeated data. It currently looks like this, and it works: select * from (select ROW_NUMBER() OVER (PARTITION BY id) as num, id, ...

Accepted Answer

The following is for BigQuery Standard SQL:

    #standardSQL
    SELECT AS VALUE ARRAY_AGG(t ORDER BY created_at LIMIT 1)[OFFSET(0)]
    FROM `project.dataset.NAME_TABLE` t
    WHERE created_at >= '2018-01-01'
    GROUP BY id

Instead of processing and returning all columns, you can list exactly the columns you need, as in the example below:

    #standardSQL
    SELECT AS VALUE ARRAY_AGG(STRUCT(id, created_at, operator_id, description) ORDER BY created_at LIMIT 1)[OFFSET(0)]
    FROM `project.dataset.NAME_TABLE`
    WHERE created_at >= '2018-01-01'
    GROUP BY id
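To make the logic of the queries above easier to follow, here is a rough plain-Python emulation of what they compute: filter by date, group by id, and keep only the row with the earliest created_at in each group. This is only a sketch of the semantics, not how BigQuery executes the query. The sample rows and the helper name dedupe_earliest are hypothetical, invented for illustration; the column names are taken from the second query.

```python
from datetime import date

# Hypothetical sample rows (not from the original question).
rows = [
    {"id": 1, "created_at": date(2018, 3, 1), "operator_id": 10, "description": "later duplicate"},
    {"id": 1, "created_at": date(2018, 1, 15), "operator_id": 11, "description": "earliest for id 1"},
    {"id": 2, "created_at": date(2017, 12, 31), "operator_id": 12, "description": "before cutoff"},
    {"id": 2, "created_at": date(2018, 2, 1), "operator_id": 13, "description": "only match for id 2"},
]


def dedupe_earliest(rows, cutoff):
    """Keep one row per id: the earliest created_at on or after cutoff."""
    earliest = {}
    for row in rows:
        # WHERE created_at >= cutoff
        if row["created_at"] < cutoff:
            continue
        # GROUP BY id
        kept = earliest.get(row["id"])
        # ARRAY_AGG(... ORDER BY created_at LIMIT 1)[OFFSET(0)]
        if kept is None or row["created_at"] < kept["created_at"]:
            earliest[row["id"]] = row
    return list(earliest.values())


result = dedupe_earliest(rows, date(2018, 1, 1))
for row in result:
    print(row["id"], row["created_at"], row["description"])
```

One difference to keep in mind: when two rows with the same id share the same created_at, this sketch keeps the first one it sees, whereas the SQL version does not guarantee which of the tied rows is returned.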

Answer