Skip to content
Advertisement

How to GROUP BY first column and choose string from second one depends on number in third one?

I have a problem with GROUP BY one column and choose second column that is string depends on Count number from column three.

So I have a table with ID’s in column one, string in column two and Count in column three. I have ordered that by ID’s and Count descending.

Most of the ID’s are unique but sometimes id’s occurs more than once. In this case I would like to choose only string with bigger count number. How can I do that?

SELECT id, string, count
FROM ...
ORDER BY id, count DESC

Advertisement

Answer

In BigQuery, you can use aggregation:

select array_agg(t order by count desc limit 1)[ordinal(1)].*
from t
group by id;

What this does is construct an array of the full records for each id. But this array is ordered by the largest count first — and only the first element of the array is used. The [ordinal(1)].* is just a convenient way to return the record fields as separate columns.

The more canonical method in SQL would be:

select t.* except (seqnum)
from (select t.*,
             row_number() over (partition by id order by count desc) as seqnum
      from t
     ) t
where seqnum = 1;
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement