Skip to content
Advertisement

aws athena SQL query; get unique count of column2 for each unique column1

I am fairly new to SQL queries, and am working with querying an aws athena database.

My first query will return every unique brand given some parameters:

-- query1
SELECT DISTINCT brand
FROM "database" 
WHERE product_gap = 'yes' AND store_name = 'petco'

Returning a list like this:

#   brand
1   Drs. Doink and Foink
2   Zkukit!
3   Willow
4   Freshz
5   Aquatic Land
6   UniLaws
7   Seraz
.....

And so on and so on for 553 rows. If I pick any of the brands, such as ‘Seraz’ and run this second query, it will return that distinct count of that brand’s ‘merch1’ column

-- query2
SELECT COUNT (DISTINCT merch1)
FROM "database" 
WHERE product_gap = 'yes' AND store_name = 'petco' AND brand='Seraz'

Which returns:

#   _col0
1   2

Where _col0 is the distinct count of merch1 for seraz.

My question is how can I combine my queries, so that for every unique brand result from query1, it runs query2, and presents the unique merch1 count in the table like so:

#   brand                 merch1_distinct_count
1   Drs. Doink and Foink  2
2   Zkukit!               1
3   Willow                1
4   Freshz                1
5   Aquatic Land          1
6   UniLaws               3
7   Seraz                 2
.....

Advertisement

Answer

use group by with brand column

SELECT brand, COUNT(DISTINCT merch1) merch1_distinct_count
FROM "database" 
WHERE product_gap = 'yes' AND store_name = 'petco'
group by brand
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement