I am fairly new to SQL queries, and am working with querying an aws athena database.
My first query will return every unique brand given some parameters:
-- query1 SELECT DISTINCT brand FROM "database" WHERE product_gap = 'yes' AND store_name = 'petco'
Returning a list like this:
# brand 1 Drs. Doink and Foink 2 Zkukit! 3 Willow 4 Freshz 5 Aquatic Land 6 UniLaws 7 Seraz .....
And so on and so on for 553 rows. If I pick any of the brands, such as ‘Seraz’ and run this second query, it will return that distinct count of that brand’s ‘merch1’ column
-- query2 SELECT COUNT (DISTINCT merch1) FROM "database" WHERE product_gap = 'yes' AND store_name = 'petco' AND brand='Seraz'
Which returns:
# _col0 1 2
Where _col0 is the distinct count of merch1 for seraz.
My question is how can I combine my queries, so that for every unique brand result from query1, it runs query2, and presents the unique merch1 count in the table like so:
# brand merch1_distinct_count 1 Drs. Doink and Foink 2 2 Zkukit! 1 3 Willow 1 4 Freshz 1 5 Aquatic Land 1 6 UniLaws 3 7 Seraz 2 .....
Advertisement
Answer
use group by
with brand column
SELECT brand, COUNT(DISTINCT merch1) merch1_distinct_count FROM "database" WHERE product_gap = 'yes' AND store_name = 'petco' group by brand