I have a large dataset in SQL Server that contains two date fields and a foreign key field (among others):
+----+---------+-------------+-------------+ | id | type_id | report_date | import_date | +----+---------+-------------+-------------+ | 1 | 1 | 2021-08-01 | 2021-08-02 | | 2 | 1 | 2021-08-01 | 2021-08-02 | | 3 | 2 | 2021-08-01 | 2021-08-02 | | 4 | 2 | 2021-08-04 | 2021-08-05 | | 5 | 1 | 2021-08-04 | 2021-08-05 | | 6 | 3 | 2021-08-04 | 2021-08-05 | | 7 | 2 | 2021-08-04 | 2021-08-04 | +----+---------+-------------+-------------+
I need a query that can count the number of rows for each day, but to count them for each distinct value in the type_id
column. The closest I can wrap my brain around right now is returning the total count of all rows for a particular date:
select count(REPORT_DATE) as records, REPORT_DATE as report_date from MY_TABLE group by REPORT_DATE;
How do I split this up to return an additional column for each value of type_id
? For example:
+------------+---------+---------+---------+ | date | count_1 | count_2 | count_3 | +------------+---------+---------+---------+ | 2021-08-01 | 2 | 1 | 0 | | 2021-08-04 | 1 | 2 | 1 | +------------+---------+---------+---------+
I assume I will need at least one subquery, but that is beyond my current knowledge.
Advertisement
Answer
You can use a PIVOT
or a conditional aggregate.
PIVOT:
;WITH src AS ( SELECT report_date, type_id, c = COUNT(*) FROM dbo.MY_TABLE GROUP BY report_date, type_id ) SELECT report_date, count_1 = COALESCE([1], 0), count_2 = COALESCE([2], 0), count_3 = COALESCE([3], 0) FROM src PIVOT (MAX(c) FOR type_id IN ([1],[2],[3])) AS p;
Conditional aggregate:
SELECT report_date, count_1 = SUM(CASE WHEN type_id = 1 THEN 1 ELSE 0 END), count_2 = SUM(CASE WHEN type_id = 2 THEN 1 ELSE 0 END), count_3 = SUM(CASE WHEN type_id = 3 THEN 1 ELSE 0 END) FROM dbo.MY_TABLE GROUP BY report_date;
Example db<>fiddle
Use whichever one is more intuitive for you to learn and understand, though I will say conditional aggregate is certainly easier to maintain when, say, type_id = 4
shows up later, as @jarlh suggested in a comment.