Grouping the result set based on conditions

Question

I am calculating Age of a user based on his date of birth. This gives me the UserId and his age. Now I want to group this result. How many users are in 30&#8217;s, How many users in 40&#8217;s and how many users in their 50&#8217;s.. need the count of users with their age groups If the user is >

Accepted Answer

I believe this will get you what you want.Anything < 30 will be placed in the ’20’ group.Anything >= 50 will be placed in the ’50’ group.If they are 30-39 or 40-49, they will be placed in their appropriate decade group.SELECT y.AgeDecade, COUNT(*)FROM dbo.[User] u CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x CROSS APPLY (SELECT AgeDecade = CASE WHEN x.Age <= 29 THEN 20 WHEN x.Age BETWEEN 30 AND 39 THEN 30 WHEN x.Age BETWEEN 40 AND 49 THEN 40 WHEN x.Age >= 50 THEN 50 ELSE NULL END ) yGROUP BY y.AgeDecadePlacing the logic into CROSS APPLY makes it easier to reuse the logic within the same query, this way you can use it in SELECT, GROUP BY, ORDER BY, WHERE, etc, without having to duplicate it. This could also be done using a cte, but in this scenario, this is my preference.Update:You asked in the comments how it would be possible to show a count of 0 when no people exist for an age group. In most cases the answer is simple, LEFT JOIN. As with everything, there’s always more than one way to bake a cake.Here are a couple ways you can do it:The simple left join, take the query from my original answer, and just do a left join to a table. You could do this in a couple ways, CTE, temp table, table variable, sub-query, etc. The takeaway is, you need to isolate your User table somehow.Simple Sub-query method, nothing fancy. Just stuck the whole query into a sub-query, then left joined it to our new lookup table.DECLARE @AgeGroup TABLE (AgeGroupID tinyint NOT NULL);INSERT INTO @AgeGroup (AgeGroupID) VALUES (20),(30),(40),(50);SELECT ag.AgeGroupID, TotalCount = COUNT(a.AgeDecade)FROM @AgeGroup ag LEFT JOIN ( SELECT y.AgeDecade FROM #User u CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x CROSS APPLY (SELECT AgeDecade = CASE WHEN x.Age <= 29 THEN 20 WHEN x.Age BETWEEN 30 AND 39 THEN 30 WHEN x.Age BETWEEN 40 AND 49 THEN 40 WHEN x.Age >= 50 THEN 50 ELSE NULL END ) y ) a ON a.AgeDecade = ag.AgeGroupIDGROUP BY ag.AgeGroupID;This would be the exact same thing as using a cte:DECLARE @AgeGroup TABLE (AgeGroupID tinyint NOT NULL);INSERT INTO @AgeGroup (AgeGroupID) VALUES (20),(30),(40),(50);WITH cte_Users AS ( SELECT y.AgeDecade FROM #User u CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x CROSS APPLY (SELECT AgeDecade = CASE WHEN x.Age <= 29 THEN 20 WHEN x.Age BETWEEN 30 AND 39 THEN 30 WHEN x.Age BETWEEN 40 AND 49 THEN 40 WHEN x.Age >= 50 THEN 50 ELSE NULL END ) y)SELECT ag.AgeGroupID, TotalCount = COUNT(a.AgeDecade)FROM @AgeGroup ag LEFT JOIN cte_Users a ON a.AgeDecade = ag.AgeGroupIDGROUP BY ag.AgeGroupID;Choosing between the two is purely preference. There’s no performance gain to using a CTE here.Bonus:If you wanted to table drive your groups and also have 0 counts, you could do something like this…though I will warn you to be careful using APPLY operators because they can be tricky with performance sometimes.IF OBJECT_ID('tempdb..#User','U') IS NOT NULL DROP TABLE #User; --SELECT * FROM #UserWITH c1 AS (SELECT x.x FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) x(x)) -- 10 , c2(x) AS (SELECT 1 FROM c1 x CROSS JOIN c1 y) -- 10 * 10SELECT UserID = IDENTITY(int,1,1) , DateOfBirth = CONVERT(date, GETDATE()-(RAND(CHECKSUM(NEWID()))*18500))INTO #UserFROM c2 u;IF OBJECT_ID('tempdb..#AgeRange','U') IS NOT NULL DROP TABLE #AgeRange; --SELECT * FROM #AgeRangeCREATE TABLE #AgeRange ( AgeRangeID tinyint NOT NULL IDENTITY(1,1), RangeStart tinyint NOT NULL, RangeEnd tinyint NOT NULL, RangeLabel varchar(100) NOT NULL,);INSERT INTO #AgeRange (RangeStart, RangeEnd, RangeLabel)VALUES ( 0, 29, '< 29') , (30, 39, '30 - 39') , (40, 49, '40 - 49') , (50, 255, '50+');-- Using an OUTER APPLYSELECT ar.RangeLabel, COUNT(y.UserID)FROM #AgeRange ar OUTER APPLY ( SELECT u.UserID FROM #User u CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x WHERE x.Age BETWEEN ar.RangeStart AND ar.RangeEnd ) yGROUP BY ar.RangeLabel, ar.RangeStartORDER BY ar.RangeStart;-- Using a CTEWITH cte_users AS ( SELECT u.UserID , Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000 FROM #User u)SELECT ar.RangeLabel, COUNT(u.UserID)FROM #AgeRange ar LEFT JOIN cte_users u ON u.Age BETWEEN ar.RangeStart AND ar.RangeEndGROUP BY ar.RangeStart, ar.RangeLabelORDER BY ar.RangeStart;

Advertisement

Answer

Update:

Bonus: