I am calculating Age of a user based on his date of birth.
select UserId, (Convert(int,Convert(char(8),GETDATE(),112))-Convert(char(8),[DateOfBirth],112))/10000 AS [Age] FROM dbo.[User]
This gives me the UserId
and his age.
Now I want to group this result.
How many users are in 30’s, How many users in 40’s and how many users in their 50’s.. need the count of users with their age groups
If the user is > 0 and less than 30 he should be grouped to 20’s
If the user is >= 30 and < 40 then he should be added to 30’s list, same with 40’s and 50’s
Can this be achieved without creating any temp table?
Advertisement
Answer
I believe this will get you what you want.
Anything < 30 will be placed in the ’20’ group. Anything >= 50 will be placed in the ’50’ group.
If they are 30-39 or 40-49, they will be placed in their appropriate decade group.
SELECT y.AgeDecade, COUNT(*) FROM dbo.[User] u CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x CROSS APPLY (SELECT AgeDecade = CASE WHEN x.Age <= 29 THEN 20 WHEN x.Age BETWEEN 30 AND 39 THEN 30 WHEN x.Age BETWEEN 40 AND 49 THEN 40 WHEN x.Age >= 50 THEN 50 ELSE NULL END ) y GROUP BY y.AgeDecade
Placing the logic into CROSS APPLY
makes it easier to reuse the logic within the same query, this way you can use it in SELECT, GROUP BY, ORDER BY, WHERE, etc, without having to duplicate it. This could also be done using a cte, but in this scenario, this is my preference.
Update:
You asked in the comments how it would be possible to show a count of 0 when no people exist for an age group. In most cases the answer is simple, LEFT JOIN
. As with everything, there’s always more than one way to bake a cake.
Here are a couple ways you can do it:
The simple left join, take the query from my original answer, and just do a left join to a table. You could do this in a couple ways, CTE, temp table, table variable, sub-query, etc. The takeaway is, you need to isolate your User table somehow.
Simple Sub-query method, nothing fancy. Just stuck the whole query into a sub-query, then left joined it to our new lookup table.
DECLARE @AgeGroup TABLE (AgeGroupID tinyint NOT NULL); INSERT INTO @AgeGroup (AgeGroupID) VALUES (20),(30),(40),(50); SELECT ag.AgeGroupID, TotalCount = COUNT(a.AgeDecade) FROM @AgeGroup ag LEFT JOIN ( SELECT y.AgeDecade FROM #User u CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x CROSS APPLY (SELECT AgeDecade = CASE WHEN x.Age <= 29 THEN 20 WHEN x.Age BETWEEN 30 AND 39 THEN 30 WHEN x.Age BETWEEN 40 AND 49 THEN 40 WHEN x.Age >= 50 THEN 50 ELSE NULL END ) y ) a ON a.AgeDecade = ag.AgeGroupID GROUP BY ag.AgeGroupID;
This would be the exact same thing as using a cte:
DECLARE @AgeGroup TABLE (AgeGroupID tinyint NOT NULL); INSERT INTO @AgeGroup (AgeGroupID) VALUES (20),(30),(40),(50); WITH cte_Users AS ( SELECT y.AgeDecade FROM #User u CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x CROSS APPLY (SELECT AgeDecade = CASE WHEN x.Age <= 29 THEN 20 WHEN x.Age BETWEEN 30 AND 39 THEN 30 WHEN x.Age BETWEEN 40 AND 49 THEN 40 WHEN x.Age >= 50 THEN 50 ELSE NULL END ) y ) SELECT ag.AgeGroupID, TotalCount = COUNT(a.AgeDecade) FROM @AgeGroup ag LEFT JOIN cte_Users a ON a.AgeDecade = ag.AgeGroupID GROUP BY ag.AgeGroupID;
Choosing between the two is purely preference. There’s no performance gain to using a CTE here.
Bonus:
If you wanted to table drive your groups and also have 0 counts, you could do something like this…though I will warn you to be careful using APPLY operators because they can be tricky with performance sometimes.
IF OBJECT_ID('tempdb..#User','U') IS NOT NULL DROP TABLE #User; --SELECT * FROM #User WITH c1 AS (SELECT x.x FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) x(x)) -- 10 , c2(x) AS (SELECT 1 FROM c1 x CROSS JOIN c1 y) -- 10 * 10 SELECT UserID = IDENTITY(int,1,1) , DateOfBirth = CONVERT(date, GETDATE()-(RAND(CHECKSUM(NEWID()))*18500)) INTO #User FROM c2 u; IF OBJECT_ID('tempdb..#AgeRange','U') IS NOT NULL DROP TABLE #AgeRange; --SELECT * FROM #AgeRange CREATE TABLE #AgeRange ( AgeRangeID tinyint NOT NULL IDENTITY(1,1), RangeStart tinyint NOT NULL, RangeEnd tinyint NOT NULL, RangeLabel varchar(100) NOT NULL, ); INSERT INTO #AgeRange (RangeStart, RangeEnd, RangeLabel) VALUES ( 0, 29, '< 29') , (30, 39, '30 - 39') , (40, 49, '40 - 49') , (50, 255, '50+'); -- Using an OUTER APPLY SELECT ar.RangeLabel, COUNT(y.UserID) FROM #AgeRange ar OUTER APPLY ( SELECT u.UserID FROM #User u CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x WHERE x.Age BETWEEN ar.RangeStart AND ar.RangeEnd ) y GROUP BY ar.RangeLabel, ar.RangeStart ORDER BY ar.RangeStart; -- Using a CTE WITH cte_users AS ( SELECT u.UserID , Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000 FROM #User u ) SELECT ar.RangeLabel, COUNT(u.UserID) FROM #AgeRange ar LEFT JOIN cte_users u ON u.Age BETWEEN ar.RangeStart AND ar.RangeEnd GROUP BY ar.RangeStart, ar.RangeLabel ORDER BY ar.RangeStart;