Skip to content
Advertisement

Grouping the result set based on conditions

I am calculating Age of a user based on his date of birth.

select UserId, (Convert(int,Convert(char(8),GETDATE(),112))-Convert(char(8),[DateOfBirth],112))/10000 AS [Age] FROM dbo.[User]

This gives me the UserId and his age.

Now I want to group this result.

How many users are in 30’s, How many users in 40’s and how many users in their 50’s.. need the count of users with their age groups

If the user is > 0 and less than 30 he should be grouped to 20’s

If the user is >= 30 and < 40 then he should be added to 30’s list, same with 40’s and 50’s

Can this be achieved without creating any temp table?

Advertisement

Answer

I believe this will get you what you want.

Anything < 30 will be placed in the ’20’ group. Anything >= 50 will be placed in the ’50’ group.

If they are 30-39 or 40-49, they will be placed in their appropriate decade group.

SELECT y.AgeDecade, COUNT(*)
FROM dbo.[User] u
    CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x
    CROSS APPLY (SELECT AgeDecade = CASE
                                        WHEN x.Age <= 29 THEN 20
                                        WHEN x.Age BETWEEN 30 AND 39 THEN 30
                                        WHEN x.Age BETWEEN 40 AND 49 THEN 40
                                        WHEN x.Age >= 50 THEN 50
                                        ELSE NULL
                                    END
    ) y
GROUP BY y.AgeDecade

Placing the logic into CROSS APPLY makes it easier to reuse the logic within the same query, this way you can use it in SELECT, GROUP BY, ORDER BY, WHERE, etc, without having to duplicate it. This could also be done using a cte, but in this scenario, this is my preference.


Update:

You asked in the comments how it would be possible to show a count of 0 when no people exist for an age group. In most cases the answer is simple, LEFT JOIN. As with everything, there’s always more than one way to bake a cake.

Here are a couple ways you can do it:

The simple left join, take the query from my original answer, and just do a left join to a table. You could do this in a couple ways, CTE, temp table, table variable, sub-query, etc. The takeaway is, you need to isolate your User table somehow.

Simple Sub-query method, nothing fancy. Just stuck the whole query into a sub-query, then left joined it to our new lookup table.

DECLARE @AgeGroup TABLE (AgeGroupID tinyint NOT NULL);
INSERT INTO @AgeGroup (AgeGroupID) VALUES (20),(30),(40),(50);

SELECT ag.AgeGroupID, TotalCount = COUNT(a.AgeDecade)
FROM @AgeGroup ag
    LEFT JOIN (
        SELECT y.AgeDecade
        FROM #User u
            CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x
            CROSS APPLY (SELECT AgeDecade = CASE
                                                WHEN x.Age <= 29 THEN 20
                                                WHEN x.Age BETWEEN 30 AND 39 THEN 30
                                                WHEN x.Age BETWEEN 40 AND 49 THEN 40
                                                WHEN x.Age >= 50 THEN 50
                                                ELSE NULL
                                            END
            ) y
    ) a ON a.AgeDecade = ag.AgeGroupID
GROUP BY ag.AgeGroupID;

This would be the exact same thing as using a cte:

DECLARE @AgeGroup TABLE (AgeGroupID tinyint NOT NULL);
INSERT INTO @AgeGroup (AgeGroupID) VALUES (20),(30),(40),(50);

WITH cte_Users AS (
    SELECT y.AgeDecade
    FROM #User u
        CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x
        CROSS APPLY (SELECT AgeDecade = CASE
                                            WHEN x.Age <= 29 THEN 20
                                            WHEN x.Age BETWEEN 30 AND 39 THEN 30
                                            WHEN x.Age BETWEEN 40 AND 49 THEN 40
                                            WHEN x.Age >= 50 THEN 50
                                            ELSE NULL
                                        END
        ) y
)
SELECT ag.AgeGroupID, TotalCount = COUNT(a.AgeDecade)
FROM @AgeGroup ag
    LEFT JOIN cte_Users a ON a.AgeDecade = ag.AgeGroupID
GROUP BY ag.AgeGroupID;

Choosing between the two is purely preference. There’s no performance gain to using a CTE here.


Bonus:

If you wanted to table drive your groups and also have 0 counts, you could do something like this…though I will warn you to be careful using APPLY operators because they can be tricky with performance sometimes.

IF OBJECT_ID('tempdb..#User','U') IS NOT NULL DROP TABLE #User; --SELECT * FROM #User
WITH c1 AS (SELECT x.x FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) x(x))       -- 10
    , c2(x) AS (SELECT 1 FROM c1 x CROSS JOIN c1 y)                                     -- 10 * 10
SELECT UserID = IDENTITY(int,1,1)
    , DateOfBirth = CONVERT(date, GETDATE()-(RAND(CHECKSUM(NEWID()))*18500))
INTO #User
FROM c2 u;

IF OBJECT_ID('tempdb..#AgeRange','U') IS NOT NULL DROP TABLE #AgeRange; --SELECT * FROM #AgeRange
CREATE TABLE #AgeRange (
    AgeRangeID  tinyint         NOT NULL IDENTITY(1,1),
    RangeStart  tinyint         NOT NULL,
    RangeEnd    tinyint         NOT NULL,
    RangeLabel  varchar(100)    NOT NULL,
);

INSERT INTO #AgeRange (RangeStart, RangeEnd, RangeLabel)
VALUES ( 0,  29, '< 29')
    ,  (30,  39, '30 - 39')
    ,  (40,  49, '40 - 49')
    ,  (50, 255, '50+');

-- Using an OUTER APPLY
SELECT ar.RangeLabel, COUNT(y.UserID)
FROM #AgeRange ar
    OUTER APPLY (
        SELECT u.UserID
        FROM #User u
            CROSS APPLY (SELECT Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000) x
        WHERE x.Age BETWEEN ar.RangeStart AND ar.RangeEnd
    ) y
GROUP BY ar.RangeLabel, ar.RangeStart
ORDER BY ar.RangeStart;

-- Using a CTE
WITH cte_users AS (
    SELECT u.UserID
        , Age = (CONVERT(int, CONVERT(char(8), GETDATE(), 112)) - CONVERT(int, CONVERT(char(8), u.DateOfBirth, 112))) / 10000
    FROM #User u
)
SELECT ar.RangeLabel, COUNT(u.UserID)
FROM #AgeRange ar
    LEFT JOIN cte_users u ON u.Age BETWEEN ar.RangeStart AND ar.RangeEnd
GROUP BY ar.RangeStart, ar.RangeLabel
ORDER BY ar.RangeStart;
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement