Skip to content
Advertisement

SQL All Categorized Possible Combinations

I recently had a rather interesting problem to solve where I needed to construct a list consisting of a combined selection of one choice from each of an arbitrarily defined set of categories, and an aggregated total of a value associated with each choice. The choices within each category are also arbitrarily defined.

By “arbitrarily defined”, I mean that the user creates and maintains the list of categories and the list of choices for each category and the value for each choice.

Here’s a practical example using a condos concept I made up:

Categories Table
==============================
Category Order   Category
--------------   -------------
1                Model
2                View
3                Floor Group

Category Adjustments Table
==============================
Category       Adjustment Desc.    $/Sq. Ft.
------------   -----------------   ----------
Model          2 BR-1.5 BTH        $100
Model          3 BR-2.0 BTH        $120
Model          3 BR-2.5 BTH        $130
Model          4 BR-3.0 BTH        $150
View           Ocean               $100
View           Garden              $75
View           City                $50
View           Pool                $25
Floor Group    Ground              $50
Floor Group    Middle Floors       $25
Floor Group    Penthouse           $100

The resulting output of the SQL would contain 48 rows (4 models x 4 views x 3 floor groups) and would look somewhat like this:

SQL Output Results
===========================================
Combined Choices                   Total $/Sq. Ft.
--------------------------------   ----------------
2 BR-1.5 BTH + Ocean + Ground      $250
...(46 other combinations)...
4 BR-3.0 BTH + Pool + Penthouse    $275

I felt that this problem should be able to be solved using only a SQL statement without dynamically constructing SQL and without using complex logic built into a stored procedure or cursors or anything like that.

Looking at the problem itself, I realized that some form of recursion would likely be necessary, but I didn’t know where else to start. I explored table pivoting, dynamic SQL, using cursors, and all those options that my gut was telling me shouldn’t be necessary.

It turns out my instincts were right, and I eventually came up with a SQL only solution which I will post as an answer below. But because it was such a daunting problem (for me at least), I thought I’d post both the problem and solution on Stack Overflow. It also gives others the opportunity to post their own solution, or perhaps explore variations of the problem.

Advertisement

Answer

For Microsoft T-SQL, the solution comes from the use of a recursive common table expression and limiting the output to those results that used all of the available categories:

------------------------------------------------------------------------------------------
---- DATA SAMPLE SETUP -----
------------------------------------------------------------------------------------------
declare @data_categories table
(
    category_order int not null,
    category varchar(50) not null,
    primary key (category_order)
)

declare @data_price_adjustments table
(
    category varchar(50) not null,
    adjustment_desc varchar(50) not null,
    price_per_sq_ft numeric(10,0),
    primary key (category, adjustment_desc)
    -- If this were a real table, the below would also exist.
    --constraint cfk_data_price_adjustments_category
    --  foreign key (category) 
    --  references [@data_categories] (category)
)

insert into @data_categories (category_order, category)
values
(1, 'Model'),
(2, 'View'),
(3, 'Floor Group')

insert into @data_price_adjustments
(category, adjustment_desc, price_per_sq_ft)
values
('Model', '2 BR-1.5 BTH', 100),
('Model', '3 BR-2.0 BTH', 120),
('Model', '3 BR-2.5 BTH', 130),
('Model', '4 BR-3.0 BTH', 150),
('View', 'Ocean', 100),
('View', 'Garden', 75),
('View', 'City', 50),
('View', 'Pool', 25),
('Floor Group', 'Ground', 50),
('Floor Group', 'Middle Floors', 25),
('Floor Group', 'Penthouse', 100)


------------------------------------------------------------------------------------------
---- PROBLEM SOLUTION -----
------------------------------------------------------------------------------------------
;with cte (combined_options, price_per_sq_ft, category_order, categories_used) 
as 
(
    select 
        convert(varchar(max), adj.adjustment_desc), 
        convert(numeric(10,0), adj.price_per_sq_ft), 
        cat.category_order,
        convert(int, 1)
    from @data_price_adjustments as adj
    join @data_categories as cat on cat.category = adj.category
    union all
    select 
        convert(varchar(max), 
            concat(convert(varchar(max), adj.adjustment_desc), ' + ', cte.combined_options)
        ),
        convert(numeric(10, 0), 
            adj.price_per_sq_ft + cte.price_per_sq_ft
        ),
        cat.category_order,
        convert(int, 
            1 + cte.categories_used
        )
    from @data_price_adjustments as adj
    join @data_categories as cat on cat.category = adj.category
    join cte on 
        cat.category_order < cte.category_order
)
SELECT 
    combined_options, price_per_sq_ft --, category_order, categories_used
FROM cte
where categories_used = (select count(*) from @data_categories)
order by combined_options
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement