I recently had a rather interesting problem to solve where I needed to construct a list consisting of a combined selection of one choice from each of an arbitrarily defined set of categories, and an aggregated total of a value associated with each choice. The choices within each category are also arbitrarily defined.
By “arbitrarily defined”, I mean that the user creates and maintains the list of categories and the list of choices for each category and the value for each choice.
Here’s a practical example using a condos concept I made up:
Categories Table ============================== Category Order Category -------------- ------------- 1 Model 2 View 3 Floor Group Category Adjustments Table ============================== Category Adjustment Desc. $/Sq. Ft. ------------ ----------------- ---------- Model 2 BR-1.5 BTH $100 Model 3 BR-2.0 BTH $120 Model 3 BR-2.5 BTH $130 Model 4 BR-3.0 BTH $150 View Ocean $100 View Garden $75 View City $50 View Pool $25 Floor Group Ground $50 Floor Group Middle Floors $25 Floor Group Penthouse $100
The resulting output of the SQL would contain 48 rows (4 models x 4 views x 3 floor groups) and would look somewhat like this:
SQL Output Results =========================================== Combined Choices Total $/Sq. Ft. -------------------------------- ---------------- 2 BR-1.5 BTH + Ocean + Ground $250 ...(46 other combinations)... 4 BR-3.0 BTH + Pool + Penthouse $275
I felt that this problem should be able to be solved using only a SQL statement without dynamically constructing SQL and without using complex logic built into a stored procedure or cursors or anything like that.
Looking at the problem itself, I realized that some form of recursion would likely be necessary, but I didn’t know where else to start. I explored table pivoting, dynamic SQL, using cursors, and all those options that my gut was telling me shouldn’t be necessary.
It turns out my instincts were right, and I eventually came up with a SQL only solution which I will post as an answer below. But because it was such a daunting problem (for me at least), I thought I’d post both the problem and solution on Stack Overflow. It also gives others the opportunity to post their own solution, or perhaps explore variations of the problem.
Advertisement
Answer
For Microsoft T-SQL, the solution comes from the use of a recursive common table expression and limiting the output to those results that used all of the available categories:
------------------------------------------------------------------------------------------ ---- DATA SAMPLE SETUP ----- ------------------------------------------------------------------------------------------ declare @data_categories table ( category_order int not null, category varchar(50) not null, primary key (category_order) ) declare @data_price_adjustments table ( category varchar(50) not null, adjustment_desc varchar(50) not null, price_per_sq_ft numeric(10,0), primary key (category, adjustment_desc) -- If this were a real table, the below would also exist. --constraint cfk_data_price_adjustments_category -- foreign key (category) -- references [@data_categories] (category) ) insert into @data_categories (category_order, category) values (1, 'Model'), (2, 'View'), (3, 'Floor Group') insert into @data_price_adjustments (category, adjustment_desc, price_per_sq_ft) values ('Model', '2 BR-1.5 BTH', 100), ('Model', '3 BR-2.0 BTH', 120), ('Model', '3 BR-2.5 BTH', 130), ('Model', '4 BR-3.0 BTH', 150), ('View', 'Ocean', 100), ('View', 'Garden', 75), ('View', 'City', 50), ('View', 'Pool', 25), ('Floor Group', 'Ground', 50), ('Floor Group', 'Middle Floors', 25), ('Floor Group', 'Penthouse', 100) ------------------------------------------------------------------------------------------ ---- PROBLEM SOLUTION ----- ------------------------------------------------------------------------------------------ ;with cte (combined_options, price_per_sq_ft, category_order, categories_used) as ( select convert(varchar(max), adj.adjustment_desc), convert(numeric(10,0), adj.price_per_sq_ft), cat.category_order, convert(int, 1) from @data_price_adjustments as adj join @data_categories as cat on cat.category = adj.category union all select convert(varchar(max), concat(convert(varchar(max), adj.adjustment_desc), ' + ', cte.combined_options) ), convert(numeric(10, 0), adj.price_per_sq_ft + cte.price_per_sq_ft ), cat.category_order, convert(int, 1 + cte.categories_used ) from @data_price_adjustments as adj join @data_categories as cat on cat.category = adj.category join cte on cat.category_order < cte.category_order ) SELECT combined_options, price_per_sq_ft --, category_order, categories_used FROM cte where categories_used = (select count(*) from @data_categories) order by combined_options