I have two tables: Budget Line
and Expense
. They are structured in such a way that an expense must have either a parent record in the budget line table, or a parent record in the expense table. I need to select all child-most expenses for each budget line.
For example – BudgetLine
:
Id | Description |
---|---|
1 | TEST123 |
2 | OTHERTEST |
Expense
:
Id | ParentId | ParentType | Description |
---|---|---|---|
1 | 1 | BudgetLine | Group of Expenses |
2 | 1 | Expense | Expense # 1 |
3 | 1 | Expense | Expense # 2 |
4 | 2 | BudgetLine | Expense 3 |
Desired result:
BudgetLineId | ExpenseId | Description |
---|---|---|
1 | 2 | Expense # 1 |
1 | 3 | Expense # 2 |
2 | 4 | Expense # 3 |
I am looking to omit expenses in the result only if they are the only sub-child. Note that an expense may have many children, grandchildren, etc.
I have tried the following, and researching various recursive CTE methods:
WITH RCTE AS ( SELECT Expense.Id, Expense.ParentId, Expense.ParentType, 1 AS Lvl, Expense.Id as startId FROM Expense UNION ALL SELECT rh.Id, rh.ParentId, rh.ParentType, Lvl+1 AS Lvl, rc.Id as startId FROM dbo.Expense rh INNER JOIN RCTE rc ON rh.Id = rc.ParentId and rc.ParentType = 'Expense' ), FilteredRCTE AS ( SELECT startId, MAX(LVL) AS Lvl FROM RCTE GROUP BY startID ), RecursiveData AS ( SELECT FilteredRCTE.startId AS ExpenseId, RCTE.ParentId AS BudgetLineId FROM FilteredRCTE JOIN RCTE ON FilteredRCTE.startId = RCTE.startId AND FilteredRCTE.Lvl = RCTE.Lvl ) SELECT * FROM RecursiveData
Which did in-fact obtain all the child Expenses and their associated parent BudgetLine, but it also included the middle-tier expenses (such as item 1 in the example) and I cannot figure out how to filter those middle-tier items out.
Here is a script to create tables / insert sample data:
CREATE TABLE [dbo].[BudgetLine] ( [Id] [int] IDENTITY(1,1) NOT NULL, [Description] [varchar](500) NULL, ) ON [PRIMARY] GO INSERT INTO dbo.BudgetLine VALUES ('TEST123') INSERT INTO dbo.BudgetLine VALUES ('OTHERTEST') CREATE TABLE [dbo].[Expense] ( [Id] [int] IDENTITY(1,1) NOT NULL, [ParentId] [int] NOT NULL, [ParentType] [varchar](100) NOT NULL, [Description] [varchar](max) NULL, ) ON [PRIMARY] GO INSERT INTO dbo.Expense VALUES ('1', 'BudgetLine', 'Group of Expenses') INSERT INTO dbo.Expense VALUES ('1', 'Expense', 'Expense # 1') INSERT INTO dbo.Expense VALUES ('1', 'Expense', 'Expense # 2') INSERT INTO dbo.Expense VALUES ('2', 'BudgetLine', 'Expense # 3')
Advertisement
Answer
Maybe I have oversimplified, but the following returns your desired results, by checking that there is no other expense row connected to the current row.
WITH RCTE AS ( SELECT E.Id ExpenseId, E.ParentId, E.ParentType FROM #Expense E UNION ALL SELECT RH.Id, RH.ParentId, RH.ParentType FROM #Expense RH INNER JOIN RCTE RC ON RH.Id = RC.ParentId AND RC.ParentType = 'Expense' ) SELECT * FROM RCTE R1 WHERE NOT EXISTS ( SELECT 1 FROM RCTE R2 WHERE R2.ParentId = R1.ExpenseId AND R2.ParentType = 'Expense' );