Skip to content
Advertisement

Using Group By as part of a where clause

I’m trying to eliminate certain records from a dataset using SQL Server. The title of my post may be inaccurate, as a better solution may exist than what I have in mind.

In my query, I am selecting from Table A, and the rows that I want to end up with should meet the following criteria:

  1. All rows where A.ItemNumber = B.ItemNumber
  2. All rows where A.ItemNumber <> B.ItemNumber AND that row’s Task value does not have another row that meets criteria #1.

So for the below example:

  1. Gives us ItemNumber 102, 104, 106 rows.

  2. Gives us ItemNumber 105 row. 100, 101 are removed from dataset because their Task (1) is associated with Table B at ItemNumber 102. Same for 103 with Task (2) being associated at ItemNumber 104.

Table A
Task    ItemNumber
1          100
1          101
1          102
2          103
2          104
3          105
4          106

Table B
ItemNumber       Data
102              aaa
104              bbb
106              ccc

My initial thought was to load Table A into a temp table, LEFT JOIN with Table B, and DELETE FROM {temp table} WHERE (data IS NULL AND {insert some kind of grouping logic here}). But I have been completely unable to figure out a grouping logic that will work for the problem. I spent the weekend hoping a solution would come to me, but am now giving in and seeking advice.

Advertisement

Answer

With a CTE that meets the 1st condition and UNION ALL to return the rest of the rows:

with cte as (
  select a.*
  from TableA a
  where exists (select 1 from TableB where ItemNumber = a.ItemNumber)
)
select * from cte
union all
select a.* from TableA a
where not exists (select 1 from cte where Task = a.Task)
order by Task

See the demo.
Results:

Task    ItemNumber
1       102
2       104
3       105
4       106
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement