Sql query to flter by column only if duplicate exists

I’m writing a query on data with a ‘provider_id’ column. The data is mostly unique except for some entries with all columns but ‘provider_id’ being unique. Most of the values for provider_id are 1 but when there are duplicates as described above I want to remove the row with provider_id = 1. There are only ever 2 duplicates at a time.

The query below lets me find all the duplicates but I’m not sure how to then select the correct row based off this.

select position_key, pricing_provider
from customer_data
group by position_key, pricing_provider
having count(*) > 0

​x
 
select position_key, pricing_providerfrom customer_datagroup by position_key, pricing_providerhaving count(*) > 0​

How can I select out where pricing_provider != 1?

Ex:

 position_key | pricing_provider |account | user 
152894  1   2   5
152704  1   2   1084
152678  6   2   43
152513  6   2   38
152305  1   2   1121
152305  6   2   1121
152300  6   2   1121
152300  1   2   1121

 
 position_key | pricing_provider |account | user 152894  1   2   5152704  1   2   1084152678  6   2   43152513  6   2   38152305  1   2   1121152305  6   2   1121152300  6   2   1121152300  1   2   1121​

I want:

 position_key | pricing_provider |account | user 
152894  1   2   5
152704  1   2   1084
152678  6   2   43
152513  6   2   38
152305  6   2   1121
152300  6   2   1121

 
 position_key | pricing_provider |account | user 152894  1   2   5152704  1   2   1084152678  6   2   43152513  6   2   38152305  6   2   1121152300  6   2   1121​

Answer

If you care about duplicates, I would expect > 1, not > 0. Then, window functions are probably the best solution:

select position_key, pricing_provicer
from (select position_key, pricing_provider,
             count(*) filter (where pricing_provider <> 1) over (partition by position_key) as cnt_not_1
      from customer_data
      group by position_key, pricing_provider
      having count(*) > 1
     ) pp
where pricing_provider <> 1 or cnt_not_1 = 0;

 
select position_key, pricing_provicerfrom (select position_key, pricing_provider,             count(*) filter (where pricing_provider <> 1) over (partition by position_key) as cnt_not_1      from customer_data      group by position_key, pricing_provider      having count(*) > 1     ) ppwhere pricing_provider <> 1 or cnt_not_1 = 0;​

Advertisement

Answer