I have an issue with using SUM() OVER(PARTITION BY) where I have duplicate records. I can remove the duplicates using DISTINCT but the SUM still performs over all duplicate records.
For instance my data is:
-------------------------------------- |ref |CODES |VALUE |SUM | -------------------------------------- |101 |CODE1 |20 |150 | |101 |CODE2 |30 |150 | |101 |CODE1 |20 |150 | |101 |CODE2 |30 |150 | |101 |CODE1 |20 |150 | |101 |CODE2 |30 |150 |
With DISTINCT applied it becomes:
-------------------------------------- |ref |CODES |VALUE |SUM | -------------------------------------- |101 |CODE1 |20 |150 | |101 |CODE2 |30 |150 |
So the issue is I would hope the SUM column to be 50.
The SUM code is:
SUM(value) OVER (PARTITION BY ref) AS Total
I would like to see:
-------------------------------------- |ref |CODES |VALUE |SUM | -------------------------------------- |101 |CODE1 |20 |50 | |101 |CODE2 |30 |50 |
Any advice is appreciated. Thanks.
Advertisement
Answer
You could first select distinct
in a subquery, then do the window sum:
select ref, codes, value, sum(value) over(partition by ref) total from (select distinct ref, codes, value from mytable) t
This would produce a resultset with 2 records and a total
of 50
.