Skip to content
Advertisement

How to handle duplicates created by LEFT JOIN

LEFT TABLE:

+------+---------+--------+
| Name | Surname | Salary |
+------+---------+--------+
| Foo  | Bar     |    100 |
| Foo  | Kar     |    300 |
| Fo   | Ba      |     35 |
+------+---------+--------+

RIGHT TABLE:

+------+-------+
| Name | Bonus |
+------+-------+
| Foo  |    10 |
| Foo  |    20 |
| Foo  |    50 |
| Fo   |    10 |
| Fo   |   100 |
| F    |  1000 |
+------+-------+

DESIRED OUTPUT:

+------+---------+--------+-------+
| Name | Surname | Salary | Bonus |
+------+---------+--------+-------+
| Foo  | Bar     |    100 |    80 |
| Foo  | Kar     |    300 |     0 |
| Fo   | Ba      |     35 |   110 |
+------+---------+--------+-------+

The closest I get is this:

SELECT 
    a.Name,
    Surname,
    sum(Salary),
    sum(Bonus)
FROM (SELECT 
        Name,
        Surname,
        sum(Salary) as Salary
      FROM input
      GROUP BY 1,2) a LEFT JOIN (SELECT Name,
                                        SUM(Bonus) as Bonus
                                 FROM input2
                                 GROUP BY 1) b 
ON a.Name = b.Name
GROUP BY 1,2;

Which gives:

+------+---------+-------------+------------+
| Name | Surname | sum(Salary) | sum(Bonus) |
+------+---------+-------------+------------+
| Fo   | Ba      |          35 |        110 |
| Foo  | Bar     |         100 |         80 |
| Foo  | Kar     |         300 |         80 |
+------+---------+-------------+------------+

I can’t figure out how to get rid of Bonus duplication. Ideal solution for me would be as specified in the ‘DESIRED OUTPUT’, which is adding Bonus to only one Name and for other records with the same Name adding 0.

Advertisement

Answer

Try a Row_number over the Name category partioned by Name. This will give you different numbers for your duplicates. You can then search for the case when this number is 1 and return the result you want. Else return 0. The code can look something like this.

SELECT 
    a.Name,
    Surname,
    sum(Salary),
    Case    when Duplicate_Order = 1
            then bonus
            else 0
            end as 'Bonus'
FROM (SELECT 
        Name,
        Surname,
        sum(Salary) as Salary
        ,ROW_NUMBER() over (partition by Name order by name) as [Duplicate_Order]
      FROM input
      GROUP BY 1,2) a 
      LEFT JOIN (SELECT Name,
                    SUM(Bonus) as Bonus
                FROM input2
                GROUP BY 1) b 
ON a.Name = b.Name
GROUP BY 1,2;

Hope that helps!

User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement