Joining table only once in SQL

I’m sure this is very basic, but I’ve hit a brick wall. First post here, long time reader.

I’ve got two tables, one is a list of customers, one is a list of their activity by month. For some unknown reason, some of the data in the activity column is duplicated and it’s causing me errors when calculating totals. Is there a way for me to only join the data in the activity table once?

E.g the customer table:

CustID	AccountNo
331	AccountNo1
332	AccountNo2
333	AccountNo2
334	AccountNo2

The activity table looks like this:

CustID	ActivityID	Month	Sales
331	331133	Jan-21	£30
332	331101	Jan-21	£10
332	331121	Feb-21	£40
332	331196	Feb-21	£40
332	331141	Mar-21	£40
333	331120	Feb-21	£20
334	331119	Mar-21	£20

You will see there are two rows for customer 332 for Feb-21 with the same sales amount. Total sales for customer 332 are £90 not £130. How do I tell the JOIN to ignore multiple rows for a particular month if duplicated?

At the moment I’m doing a basic:

JOIN Activity AC (NOLOCK) ON C.CUSTID = AC.CUSTID

Obviously moving forward the activity table needs fixing but I’m hoping for a quick win!

Sorry if this is basic! Mark

Answer

One method is:

select c.*, a.*
from customer c left join
     (select a.*,
             row_number() over (partition by custid, month order by amount) as seqnum
      from activity a
     ) a
     on c.custid = a.custid and c.seqnum = 1;

​x
 
select c.*, a.*from customer c left join     (select a.*,             row_number() over (partition by custid, month order by amount) as seqnum      from activity a     ) a     on c.custid = a.custid and c.seqnum = 1;​

It is unclear if the amounts need to be the same for the duplication. If so, include amount in the partition by.

Advertisement

Answer