I have a table which has the following columns: Date, AcctId, CustID, BackupID, Logins, Amount. For a given Date, an AcctID can more than one CustIDs, but I want to assign a singular CustID to an AcctID based on the value of Logins.
i.e.:
If Cust1 has more Logins than Cust2 for the given Date and AcctID then CustID should have Cust1 as value.
If Cust1 logins = Cust2 logins then CustID should have BackupID value.
I then want to add the Amount for the given Date, AcctID.
Can anybody help with logic? I tried self join but probably not thinking it the right way.
Advertisement
Answer
You can implement your rules using aggregation and window functions:
select acctid, date,
(case when count(*) = 1 then min(custid)
else min(backupid)
end) as imputed_customer
from (select acctid, date, min(backupid) as backupid, custid,
count(*) as cnt,
rank() over (partition by acctid, date order by count(*) desc) as seqnum
from t
group by acctid, date
) ad
where seqnum = 1
group by acctid, date;
The subquery aggregates the data by acctid, date, and custid. It also ranks the custids for a given acct/date combination.
The outer query then says:
- If there is only one maximum
custid, then use that. - Otherwise use the
backupid.
Note that if there are multiple backupid on a given acctid/date combination, then an arbitrary one is used.