I have a table which has the following columns: Date
, AcctId
, CustID
, BackupID
, Logins
, Amount
. For a given Date, an AcctID
can more than one CustIDs
, but I want to assign a singular CustID
to an AcctID
based on the value of Logins
.
i.e.:
If Cust1
has more Logins
than Cust2
for the given Date and AcctID
then CustID
should have Cust1
as value.
If Cust1
logins = Cust2
logins then CustID
should have BackupID
value.
I then want to add the Amount
for the given Date
, AcctID
.
Can anybody help with logic? I tried self join but probably not thinking it the right way.
Advertisement
Answer
You can implement your rules using aggregation and window functions:
select acctid, date, (case when count(*) = 1 then min(custid) else min(backupid) end) as imputed_customer from (select acctid, date, min(backupid) as backupid, custid, count(*) as cnt, rank() over (partition by acctid, date order by count(*) desc) as seqnum from t group by acctid, date ) ad where seqnum = 1 group by acctid, date;
The subquery aggregates the data by acctid
, date
, and custid
. It also ranks the custids for a given acct/date combination.
The outer query then says:
- If there is only one maximum
custid
, then use that. - Otherwise use the
backupid
.
Note that if there are multiple backupid
on a given acctid
/date
combination, then an arbitrary one is used.