Skip to content
Advertisement

Hive Case Resulting Duplicate Row

I have a table containing contact numbers and another reference table containing a “length” variable and a number column. What I need is to find the prefix name where the prefix of the number matches the one in the reference table, but it should be the one that goes with the longest prefix matching. (omg I hope that make sense)

What I’ve tried so far:

select a.record_type,a.number,b.prefix,b.prefix_name 
from first_table a , second_table b 
where  a.transaction_date=20180924 and case  
    when b.length=1 then substr(a.number,1,1)=b.prefix  
    when b.length=2 then substr(a.number,1,2)=b.prefix  
    when b.length=3 then substr(a.number,1,3)=b.prefix  
    when b.length=4 then substr(a.number,1,4)=b.prefix  
    when b.length=5 then substr(a.number,1,5)=b.prefix  
    when b.length=6 then substr(a.number,1,6)=b.prefix  
    when b.length=7 then substr(a.number,1,7)=b.prefix  
    when b.length=8 then substr(a.number,1,8)=b.prefix 
    when b.length=9 then substr(a.number,1,9)=b.prefix 
    when b.length=10 then substr(a.number,1,10)=b.prefix 
    when b.length=11 then substr(a.number,1,11)=b.prefix 
    when b.length=12 then substr(a.number,1,12)=b.prefix 
    when b.length=13 then substr(a.number,1,13)=b.prefix 
    when b.length=14 then substr(a.number,1,14)=b.prefix 
end

However it still returns duplicate result, i.e: if the number is 12345, it matches the reference with prefix 1234 and 123, while I actually just want the 1234 one.

Is there any way to like prioritize the case? Thanks

An example of the data in the two tables: example

My current result and the desired result: results

Advertisement

Answer

Ok I reworked it, try this:

    WITH FIRST_TABLE (RECORD_TYPE,NUM,TRANSACTION_DATE)AS (
    SELECT 'a',12345, DATE '2018-09-24' FROM DUAL
    ),
    SECOND_TABLE (PREFIX,PREFIX_NAME,LENGTH) AS(
    SELECT 12,'Type A', 2 FROM DUAL union all
    SELECT 1234,'Type B', 4 FROM DUAL 
    )
    select * from (
    SELECT A.RECORD_TYPE,A.NUM,B.PREFIX,B.PREFIX_NAME, MAX(B.PREFIX) OVER (PARTITION BY A.RECORD_TYPE,A.NUM) maxPrefix
    FROM FIRST_TABLE A ,SECOND_TABLE B
    WHERE  A.TRANSACTION_DATE=DATE '2018-09-24' 
    AND A.NUM LIKE (B.PREFIX||'%')
    )
    where PREFIX=maxPrefix;
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement