Skip to content
Advertisement

Using Case WHEN to extract with first character

I am using case as an if else statement in my hive database, it seems it is not showing what i am trying to extract as.

i am using this script

case when substr(file_name, length(file_name) - 44 , 1) like '%_' then substr(file_name, length(file_name) - 44 , 9) else substr(file_name, length(file_name) - 45 , 9) END

in logical concept if at character 44 starts with _ , substring at 45 , else substring at 45

e.g (table look like )

it is extracting the data from column file name which is 
row_1| abcdefghijklmn_20171120_
row_1| abcdefghijklmn_20181130_
row_1| abcdefghijklmno_20171120_

this is what it returns 
20171120
20181130
_20190721

it should return as this without _ character
20171120
20181130
20190721

am i using like statement correctly ?

Advertisement

Answer

One method is to replace the value:

replace(case when substr(file_name, length(file_name) - 44 , 1) like '%_'
           then substr(file_name, length(file_name) - 44 , 9)
           else substr(file_name, length(file_name) - 45 , 9) END
        end, '_', '')

However, I think regexp_extract() might be simpler:

regexp_extract(file_name, '^.{44}_?([0-9]{8})', '1')

The pattern may be simpler, if there is only one sequence of 8 digits surrounded by underscores:

regexp_extract(file_name, '_([0-9]{8})_', '1')
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement