I am using case as an if else statement in my hive database, it seems it is not showing what i am trying to extract as.
i am using this script
case when substr(file_name, length(file_name) - 44 , 1) like '%_' then substr(file_name, length(file_name) - 44 , 9) else substr(file_name, length(file_name) - 45 , 9) END
in logical concept if at character 44 starts with _ , substring at 45 , else substring at 45
e.g (table look like )
it is extracting the data from column file name which is row_1| abcdefghijklmn_20171120_ row_1| abcdefghijklmn_20181130_ row_1| abcdefghijklmno_20171120_ this is what it returns 20171120 20181130 _20190721 it should return as this without _ character 20171120 20181130 20190721
am i using like statement correctly ?
Advertisement
Answer
One method is to replace the value:
replace(case when substr(file_name, length(file_name) - 44 , 1) like '%_'
           then substr(file_name, length(file_name) - 44 , 9)
           else substr(file_name, length(file_name) - 45 , 9) END
        end, '_', '')
However, I think regexp_extract() might be simpler:
regexp_extract(file_name, '^.{44}_?([0-9]{8})', '1')
The pattern may be simpler, if there is only one sequence of 8 digits surrounded by underscores:
regexp_extract(file_name, '_([0-9]{8})_', '1')