Split column in hive

Question

I am new to Hive and the Hadoop framework. I am trying to write a Hive query to split a column delimited by the pipe character '|'. Then I want to group every 2 adjacent values and separate them into ...

Accepted Answer

I would suggest splitting the column into pairs first with split(mapper, '(?<=\d)\|(?=\w)'). For example, split('c|0.2|d|0.3|e|0.6', '(?<=\d)\|(?=\w)') results in ["c|0.2","d|0.3","e|0.6"]. Then explode the resulting array and split each element by |.

Update: If the keys can be digits as well and your float numbers have only one digit after the decimal point, then the regex should be extended to split(mapper, '(?<=\.\d)\|(?=\w|\d)').

Update 2: OK, the best way is to split on every second | as follows: split(mapper, '(?<!\G[^\|]+)\|'). For example, split('6193439|0.0444035224643987|6186654|0.0444035224643987', '(?<!\G[^\|]+)\|') results in ["6193439|0.0444035224643987","6186654|0.0444035224643987"].
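To see what the pair-splitting step does outside of Hive, here is a minimal Python sketch. It is only an illustration, not the Hive query itself: Hive's split uses Java regular expressions, while this uses Python's re module, which does not support \G. For that reason the "every second |" case is imitated here with plain positional grouping instead of the regex from Update 2. The function names split_pairs_regex, split_pairs_positional and explode_pairs are made up for this example.

```python
import re


def split_pairs_regex(mapper):
    """Split on a '|' that follows a digit and precedes a word character.

    Same pattern as the first suggestion in the answer; it assumes keys
    start with a letter and values end with a digit.
    """
    return re.split(r"(?<=\d)\|(?=\w)", mapper)


def split_pairs_positional(mapper):
    """Group every two adjacent '|'-separated fields into one 'key|value' string.

    This imitates splitting on every second '|' without relying on \\G,
    so it also works when the keys are numeric.
    """
    fields = mapper.split("|")
    return ["|".join(fields[i:i + 2]) for i in range(0, len(fields), 2)]


def explode_pairs(pairs):
    """Turn each 'key|value' string into a (key, value) tuple, one per row."""
    return [tuple(pair.split("|")) for pair in pairs]


if __name__ == "__main__":
    pairs = split_pairs_regex("c|0.2|d|0.3|e|0.6")
    print(pairs)                 # ['c|0.2', 'd|0.3', 'e|0.6']
    print(explode_pairs(pairs))  # [('c', '0.2'), ('d', '0.3'), ('e', '0.6')]
    print(split_pairs_positional(
        "6193439|0.0444035224643987|6186654|0.0444035224643987"))
```

The positional version is the safer choice whenever the shape of the keys is not known in advance, because it depends only on the field order and not on what the characters around each | look like.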
