Skip to content
Advertisement

In Hive, how to convert an array of string to an array of numeric numbers

I have a hive table that looks like the following:

id | value
 1 | ['0', '0', '1', '0', '1', '1', '0', '0']
 2 | ['2', '0', '3', '0', '3', '1', '2', '1']

I want the result to be the following:

id | value
 1 | [0,0,1,0,1,1,0,0]
 2 | [2,0,3,0,3,1,2,1]

I need to convert them into an array of float so that I can use them in ST_Constains(ST_MultiPolygon(), st_point()) to determine if a point is in an area.

I am new to Hive, not sure if that is possible, any help would be very appreciated.

Advertisement

Answer

You can explode array, cast value, collect array again. Demo:

with your_table as(
select stack(2,
 1 , array('0', '0', '1', '0', '1', '1', '0', '0'),
 2 , array('2', '0', '3', '0', '3', '1', '2', '1')
 ) as (id,value)
 ) --use your_table instead of this


 select s.id, 
        s.value                            as original_array, 
        collect_list(cast(s.str as float)) as array_float 
 from
(select t.*, s.* 
 from your_table t
               lateral view outer posexplode(t.value)s as pos,str       
   distribute by t.id, t.value 
         sort by s.pos --preserve order in the array
 )s  
group by s.id, s.value;  

Result:

OK
1       ["0","0","1","0","1","1","0","0"]       [0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0]
2       ["2","0","3","0","3","1","2","1"]       [2.0,0.0,3.0,0.0,3.0,1.0,2.0,1.0]

See also this answer about sorting array in the query https://stackoverflow.com/a/57392965/2700344

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement