I have a hive table that looks like the following:
id | value 1 | ['0', '0', '1', '0', '1', '1', '0', '0'] 2 | ['2', '0', '3', '0', '3', '1', '2', '1']
I want the result to be the following:
id | value 1 | [0,0,1,0,1,1,0,0] 2 | [2,0,3,0,3,1,2,1]
I need to convert them into an array of float so that I can use them in ST_Constains(ST_MultiPolygon(), st_point())
to determine if a point is in an area.
I am new to Hive, not sure if that is possible, any help would be very appreciated.
Advertisement
Answer
You can explode array, cast value, collect array again. Demo:
with your_table as( select stack(2, 1 , array('0', '0', '1', '0', '1', '1', '0', '0'), 2 , array('2', '0', '3', '0', '3', '1', '2', '1') ) as (id,value) ) --use your_table instead of this select s.id, s.value as original_array, collect_list(cast(s.str as float)) as array_float from (select t.*, s.* from your_table t lateral view outer posexplode(t.value)s as pos,str distribute by t.id, t.value sort by s.pos --preserve order in the array )s group by s.id, s.value;
Result:
OK 1 ["0","0","1","0","1","1","0","0"] [0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0] 2 ["2","0","3","0","3","1","2","1"] [2.0,0.0,3.0,0.0,3.0,1.0,2.0,1.0]
See also this answer about sorting array in the query https://stackoverflow.com/a/57392965/2700344