My data is too large to analyze since it is collecting every second or so. To reduce the data, I will like to group into interval of 5 minutes. I tried converting into unix timestamp and reverting it back, but didn’t work.
I tried something like this
x
SELECT operation,
value,
from_unixtime(unix_timestamp(update_time)/300) as interval_5
FORM storage
Original data or output
operation value update_time
---------- ------- --------------------
sensor 190 2021-03-18 22:37:04
sensor 191 2021-03-18 22:37:05
sensor 190 2021-03-18 22:37:06
sensor 192 2021-03-18 22:37:07
sensor 193 2021-03-18 22:37:08
Desired output
operation value update_time
---------- ------- --------------------
sensor 190 2021-03-18 22:37:04
sensor 195 2021-03-18 22:42:04
sensor 197 2021-03-18 22:47:04
sensor 194 2021-03-18 22:52:04
sensor 190 2021-03-18 22:57:04
Advertisement
Answer
I don’t know Hive, but make sure that the division is an integr division truncating fractions, and then multiply back, losing the last fractional 300 seconds in the process:
from_unixtime(unix_timestamp(update_time)/300*300) as interval_5
If Hive Sql does not truncate (upon integer division by 300) you have to use a function like FLOOR to do so.