I’m trying to perform subtraction among timestamps and would like to convert the timestamps in forms that can be converted to minutes.
I used regexp_replace to convert timestamp in such form:
2020-06-20T17:25:59:378Z
The following code will convert it to seconds
unix_timestamp(regexp_replace(value,'(.*?)T(.*?):([^:]*?)Z$','$1 $2\.$3'))
I have other two timestamps that I wish to convert to seconds, such as:
2020-03-19 15:45:33 03-19-2020 11:07:25:103
How should I convert these two timestamp to seconds by using regexp_replace() or any other functions?
Thank you!
Advertisement
Answer
First of all, unix_timestamp
returns seconds passed from unix epoch. It does ignore milliseconds. This is why if you want epoch time in seconds, you can extract only 'yyyy-MM-dd HH:mm:ss'
.
Second, If you have all these different formats in single dataset and you want to convert them all, you can use case statement to check pattern and convert accordingly:
with your_data as ( --This is your data example select stack(3, '2020-06-20T17:25:59:378Z', '2020-03-19 15:45:33', '03-19-2020 11:07:25:103' ) as str ) select case when str rlike '^(\d{4}-\d{2}-\d{2})[T ](\d{2}:\d{2}:\d{2})' --matches first two strings then unix_timestamp(regexp_replace(str,'^(\d{4}-\d{2}-\d{2})[T ](\d{2}:\d{2}:\d{2})','$1 $2')) when str rlike '^(\d{2})-(\d{2})-(\d{4})[T ](\d{2}:\d{2}:\d{2})' --matches third string, allows T or space after date then unix_timestamp(regexp_replace(str,'^(\d{2})-(\d{2})-(\d{4})[T ](\d{2}:\d{2}:\d{2})','$3-$1-$2 $4')) end result_unix_timestamp from your_data
Returns:
result_unix_timestamp 1592673959 1584632733 1584616045
You can add more patterns to the case with corresponding conversion and in such way convert all possible cases. And of course, not necessarily all cases should use regex_replace for conversion. Though regex allows to identify and parse most complex strings.
Also you can try to convert using one pattern and if it returns null
then try to convert using another pattern and so on:
coalesce(unix_timestamp(regexp_replace(str,'^(\d{4}-\d{2}-\d{2})[T ](\d{2}:\d{2}:\d{2})','$1 $2')), unix_timestamp(regexp_replace(str,'^(\d{2})-(\d{2})-(\d{4})[T ](\d{2}:\d{2}:\d{2})','$3-$1-$2 $4')) )