I have a table like so:
|uniqueID|scandatetime |scanfacilityname| +--------+-------------------+----------------+ |12345678|01-01-2020 13:45:12|BALTIMORE | |12345678|01-02-2020 22:45:12|BALTIMORE | |12345678|01-04-2020 10:15:12|PHILADELPHIA | |12345678|01-05-2020 08:45:12| |
And I would like to return an entire row containing the uniqueID, scandatetime, and latest scanfacilityname (i.e., max scandatetime where scanfacilityname is not null). I have tried the following query:
SELECT "uniqueID" , "max"(CAST("scandatetime" AS timestamp)) "timestamp" , COALESCE("scanfacilityname") "scanfacilityname" FROM iv_scans_new.scan_data WHERE (("partition_0" = '2020') AND ("partition_1" IN ('06', '07', '08'))) and scanfacilityname is not null group by 1, 3 ;
But im not sure if this is correct/if I need the coalesce.
Advertisement
Answer
You can use the max_by
function:
select max_by(uniqueID, scanfacilityname), max_by(scandatetime, scanfacilityname), max(scanfacilityname)
See the doc.
There’s no coalesce
needed since the max
and max_by
functions will effectively ignore null
values.