In PostgreSQL, if i have this kind of record :
|------------------|---------------------|------------------| | corona_country | corona_patient | corona_suspected | |------------------|---------------------|------------------| | Mexico | Teddy | 2020-03-01 | |------------------|---------------------|------------------| | Indonesia | Andy | 2020-03-01 | |------------------|---------------------|------------------| | Indonesia | Andy | 2020-03-05 | |------------------|---------------------|------------------| | China | Sarah | 2020-03-04 | |------------------|---------------------|------------------| | UK | Michael | 2020-03-05 | |------------------|---------------------|------------------| | Italy | Michael | 2020-03-15 | |------------------|---------------------|------------------| | Italy | Michael | 2020-03-15 | |------------------|---------------------|------------------|
I want to add extra column (by a Select query) that identifies if a combination of corona_country and corona_patient is having the latest date (of corona_suspected column) among its own records. So it’ll look like this :
|------------------|---------------------|------------------|-----------| | corona_country | corona_patient | corona_suspected | LATEST? | |------------------|---------------------|------------------|-----------| | Mexico | Teddy | 2020-03-01 | Y | |------------------|---------------------|------------------|-----------| | Indonesia | Andy | 2020-03-01 | N | |------------------|---------------------|------------------|-----------| | Indonesia | Andy | 2020-03-05 | Y | |------------------|---------------------|------------------|-----------| | China | Sarah | 2020-03-04 | Y | |------------------|---------------------|------------------|-----------| | UK | Michael | 2020-03-05 | Y | |------------------|---------------------|------------------|-----------| | Italy | Michael | 2020-03-13 | N | |------------------|---------------------|------------------|-----------| | Italy | Michael | 2020-03-15 | Y | |------------------|---------------------|------------------|-----------| | Italy | Michael | 2020-03-15 | Y | |------------------|---------------------|------------------|-----------|
How can i achieve this by sql query? Is there any method that joins another table that records latest dates of each combination of corona_country and corona_patient?
Advertisement
Answer
One option would be using MAX() OVER (PARTITION BY..) analytic function
SELECT *,
CASE
WHEN( MAX( corona_suspected ) OVER
( PARTITION BY corona_country, corona_patient ) = corona_suspected )
THEN
'Y'
ELSE
'N'
END AS latest
FROM t
OR alternatively
Use DENSE_RANK() analytic function
CASE
WHEN(DENSE_RANK() OVER
(PARTITION BY corona_country, corona_patient ORDER BY corona_suspected DESC )=1)
THEN
'Y'
ELSE
'N'
END AS latest
in order to return the result with ties(more than one latest values for corona_suspected columns for each groups)