I have a db where I can have multiple rows with same field “aid” (it’s a sort of historical changelog of same items). With the following query, I’m extracting rows with same aid orderd by descending date:
select aid,data_ril,specie,id from scu.censimento t1 where (select count(*) from scu.censimento t2 where t1.aid = t2.aid) > 1 order by aid, data_ril desc
That should be a child table related to a father layer (made by a postgres materialized view that shows only newest aid records) and the following is the result I get:
+------+------------+--------+------+ | aid | data_ril | specie | id | +------+------------+--------+------+ | 349 | 2020-06-18 | 35 | 349 | +------+------------+--------+------+ | 349 | 2020-06-17 | 35 | 2004 | +------+------------+--------+------+ | 700 | 2020-08-07 | 58 | 700 | +------+------------+--------+------+ | 700 | 2020-07-06 | 58 | 2006 | +------+------------+--------+------+ | 700 | 2020-05-02 | 15 | 1956 | +------+------------+--------+------+ | 1316 | 2020-09-02 | 1 | 1316 | +------+------------+--------+------+ | 1316 | 2020-08-27 | 1 | 2005 | +------+------------+--------+------+ | 1317 | 2020-09-02 | 2 | 1317 | +------+------------+--------+------+ | 1317 | 2020-08-27 | 2 | 1996 | +------+------------+--------+------+ | 1481 | 2020-12-03 | 21 | 2112 | +------+------------+--------+------+ | 1481 | 2020-09-08 | 49 | 1481 | +------+------------+--------+------+ | 1492 | 2020-09-28 | 6 | 1492 | +------+------------+--------+------+ | 1492 | 2020-09-08 | 6 | 1999 | +------+------------+--------+------+ | 1688 | 2020-11-03 | 72 | 1688 | +------+------------+--------+------+ | 1688 | 2020-10-08 | 72 | 2000 | +------+------------+--------+------+
I’d like to know the SQL syntax to modify the above query in order to show all duplicate rows except the ones with latest dates, so that I can have a table like the following:
+------+------------+--------+------+ | aid | data_ril | specie | id | +------+------------+--------+------+ | 349 | 2020-06-17 | 35 | 2004 | +------+------------+--------+------+ | 700 | 2020-07-06 | 58 | 2006 | +------+------------+--------+------+ | 700 | 2020-05-02 | 15 | 1956 | +------+------------+--------+------+ | 1316 | 2020-08-27 | 1 | 2005 | +------+------------+--------+------+ | 1317 | 2020-08-27 | 2 | 1996 | +------+------------+--------+------+ | 1481 | 2020-09-08 | 49 | 1481 | +------+------------+--------+------+ | 1492 | 2020-09-08 | 6 | 1999 | +------+------------+--------+------+ | 1688 | 2020-10-08 | 72 | 2000 | +------+------------+--------+------+
Thanks in advance.
Advertisement
Answer
You could do this with window functions. The idea is to rank records having the same aid
by descending data_ril
, and then filter out the top record per group.
select aid, data_ril, specie, id from ( select t.*, row_number() over(partition by aid order by data_ril desc) rn from mytable t ) t where rn > 1 order by aid, data_ril