DB schema
CREATE TABLE newsletter_status ( cryptid varchar(255) NOT NULL, status varchar(25), regDat timestamp, confirmDat timestamp, updateDat timestamp, deleteDat timestamp );
There are rows with the same cryptid
, I need to squash them to one row. So the cryptid
becomes effectively unique. The complexity comes from the fact that I need to compare dates by rows as well as by columns. How to implement that?
The rule I need to use is:
- status should be taken from the row with the latest timestamp (among all 4 dates)
- for every date column I need to select the latest date
Example:
002bc5 | new | 2010.01.15 | 2001.01.15 | NULL | 2020.01.10 002bc5 | confirmed | NULL | 2020.01.30 | 2020.01.15 | 2020.01.15 002bc5 | deactivated | NULL | NULL | NULL | 2020.12.03
needs to be squashed into:
002bc5 | deactivated | 2010.01.15 | 2020.01.30 | 2020.01.15 | 2020.12.03
The status deactivated
is taken because the timestamp 2020.12.03 is the latest
Advertisement
Answer
What you need to get the status is to sort rowset by dates in descending order. In Oracle there is agg_func(<arg>) keep (dense_rank first ...)
, in other databases it can be replaced with row_number()
and filter. Because analytic functions in HANA works not so good sometimes, I suggest to use the only one aggregate function I know in HANA that supports ordering inside – STRING_AGG
– with little trick. If you have not a thousands of rows with statuses (i.e. concatenated status will not be greater 4000 for varchar), it will work. This is the query:
select cryptid, max(regDat) as regDat, max(confirmDat) as confirmDat, max(updateDat) as updateDat, max(deleteDat) as deleteDat, substr_before( string_agg(status, '|' order by greatest( ifnull(regDat, date '1000-01-01'), ifnull(confirmDat, date '1000-01-01'), ifnull(updateDat, date '1000-01-01'), ifnull(deleteDat, date '1000-01-01') ) desc), '|' ) as status from newsletter_status group by cryptid