I have a two tables, main_table & staging_table, main_table contains original data whereas staging_table contains the few of the updated records that I have to add into with main_table data, and for that I am using unique id –PersonID and arrival time – date Below is the query which I am able to execute into SQL
SELECT PersonID, LastName, FirstName, Address, City, max(date) from ( select PersonID, LastName, FirstName, Address, City, date from main_table UNION select PersonID, LastName, FirstName, Address, City, date from staging_table ) as t GROUP by t.PersonID;
but while executing into AWS Athena, I am getting following error, SYNTAX_ERROR: ‘”LastName”‘ must be an aggregate expression or appear in GROUP BY clause
Advertisement
Answer
I suspect that the other columns might differ and you actually want the full record from the most recent date. If this is the case, use row_number()
:
select p.* from (select p.*, row_number() over (partition by personid order by date desc) as seqnum from ((select PersonID, LastName, FirstName, Address, City, date from main_table ) union all (select PersonID, LastName, FirstName, Address, City, date from staging_table ) ) p ) p where seqnum = 1;
This select one row per PersonId
with the most recent date. The columns come from the most recent row.