I want to join two tables, selecting the most recent rows for an ID value present in table 1.
i.e. For each ID value in table 1, only return the most recently added row for an ID value. For example, table 1 looks something like this:
Columns: ID-value, date-added, other-information row 1: ID_1, 21/2/2020-12:30, other_newer_information... row 2: ID_1, 21/2/1990-12:30, other_older_information...
So if the same ID value is found twice in this table, only return the more recent entry, row 1 in the above case.
I then want to join these rows with information present in a second table. e.g. table 2 looks something like this:
Columns: column-present-in-table-1, another-column-present-in-table-1, other-columns row 1: some_data, some_more_data... additional data row 2:- some_data, infor_2: some_more_data... additional data etc
- My sql query below works as expected for joining the two tables
- but what I can’t work out is how to only return the most recent rows from table 1 when duplicate ID values have been entered on multiple dates
- Also not sure if the date filtering should occur as part of the
SELECT
or when first fetching data from table 1
From looking elsewhere in StackOverflow the suggestions are things like MAX(date_time)
– but my understanding is that this will only return the maximum date time value, not the most recent row – correct me if I’m wrong.
My query looks something like this:
SELECT id_1, info_1, info_2, date_time, info_3, info_4, max(info_3), min(info_4) FROM table_1 INNER JOIN table_2 ON table_1.info_1 = table_2.infor_1 AND table_1.info_2 = table_2.infor_2 WHERE id_1 in ("id1", "id2") AND info_3 = "10" GROUP BY id_1, info_1, info_2, info_3, info_4 ORDER BY id_1, id_2, date_time DESC
Other suggestions on StackOverflow: SELECT TOP id_1...min(info_4)
(gives syntax error), ORDER BY id_1... date_time DESC LIMIT 1
(only returns one row – i.e. most recent date time).
ROW_NUMBER() OVER (PARTITION BY id, ORDER BY date_time) AS 'row_number'
returns a row number, not the most recent row.
Advertisement
Answer
So if the same ID value is found twice in my table, only return the more recent entry, row 1 in the above case.
You can use row_number()
:
select t.* from (select t.*, row_number() over (partition by id order by date_time desc) as seqnum from mytable t ) t where seqnum = 1;
I really have no idea what your query has to do with your question. If your “table” is really the result of the query, then just use a CTE or subquery:
with t as ( <your query here> ) <query with row_number here>