Selecting most recent rows in a SQL query

I want to join two tables, selecting the most recent rows for an ID value present in table 1.

i.e. For each ID value in table 1, only return the most recently added row for an ID value. For example, table 1 looks something like this:

Columns: ID-value, date-added,      other-information
row 1:    ID_1,    21/2/2020-12:30, other_newer_information...
row 2:    ID_1,    21/2/1990-12:30, other_older_information...

So if the same ID value is found twice in this table, only return the more recent entry, row 1 in the above case.

I then want to join these rows with information present in a second table. e.g. table 2 looks something like this:

Columns: column-present-in-table-1, another-column-present-in-table-1, other-columns
row 1:   some_data,                 some_more_data...                  additional data
row 2:-  some_data, infor_2:        some_more_data...                  additional data
etc

My sql query below works as expected for joining the two tables
but what I can’t work out is how to only return the most recent rows from table 1 when duplicate ID values have been entered on multiple dates
Also not sure if the date filtering should occur as part of the SELECT or when first fetching data from table 1

From looking elsewhere in StackOverflow the suggestions are things like MAX(date_time) – but my understanding is that this will only return the maximum date time value, not the most recent row – correct me if I’m wrong. My query looks something like this:

SELECT
    id_1,
    info_1,
    info_2,
    date_time,
    info_3,
    info_4,
    max(info_3),
    min(info_4)
FROM table_1
INNER JOIN table_2
    ON table_1.info_1 = table_2.infor_1
    AND table_1.info_2 = table_2.infor_2
    WHERE id_1 in ("id1", "id2")
    AND info_3 = "10"
    GROUP BY id_1, info_1, info_2, info_3, info_4
    ORDER BY id_1, id_2, date_time DESC

Other suggestions on StackOverflow: SELECT TOP id_1...min(info_4) (gives syntax error), ORDER BY id_1... date_time DESC LIMIT 1 (only returns one row – i.e. most recent date time).

ROW_NUMBER() OVER (PARTITION BY id, ORDER BY date_time) AS 'row_number' returns a row number, not the most recent row.

Answer

So if the same ID value is found twice in my table, only return the more recent entry, row 1 in the above case.

You can use row_number():

select t.*
from (select t.*,
             row_number() over (partition by id order by date_time desc) as seqnum
      from mytable t
     ) t
where seqnum = 1;

I really have no idea what your query has to do with your question. If your “table” is really the result of the query, then just use a CTE or subquery:

with t as (
      <your query here>
     )
<query with row_number here>

Advertisement

Answer