Skip to content
Advertisement

SQL get records having latest association matching a specific value

Context

Using PostgreSQL, I’m storing articles and their associated events. Here is a simple structure of the 2 tables for the sake of the example:

Articles table

+----+-----------+-----------------------+
| ID |   title   |      description      |
+----+-----------+-----------------------+
|  1 | article 1 | article 1 description |
|  2 | article 2 | article 2 description |
|  3 | article 3 | article 3 description |
|  4 | article 4 | article 4 description |
|  5 | article 5 | article 5 description |
+----+-----------+-----------------------+

Events table

+-----+-----------+--------------+----------------+--------------+
| ID  |   name    | eventable_id | eventable_type | created_at   |
+-----+-----------+--------------+----------------+--------------+
|  1  | validated |            1 | Article        | 2020-05-10   |
|  2  | reported  |            1 | Article        | 2020-05-11   |
|  3  | reported  |            2 | Article        | 2020-05-10   |
|  4  | reported  |            2 | Article        | 2020-05-11   |
|  5  | reported  |            2 | Article        | 2020-05-12   |
|  6  | reported  |            3 | Article        | 2020-05-20   |
|  7  | validated |            3 | Article        | 2020-05-21   |
|  8  | reported  |            4 | Article        | 2020-05-12   |
|  9  | moved     |            4 | Article        | 2020-05-13   |
|  10 | reported  |            4 | Article        | 2020-05-14   |
|  11 | moved     |            5 | Article        | 2020-05-13   |
|  12 | moved     |            5 | Article        | 2020-05-14   |
+-----+-----------+--------------+----------------+--------------+

Problem

Here I need to be able to get all articles which have their latest events being reported.

So for example, following the data above, we must only get:

  • article 1: because it has been validated then reported
  • article 2: because it only has been reported
  • article 4: because it has been moved then reported (same as article 1)

As you can see:

  • article 3: should not be return because its latest event is validated.
  • article 5: should not be return because its latest event is moved.

I can easily find all articles having a reported event. But how to get the ones having their latest event to be reported ?

Here is what I did try to do so far without success:

SELECT *
FROM articles a
INNER JOIN (
    SELECT *
    FROM events
    WHERE name = 'reported'
    ORDER BY created_at
    LIMIT 1
) AS e ON e.moderable_id = a.id AND e.moderable_type = 'Article'

We currently have:

  • 459 892 articles
  • 62 074 events

Advertisement

Answer

You could use a correlated subquery for filtering:

select a.*
from articles a
where 
    (
        select e.name
        from events e
        where e.eventable_id = a.id and e.eventable_type = 'Article'
        order by created_at desc 
        limit 1
    ) = 'reported'

We could also express this with a lateral join:

select a.*
from articles a
inner join lateral (
    select e.name 
    from events 
    where e.eventable_id = a.id and e.eventable_type = 'Article'
    order by created_at desc 
    limit 1
)  x on x.status = 'reported'
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement