Skip to content
Advertisement

Find closest match to value in another table

I have a table_a with many rows and columns for each timestamp in PostgreSQL 13. I’m trying to find the row where the value in column X is closest to a benchmark value obtained from another table. This second table has only a single benchmark value for each timestamp. For each timestamp, I need to return most of the columns of table_a. The query below works fine when supplying the value for the benchmark directly.

How can I get the benchmark value from table_b to use in this query?

Simply substituting table_b.benchmark with (SELECT benchmark FROM table_b WHERE table_a.timestamp = table_b.timestamp) results in ‘relation “t1” does not exist’ error.

Could not figure out a working join either.

table_a:

+-----------------+-----+---------------+
|    timestamp    |  x  | other_columns |
+-----------------+-----+---------------+
| 2020-01-01 8:00 | 100 |               |
| 2020-01-01 8:00 | 200 |               |
| 2020-01-01 8:00 | 300 |               |
| 2020-01-01 8:00 | 400 |               |
| 2020-01-01 8:00 | 500 |               |
| ...             |     |               |
| 2020-01-01 9:00 | 100 |               |
| 2020-01-01 9:00 | 200 |               |
| 2020-01-01 9:00 | 300 |               |
| 2020-01-01 9:00 | 400 |               |
| 2020-01-01 9:00 | 500 |               |
| ...             |     |               |
+-----------------+-----+---------------+

table_b:

+-----------------+-----------+
|    timestamp    | benchmark |
+-----------------+-----------+
| 2020-01-01 8:00 |       340 |
| 2020-01-01 9:00 |       380 |
| ...             |           |
+-----------------+-----------+

Expected result:

+-----------------+-----+
|    timestamp    |  x  |
+-----------------+-----+
| 2020-01-01 8:00 | 300 |
| 2020-01-01 9:00 | 400 |
| ...             |     |
+-----------------+-----+

SQL query:

WITH date_filter AS (
    SELECT *
    FROM table_a
    WHERE timestamp >= {start_date} and timestamp < {end_date}
    )

SELECT DISTINCT t1.timestamp, t1.x, t1.etc
FROM date_filter AS t1

INNER JOIN (
    SELECT timestamp, MIN(ABS(x - (table_b.benchmark))) AS target_value
    FROM t1
    GROUP BY timestamp
    ) AS t2

ON t2.timestamp = t1.timestamp AND t2.target_value = ABS(x - (table_b.benchmark))
ORDER BY timestamp ASC;```

Advertisement

Answer

One option uses a lateral join:

select b.timestamp, a.x
from table_b b
cross join lateral (
    select a.*
    from table_a a
    where a.timestamp = b.timestamp
    order by abs(a.x - b.benchmark)
    limit 1
) a

You can also use distinct on:

select distinct on (b.timestamp) b.timestamp, a.x
from table_b b
inner join table_a a on a.timestamp = b.timestamp
order by b.timestamp, abs(a.x - b.benchmark)
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement