Skip to content
Advertisement

Join on multiple columns and in one of the integer columns join by choosing minimum difference

I got table t1 and i want to join it with table t2 below on columns a, b and c

+---------+---------+---------+
|a        |b        |c        |
+---------+---------+---------+
|473200   |1        |1.-1-1   |
|472400   |10       |1.-1-1   |
|472800   |10       |1.-1-1   |
|473200   |93       |1.-1-1   |
|472800   |26240    |1.-1-1   |
+---------+---------+---------+

t2

+---------+---------+---------+
|a        |b        |c        |
+---------+---------+---------+
|473200   |1        |1.-1-1   |
|472400   |10       |1.-1-1   |
|472800   |10       |1.-1-1   |
|473200   |93       |1.-1-1   |
|472800   |26250    |1.-1-1   |
+---------+---------+---------+

When I join only on a and c the result is

+---------+---------+---------+---------+
|t1.b     |t2.b     |a        |c        |
+---------+---------+---------+---------+
|93       |1        |473200   |1.-1-1   |
|1        |1        |473200   |1.-1-1   |
|10       |10       |472400   |1.-1-1   |
|10       |10       |472800   |1.-1-1   |
|26240    |10       |472800   |1.-1-1   |
|93       |93       |473200   |1.-1-1   |
|1        |93       |473200   |1.-1-1   |
|10       |26250    |472800   |1.-1-1   |
|26240    |26250    |472800   |1.-1-1   |
+---------+---------+---------+---------+

What I try to achieve is to add column b to ‘on’ clause so that join happens on minimum difference in b column.

Desired result

+---------+---------+---------+---------+
|t1.b     |t2.b     |a        |c        |
+---------+---------+---------+---------+
|1        |1        |473200   |1.-1-1   |
|10       |10       |472400   |1.-1-1   |
|10       |10       |472800   |1.-1-1   |
|93       |93       |473200   |1.-1-1   |
|26240    |26250    |472800   |1.-1-1   |
+---------+---------+---------+---------+

I saw something similar here

https://dba.stackexchange.com/questions/73804/how-to-retrieve-closest-value-based-on-look-up-table

but not sure how to apply to my case.

Advertisement

Answer

Join the tables and calculate the differences for column c, then use distinct on to return only one row per (a, c) ordered by the difference.

with joined as (
  select t1.a, t1.c, t1.b as b1, t2.b as b2, t2.b - t1.b as b_diff
    from t1
         join t2 
           on t2.a = t1.a
          and t2.b = t1.b
          and t1.b <= t2.b
)
select distinct on (a, c) b1, b2, a, c
  from joined
 order by a, c, b_diff
;
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement