Skip to content
Advertisement

LEFT JOIN by closer value condition

I have this query

SELECT
    loc.proceso,
    loc.codigo_municipio,
    loc.codigo_concejo,
    loc.concejo,
    (CASE
        WHEN loc.poblacion IS NOT NULL THEN loc.poblacion
        ELSE pob.valor
    END) AS poblacion
FROM develop.031401_elecciones_dimension_localizacion_electoral AS loc
LEFT JOIN develop.031401_elecciones_dimension_proceso_electoral AS proc
ON loc.proceso = proc.proceso
LEFT JOIN develop.020101_t05 AS pob
ON loc.codigo_municipio = CAST(pob.cmun AS INT) AND pob.year = proc.anno_eleccion

In the second LEFT JOIN, I would like to change the second condition pob.year = proc.anno_eleccion so that it does not only search for the exact year when joining. Instead, I would like to get the closer year stored in my pob table. For example, the first year stored in pob is 2003, so I want all the entries in loc whose year is lower than 2003 to be matched with that value when performing the join. Also at the inverse, the last year stored in pob is 2020, so I want those entries in loc whose year is 2021 (or even greater), to be matched with the 2020 row from my pob table. When the exact year is contained in pob table, it should be used for the join.

Advertisement

Answer

1. If you want the nearest year to NOW

I don’t think of a direct join but you can try this one by using ROW_NUMBER() function to sort data by year and pick the first result to join: (WHERE rn = 1 picks the first index, so it prevents any duplicate)

LEFT JOIN 
   (SELECT T.* FROM 
(SELECT ROW_NUMBER() OVER (PARTITION BY pob.cmun ORDER BY pob.year DESC) AS rn,
   *
FROM develop.020101_t05) AS T
WHERE rn = 1) AS pob
ON loc.codigo_municipio = CAST(pob.cmun AS INT) AND pob.year = proc.anno_eleccion

2. If you want the nearest year to your data

Even it’s not best practice, you can join your data using comparison operators on join condition. Then, take the difference between two years, sort the difference ascending and pick the first result using ROW_NUMBER() function. See example:

SELECT * FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY a.Id ORDER BY a.Year - b.Year) AS RowNumber,
       a.Id,
       a.Year,
       b.Year,
       a.Year - b.Year AS YearDiff
FROM a
LEFT JOIN b ON a.Id = b.Id AND a.Year >= b.Year) AS T
WHERE RowNumber = 1
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement