Skip to content
Advertisement

Match specific or default value on multiple columns

raw_data :

name account_id type element_id cost
First 1 type1 element1 0.1
Second 2 type2 element2 0.2
First 11 type2 element11 0.11

components:

name account_id (default = -1) type (default = null) element_id (default = null) cost
First -1 null null 0.1
Second 2 type2 null 0.2
First 11 type2 element11 0.11

I seek to check whether the cost logged in raw_data is the same as that in components for a given combination. They need to be joined on column name. Remaining fields in raw_data are always populated. In components, any row can be a combination of specific values and the default values.

I seek to match the columns from raw_data to components wherever I find a match and otherwise need to use the default value to get the cost. I failed with left join and union and IN.

E.g. For the first row in raw_data table with name “First”, I do not have account_id = 1 in the components table. So I need to go with account_id = -1.

Match as many specific values as found in components, Otherwise resort to default values.

Advertisement

Answer

I think one way you could do this is something like:

SELECT *
FROM
(
SELECT rd.name, rd.account_id, rd.type, rd.element_id, rd.cost raw_cost, c.account_id component_account_id, c.type component_type, c.element_id component_element_id, c.cost component_cost,
  row_number() OVER (PARTITION BY rd.name, rd.account_id, rd.type, rd.element_id
   ORDER BY 
    CASE WHEN c.account_id <> -1 THEN 1 END
    + CASE WHEN c.type IS NOT NULL THEN 1 END
    + CASE WHEN c.element_id IS NOT NULL THEN 1 END DESC) rd
FROM raw_data rd LEFT OUTER JOIN components c
  ON rd.name = c.name
   AND (rd.account_id = c.account_id or c.account_id = -1)
   AND (rd.type = c.type OR c.type IS NULL)
   AND (rd.element_id = c.element_id OR c.element_id IS NULL)
) iq
WHERE rd = 1

The idea here is to match on an actual match or the default. Then the row_number window function is used to prioritize the matches based on a count of how many columns actually matched (you said you don’t care about ties, so this doesn’t handle that). The outer query throws away the matches that aren’t the best.

With the sample data above, this could be an inner join, but I left it as a left join since that’s what was mentioned.

Here’s a fiddle of it working. Hopefully this is close to what you want.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement