Skip to content
Advertisement

TSQL – New column value based on other columns with highest level of match

So I’ve got a mapping table with the following information:

Number      | FCODE     |  CCODE
---------------------------------------------

0*********  |   12345   |   1

01********  |   12345   |   2

012*******  |   *****   |   3

012*******  |   12345   |   4

0454******  |   12345   |   5

0123******  |   12345   |   6 

And another fact table with the following info:

Number     | FCODE
-------------------------------
0123456789 | 12345
0123456789 | 545243
0129456789 | 545243
0129456789 | 12345

Ideally I should add CCode in the fact table with the highest matching value of Number (first) & FCODE (second)

So for example the first record: Number: 0123456789 FCode: 12345 should result to have value CCode 6 to be added in the fact table

If number is 0129456789 & FCode 12345, CCode should become 4.

I tried working with a RANK function, but I’m missing something but cannot see what

Here is the SQL query I have, but does not do what I expected..

SELECT ltd.Number, ltd.FCode, td.CCode
FROM dbo.Test_Fact ltd LEFT JOIN
    (SELECT *, MYRANK = RANK() OVER ( ORDER BY Number DESC)
    FROM dbo.Test_Mapping) td
    ON SUBSTRING(ltd.Number, 1, LEN(LEFT(td.Number, charindex('*', td.Number)-1))) = LEFT(td.Number, charindex('*', td.Number)-1)
ORDER BY td.MYRANK

Any support is appreciated. Thanks in advance!

Answer

You can use values from the mapping table as regular expression patterns to match against values in a fact table.

And if i understand correctly, you need to get maximum CCODE for both matches mapping.Number with fact.Number and mapping.FCode with fact.FCode.

If so, you can select TOP 1 of matching rows from mapping table sorted by CCODE in descending order.

To get a final result you can use OUTER APPLY like this

WITH CTE AS (
    SELECT 
        ccode, 
        (replace(Number, '*', '') + '%') as number_pattern,
        (replace(fcode, '*', '') + '%') as fcode_pattern
    FROM mappings
)
SELECT
    f.Number,
    f.FCode,
    m.CCode
FROM 
    fact f 
OUTER APPLY (
    SELECT TOP 1 * 
    FROM CTE m
    WHERE f.Number LIKE m.number_pattern 
          AND f.fcode LIKE m.fcode_pattern
    ORDER BY m.ccode DESC
) m

Please, check a demo.

Result:

Number     | FCode  | CCode
:--------- | :----- | ----:
0123456789 | 12345  |     6
0123456789 | 545243 |     3
0129456789 | 545243 |     3
0129456789 | 12345  |     4

As per @HABO comment, to retain pattern’s length and symbol order, it’s more correct to use an underscore to replace an asterisk in the pattern. And then query would be like this

WITH CTE AS (
    SELECT 
        ccode, 
        replace(Number, '*', '_') as number_pattern,
        replace(fcode, '*', '_') as fcode_pattern
    FROM mappings
)
SELECT
    f.Number,
    f.FCode,
    m.CCode
FROM 
    fact f 
OUTER APPLY (
    SELECT TOP 1 * 
    FROM CTE m
    WHERE f.Number LIKE m.number_pattern 
          AND f.fcode LIKE m.fcode_pattern
    ORDER BY m.ccode DESC
) m

Working demo.

This query produces NULL as CCode for second and third rows in fact table, because there is no any match in the mappings table.

Number     | FCode  | CCode
:--------- | :----- | ----:
0123456789 | 12345  |     6
0123456789 | 545243 |  null
0129456789 | 545243 |  null
0129456789 | 12345  |     4