Skip to content
Advertisement

Fill NULL and get distinct record

I am ingesting data from two different source (source A and source B). Although data may be coming from two different sources, there is a possibility that an ID for a record may exist in both sources.

The problem is, there are cases when one source is more complete than the other. In the example below, Source A has a NULL for column C, however Source B has a complete record.

I would like to enrich the data where I get a complete record back.

  • Note: the ‘Source’ field below is just for illustration purposes, and not an actual field in the solution.
ID A B C D Source
asc123 1 2 NULL 4 Source A
asc123 1 2 3 4 Source B

The end result should be as follows:

ID A B C D
asc123 1 2 3 4

I have attempting something like this, but have not been successful:

SELECT  ID
       ,A
       ,B
       ,COALESCE(A.C, B.C) AS C
       ,D
       ,Source
    FROM(
    SELECT * FROM (
        SELECT 'asc123' AS ID, 1 AS 'A', 2 AS 'B', NULL AS 'C', 4 AS 'D', 'Source A' AS 'Source'
        ) A
    UNION

    SELECT * FROM (
        SELECT 'asc123' AS ID, 1 AS 'A', 2 AS 'B', 3 AS 'C', 4 AS 'D', 'Source B' AS 'Source'
        ) B
    ) X

Advertisement

Answer

You can just aggregate to remove the NULL values:

with sampledata as (
    select 'asc123' AS ID, 1 AS A, 2 AS B, NULL AS C, 4 as D, 'Source A' AS Source 
    union all
    select 'asc123' AS ID, 1 AS A, 2 AS B, 3 AS C, 4 AS D, 'Source B' AS Source
)
select Id,
  max(A)A, Max(B)B, Max(C)C, Max(D)D 
from sampledata
group by Id

Note also that for column aliases, if you need to delimit them use [] or "", single quotes are for literal values.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement