Skip to content
Advertisement

Checking for different columns OR empty result PostgreSQL

I’m writing some query tests for dbt and I’d like to check that two queries return the same thing and that neither is empty. Since it’s a dbt test, I need to return a row if either of these conditions is false. So far I’ve come up with the following:

with query1 as (
    select id from table1
),
query2 as (
    select id from table2
)
select
    case
        -- if either query returns 0 rows, return 1
        when not exists(select * from query1) OR not exists(select * from query2) then 1

        -- if both queries aren't empty, check for different columns and return their id's
        else COALESCE(query1.id, query2.id) END as id
from query1
FULL JOIN query2
    ON query1.id = query2.id
WHERE query1 IS NULL OR query2 IS NULL

Theoretically, this should return a row with 1 if either query is empty and if neither are then it should return any columns that don’t belong to BOTH queries. When only query1 is empty or only query 2 is empty this works but if both are empty then it returns an empty table instead of 1. My hunch is that it’s something to do with the JOIN but I really don’t know enough about SQL to figure out why that’s affecting the original queries.

EDIT: Adding examples

No different rows

query1:      query2:      result:
|  id   |   |  id   |    |  id   |
|-------|   |-------|    |-------|
|   1   |   |   3   |    |       |
|   3   |   |   1   |

Different rows

query1:      query2:      result:
|  id   |   |  id   |    |  id   |
|-------|   |-------|    |-------|
|   1   |   |   4   |    |   3   |
|   3   |   |   1   |    |   4   |

Either table empty

query1:      query2:      result:
|  id   |   |  id   |    |  id   |
|-------|   |-------|    |-------|
|       |   |       |    |   1   |
|       |   |       |

Advertisement

Answer

I propose that the query returns a value like 0 in case any of the tables is empty because this can be distinguished form a valid id value of 1.

So, use UNION ALL for this case and check with EXISTS if any of the tables is empty:

WITH 
  query1 AS (SELECT id from table1),
  query2 AS (SELECT id from table2)
SELECT 0 id
WHERE (NOT EXISTS (SELECT 1 FROM query1)) OR (NOT EXISTS (SELECT 1 FROM query2))
UNION ALL
SELECT COALESCE(q1.id, q2.id) id
FROM query1 q1 FULL JOIN query2 q2
ON q2.id = q1.id
WHERE (q1.id IS NULL OR q2.id IS NULL)
  AND (EXISTS (SELECT 1 FROM query1)) AND (EXISTS (SELECT 1 FROM query2))

Note that for the sample data you posted you don’t really need the CTEs because you can select directly from the tables.

See the demo.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement