I have 3 tables and I need to check if they have a positive number in the “rate” column. Tables have different schema but they all have a column named “rate”. I am working with airflow and I created a DAG which has a task to check if the rate in not a positive number then fail. So in a simple word, I want my SQL fails if there is an incorrect rate in any of those tables. I have done this:
WITH t1 AS ( SELECT deliverydate, rate FROM table1 WHERE rate <= 0 ), t2 AS ( SELECT deliverydate, rate FROM table2 WHERE rate <= 0 ), t3 AS ( SELECT deliverydate, rate FROM table3 WHERE rate <= 0 ) SELECT 1 FROM ( SELECT * FROM t1 UNION ALL SELECT * FROM t2 UNION ALL SELECT * FROM t3 )
In this case, t1, t2 and t3 will report rows with wrong rate(0 or negative rates). And Select 1 fails if there is no wrong rate.
But my goal is: pass if all the rates are correct and fail if at least one record has wrong rate and show me that record to fix it.
SQL queries run on bigquery standard
Advertisement
Answer
I would suggest logic like this:
select (case when exists (select 1 from table1 where rate <= 0) then 'FAIL' when exists (select 1 from table2 where rate <= 0) then 'FAIL' when exists (select 1 from table3 where rate <= 0) then 'FAIL' else 'PASS' end)
This stops at the first failure and should be pretty efficient. For added performance, include an index on rate
in each of the subtables.
Some databases require a from
clause, such as from dual
.
I doubt this is the best way to do things, but this should generate an error only when there is an error condition using error()
:
select (case when exists (select 1 from table1 where rate <= 0) then error('Bad rate in table1') when exists (select 1 from table2 where rate <= 0) then error('Bad rate in table2') when exists (select 1 from table3 where rate <= 0) then error('Bad rate in table3') else 'PASS' end)