I have a Sales
table and a Period
table.
Sales table
+--------------+-------------+ | Country_Code | Period_Code | +--------------+-------------+ | CH | MAI_18 | | CH | JUN_18 | | NO | 2020-01-21 | | NO | 2020-01-21 | +--------------+-------------+
Period table
+--------------+-------------+ | Country_Code | Period_Code | +--------------+-------------+ | NO | 200121 | | NO | 200122 | +--------------+-------------+
I am getting the below error while I join these 2 tables.
Below is the sql I used. As per my understanding, since I have given the country filter as NO
in the 2nd line, it should first execute the 2nd line and then do the join only on NO
country. But internally it is considering the CH
country also which is causing to have the below sql fail.
SELECT DISTINCT SAL.COUNTRY_CODE,PER.PERIOD_CODE FROM (SELECT * FROM MYSCHEMA.SALES WHERE COUNTRY_CODE in('NO')) SAL JOIN MYSCHEMA.PERIOD PER ON SAL.COUNTRY_CODE=PER.COUNTRY_CODE AND TO_CHAR(TO_DATE(SAL.PERIOD_CODE,'YYYY-MM-DD'),'YYMMDD') = PER.PERIOD_CODE ORDER BY 1
I used the CTE
also which is behaving the same way. There could be some solution for this. I would really appreciate if some one can help on this to have the sql run without any errors.
Advertisement
Answer
Snowflake explicitly does transforms earlier than the SQL specification states, thus the transform of you text/variant data is happen before you expect. I have raised this issue with them in 2016, while at times they fix/alter some of the behavior, it can also change and thus start breaking.
In cases where the contents of the column is not all valid, which is your situation you should use the TRY_
version of commands, thus here use TRY_TO_DATE to avoid the error.
thus you should find this works:
SELECT DISTINCT SAL.COUNTRY_CODE,PER.PERIOD_CODE FROM (SELECT * FROM MYSCHEMA.SALES WHERE COUNTRY_CODE in('NO')) SAL JOIN MYSCHEMA.PERIOD PER ON SAL.COUNTRY_CODE=PER.COUNTRY_CODE AND TO_CHAR(TRY_TO_DATE(SAL.PERIOD_CODE,'YYYY-MM-DD'),'YYMMDD') = PER.PERIOD_CODE ORDER BY 1
I would tend to move the date transform into the sub-select (or CTE if you where using one of those) to make the join simpler, albeit, Snowflake do this for you, thus I would write it like:
SELECT DISTINCT sal.country_code, per.period_code FROM ( SELECT country_code, TO_CHAR(TRY_TO_DATE(SAL.PERIOD_CODE,'YYYY-MM-DD'),'YYMMDD') AS join_code FROM myschema.sales WHERE country_code in ('NO') ) sal JOIN myschema.period per ON sal.country_code = per.country_code AND sal.join_code = per.period_code ORDER BY 1
But given your example table, which I assume do not hold all the columns, just the needed to reproduce the problem, which should be able to be triggered via:
SELECT country_code, TO_CHAR(TRY_TO_DATE(SAL.PERIOD_CODE,'YYYY-MM-DD'),'YYMMDD') AS join_code FROM myschema.sales WHERE country_code in ('NO')