In Databricks I am reading a SQL table as:
val TransformationRules = spark.read
  .jdbc(jdbcUrl, "ADF.TransformationRules", connectionProperties)
  .select("RuleCode", "SourceSystem", "PrimaryTable", "PrimaryColumn",
    "SecondaryColumn", "NewColumnName", "CurrentFlag")
  .where("SourceSystem = 'QWDS' AND RuleCode = 'STD00003'")
How can I parameterize SourceSystem and RuleCode in the where clause?
I was referring to: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/sql-databases
Answer
If you import the Spark implicits, you can create references to columns with the dollar ($) interpolator. You can then express the filter logic with the Column API, which looks something like this:
val sourceSystem = "QWDS"
val ruleCode = "STD00003"

import spark.implicits._
import org.apache.spark.sql.Column

val TransformationRules = spark.read
  .jdbc(jdbcUrl, "ADF.TransformationRules", connectionProperties)
  .select("RuleCode", "SourceSystem", "PrimaryTable", "PrimaryColumn",
    "SecondaryColumn", "NewColumnName", "CurrentFlag")
  .where($"SourceSystem" === sourceSystem && $"RuleCode" === ruleCode)

// $"SourceSystem" is just sugar for a Column reference
val ssColumn: Column = $"SourceSystem"
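If you would rather keep the SQL-style string in where, plain Scala string interpolation also works. A minimal sketch, reusing the same variables; note that this splices raw text into a SQL expression, so no escaping is done for you:

val TransformationRules = spark.read
  .jdbc(jdbcUrl, "ADF.TransformationRules", connectionProperties)
  .select("RuleCode", "SourceSystem", "PrimaryTable", "PrimaryColumn",
    "SecondaryColumn", "NewColumnName", "CurrentFlag")
  // s-interpolation builds the predicate string; fine for trusted literals
  .where(s"SourceSystem = '$sourceSystem' AND RuleCode = '$ruleCode'")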
As you can see, the dollar sign gives you a Column object, which supports operations such as comparison, casting, renaming, etc. In combination with the functions in org.apache.spark.sql.functions, it will let you implement almost anything you need.
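For example, here is a small sketch (assuming the TransformationRules DataFrame from above) mixing $ columns with a few built-in functions; the derived column names are just illustrative:

import org.apache.spark.sql.functions.{upper, trim, concat_ws}

val enriched = TransformationRules
  // normalize the source system code with built-in column functions
  .withColumn("SourceSystemUpper", upper(trim($"SourceSystem")))
  // derive a composite key from two existing columns
  .withColumn("RuleKey", concat_ws("-", $"SourceSystem", $"RuleCode"))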