I am using Spark in databricks for this SQL command. In the input_data table, I have a string for the st column. Here I want to do some calculations of the string length. However, after I assign the length_s alias to the first column, I can not call it in the following columns. SQL engine gives out Column ‘length_s1’ does not exist error message. How can I fix this?
CREATE OR REPLACE VIEW content_data (SELECT LENGTH(st) AS length_s, LENGTH(st)-LENGTH(REGEXP_REPLACE(seq,'[AB]','')) AS AB_c, length_s - LENGTH(REGEXP_REPLACE(seq, '[CD]', '') AS CD_c, CD_c+AB_c AS sum_c FROM input_data)
Advertisement
Answer
You can’t use aliases in the same select
So do
CREATE OR REPLACE VIEW content_data ( SELECT LENGTH(st) AS length_s , LENGTH(st)-LENGTH(REGEXP_REPLACE(seq,'[AB]','')) AS AB_c, LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[CD]', '') AS CD_c , LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[CD]', '') + LENGTH(st)-LENGTH(REGEXP_REPLACE(seq,'[AB]','')) AS sum_c FROM input_data )