I am using Spark in databricks for this SQL command. In the input_data table, I have a string for the st column. Here I want to do some calculations of the string length. However, after I assign the length_s alias to the first column, I can not call it in the following columns. SQL engine gives out Column ‘length_s1’ does not exist error message. How can I fix this?
x
CREATE OR REPLACE VIEW content_data (SELECT LENGTH(st) AS length_s, LENGTH(st)-LENGTH(REGEXP_REPLACE(seq,'[AB]','')) AS AB_c,
length_s - LENGTH(REGEXP_REPLACE(seq, '[CD]', '') AS CD_c, CD_c+AB_c AS sum_c
FROM input_data)
Advertisement
Answer
You can’t use aliases in the same select
So do
CREATE OR REPLACE VIEW content_data (
SELECT
LENGTH(st) AS length_s
, LENGTH(st)-LENGTH(REGEXP_REPLACE(seq,'[AB]','')) AS AB_c,
LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[CD]', '') AS CD_c
, LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[CD]', '') + LENGTH(st)-LENGTH(REGEXP_REPLACE(seq,'[AB]','')) AS sum_c
FROM input_data
)