
Spark SQL column doesn’t exist

I am using Spark SQL on Databricks for this command. In the input_data table the st column holds a string, and I want to do some calculations based on its length. However, after I assign the alias length_s to the first column, I cannot reference it in the following columns: the SQL engine returns a Column ‘length_s’ does not exist error. How can I fix this?

CREATE OR REPLACE VIEW content_data AS
SELECT
    LENGTH(st) AS length_s,
    LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[AB]', '')) AS AB_c,
    length_s - LENGTH(REGEXP_REPLACE(seq, '[CD]', '')) AS CD_c,
    CD_c + AB_c AS sum_c
FROM input_data


Answer

You can’t reference a column alias in the same SELECT list where it is defined, so each expression has to be written out in full.

So do

CREATE OR REPLACE VIEW content_data AS
SELECT
    LENGTH(st) AS length_s,
    LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[AB]', '')) AS AB_c,
    LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[CD]', '')) AS CD_c,
    (LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[CD]', ''))) + (LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[AB]', ''))) AS sum_c
FROM input_data
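
If you don’t want to repeat the LENGTH/REGEXP_REPLACE expressions, another option is to compute the base counts in an inner query and reference the aliases one level up, since aliases from an inner SELECT are ordinary columns to the outer one. A minimal sketch, assuming the same input_data table and st/seq columns as above:

CREATE OR REPLACE VIEW content_data AS
SELECT
    length_s,
    AB_c,
    CD_c,
    AB_c + CD_c AS sum_c
FROM (
    -- compute the raw counts once; these aliases are visible to the outer SELECT
    SELECT
        LENGTH(st) AS length_s,
        LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[AB]', '')) AS AB_c,
        LENGTH(st) - LENGTH(REGEXP_REPLACE(seq, '[CD]', '')) AS CD_c
    FROM input_data
) counts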