I’m creating an external table using ‘partitioned by’ but I get the following error:
'Invalid operation: column "#id" duplicated'
I first thought it could have something to do with the character #, but then I tried a different column and I got the same mistake. (In case anyone asks about #, I uploaded the files from a csv to S3 with header including # and I couldn’t change the column name when creating the table to a name without this #, otherwise it would bring null values only.)
CREATE EXTERNAL TABLE schema.table_name ( #id BIGINT, uf varchar(255) ) partitioned by (#id BIGINT, uf varchar(255)) row format delimited fields terminated by ',' stored as parquet location 's3://bucket/folder/'
Advertisement
Answer
Your problem is that you are trying to name a partition with a name that is already being used by a column.
As you can see here on the documentation, that is not allowed:
Create an external table and specify the partition key in the PARTITIONED BY clause. The partition key can’t be the name of a table column. […]
Just rename your partition and it should work.