I have a csv file on HDFS and I am trying to create an impala table , the situation is it created the table and values with all the “
CREATE external TABLE abc.def ( name STRING, title STRING, last STRING, pno STRING ) row format delimited fields terminated by ',' location 'hdfs:pathlocation' tblproperties ("skip.header.line.count"="1") ;
The output is
name tile last pno
“abc” “mr” “xyz” “1234”
“rew” “ms” “pre” “654”
I just want to create table from csv file without quotes. Please guide where I am going wrong. Regards, R
Advertisement
Answer
A way to do that is creating a stage table that load the file with quotes and then with CTAS (Create table as select) create the right table cleaning the fields with replace function. As an example
CREATE TABLE quote_stage( id STRING, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
+-----+----------+ | id | name | +-----+----------+ | "1" | "pepe" | | "2" | "ana" | | "3" | "maria" | | "4" | "ramon" | | "5" | "lucia" | | "6" | "carmen" | | "7" | "alicia" | | "8" | "pedro" | +-----+----------+
CREATE TABLE t_quote ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE AS SELECT replace(id,'"','') AS id, replace(name,'"','') AS name FROM quote_stage;
+----+--------+ | id | name | +----+--------+ | 1 | pepe | | 2 | ana | | 3 | maria | | 4 | ramon | | 5 | lucia | | 6 | carmen | | 7 | alicia | | 8 | pedro | +----+--------+
Hope this helps.