HIVE-SQL_SERVER: HadoopExecutionException: Not enough columns in this line

I have a Hive table with the following structure and data:

Table structure:

CREATE EXTERNAL TABLE IF NOT EXISTS db_crprcdtl.shcar_dtls (
  ID     string,
  CSK    string,
  BRND   string,
  MKTCP  string,
  AMTCMP string,
  AMTSP  string,
  RLBRND string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/on/hadoop/dir/';


-------------------------------------------------------------------------------
ID  |     CSK |          BRND |          MKTCP |            AMTCMP 
-------------------------------------------------------------------------------
782    flatn,grpl,mrtn  hnd,mrc,nsn    34555,56566,66455     38900,59484,71450   
1231   jikl,bngr        su,mrc,frd     56566,32333,45000     59872,35673,48933 
123    unsrvl           tyt,frd,vlv    25000,34789,33443     29892,38922,36781 
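
For context, assuming the file matches the DDL above, each raw line on HDFS is tab-delimited between fields, with commas appearing only inside a field. A line for the first row would look roughly like this (<TAB> marks a literal tab character; the AMTSP and RLBRND values aren't shown in the sample above):

782<TAB>flatn,grpl,mrtn<TAB>hnd,mrc,nsn<TAB>34555,56566,66455<TAB>38900,59484,71450<TAB>...<TAB>...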

I'm trying to push this data into SQL Server, but while doing so I get the following error message:

SQL Error [107090] [S0001]: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Not enough columns in this line. 

What I tried:

There's an online article where the author documents similar kinds of issues. I tried one of the suggestions, "Looked in Excel and found two columns that had carriage returns", but that didn't help either.

Any suggestion/help would be really appreciated. Thanks


Answer

If I understand your issue correctly, your comma-separated data is being split into multiple columns rather than landing in a single column on SQL Server, something like:

------------------------------
ID  |CSK |BRND |MKTCP |AMTCMP 
------------------------------
782    flatn grpl mrtn  hnd mrc nsn  34555 56566 66455   38900 59484 71450   
1231   jikl bngr        su  mrc frd  56566 32333 45000   59872 35673 48933 
123    unsrvl           tyt frd vlv  25000 34789 33443   29892 38922 36781 

So, if you look at Hive, only 5 columns are shown, and I presume the SQL Server table has the same (you haven't shared its schema). If that's the case, more than 5 values are being passed per line while the schema defines only 5 columns, and that's what raises the error.

Refer to this document by MS and try to create a FILE_FORMAT with FIELD_TERMINATOR = '\t', like:

CREATE EXTERNAL FILE FORMAT <name>
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = '\t'
        -- other optional format options from the same syntax:
        -- | STRING_DELIMITER = string_delimiter
        -- | FIRST_ROW = integer          -- ONLY AVAILABLE IN SQL DW
        -- | DATE_FORMAT = datetime_format
        -- | USE_TYPE_DEFAULT = { TRUE | FALSE }
        -- | ENCODING = {'UTF8' | 'UTF16'}
    )
);
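
For completeness, here's a minimal sketch of how a PolyBase external table could reference such a file format (the file format, data source, and table names below are hypothetical; adjust them to your environment):

CREATE EXTERNAL FILE FORMAT tab_delimited_ff        -- hypothetical name
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS ( FIELD_TERMINATOR = '\t' )
);

CREATE EXTERNAL TABLE dbo.shcar_dtls_ext (          -- hypothetical name
    ID     VARCHAR(50),
    CSK    VARCHAR(500),
    BRND   VARCHAR(500),
    MKTCP  VARCHAR(500),
    AMTCMP VARCHAR(500),
    AMTSP  VARCHAR(500),
    RLBRND VARCHAR(500)
)
WITH (
    LOCATION    = '/on/hadoop/dir/',
    DATA_SOURCE = my_hadoop_ds,                     -- hypothetical, must already exist
    FILE_FORMAT = tab_delimited_ff
);

With the tab terminator in place, each comma-separated list should land intact in its own single column instead of spilling across several.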

Hope that helps to resolve your issue 🙂
