Skip to content
Advertisement

Is there a way to load multiple values of one column into new lines to a single row in sql

This is how my data looks,

This is how data need to be loaded into the table

Actually I have replaced | with n but its not being loaded in the next line instead its just loading by adding spaces. And I will loading data into table using python etl, so even postprocessing i can add. Any suggestions??

Advertisement

Answer

It’s the behaviour of a relational database, that can’t just display emp_id once per group and all its elements as one per line. Changing that way of displaying data is the front end’s privilege and responsibility, not the database’s. So do that in Python.

Having said that, Impala has the SPLIT_PART() string function, which returns the n-th token of a string separated by a separator character that you pass as an argument.

So, cross joining with a series of consecutive integers, and then applying SPLIT_PART(skills,'|',i) does what you need.

Actually, in my (never too) humble opinion, this is what you should do whenever someone throws a file with such an unhappy format at you to load into the database. Always verticalise comma/bar/semicolon/or_whatever separated lists of “values” using the technique below, and store the data verticalised.:

With two bar/comma separated columns, it may look like this:

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement