A colleague of mines encountered this problem while working on a COBOL program and eventually solved it at the application level. Still I am curious if it is possible to solve it on the data access level, with SQL. This is somehow related to this other question, but I’d like to use only ANSI SQL.
I’m looking for a single SQL select query that acts on a VARCHAR field that contains variable length CSV rows. The purpose of the query is to split every CSV field in its own result set row.
Here is an example with schema and data (and here is the fiddle):
CREATE TABLE table1 (`field` varchar(100)); INSERT INTO table1 (`field`) VALUES ('Hello,world,!') , ('Haloa,!') , ('Have,a,nice,day,!');
Here is the output I’d like to have from the query:
Hello world ! Haloa ! Have a nice day !
The CSV separator used is the comma, and for now I wouldn’t worry about escaping.
Advertisement
Answer
As far as I can tell, this is ANSI SQL:
with recursive word_list (field, word, rest, field_id, level) as ( select field, substring(field from 1 for position(',' in field) - 1) as word, substring(field from position(',' in field) + 1) as rest, row_number() over () as field_id, 1 from table1 union all select c.field, case when position(',' in p.rest) = 0 then p.rest else substring(p.rest from 1 for position(',' in p.rest) - 1) end as word, case when position(',' in p.rest) = 0 then null else substring(p.rest from position(',' in p.rest) + 1) end as rest, p.field_id, p.level + 1 from table1 as c join word_list p on c.field = p.field and position(',' in p.rest) >= 0 ) select word from word_list order by field_id, level;
This assumes that the values in field
are unique.
Here is a running example