Skip to content
Advertisement

Using SSIS OR T-SQL Split a column of quoted & unquoted comma separated values into multiple columns

I have comma separated data in a column named C0.
The data in C0 looks like this:

C0
“Pacey LLC.”,213830ZZ,11/1/2017,11/1/2017,”297,311.74″,”2,371.40″,0.00,”1,325.18″,0.00,42.22,”123,986.56″
Mike The Miker,9814140VCD,12/1/2018,12/1/2018,”3,917,751.99″,”419,743.54″,”36,642.66″,”344,090.43″,0.00,10.00,”2,434,671.06″

And I want it to end up like this:

F1 F1 F3 F4 F5 F6 F7 F8 F9 F10 F11
“Pacey LLC.” 213830ZZ 11/1/2017 11/1/2017 297,311.74 2,371.40 0.00 1,325.18 0.00 42.22 123,986.56
Mike The Miker 9814140VCD 12/1/2018 12/1/2018 3,917,751.99 419,743.54 36,642.66 344,090.43 0.00 10.00 2,434,671.06

I’ve tried nested replaces, but couldn’t find a pattern to reliably search without regex which is T/SQL? I’ve also tried a TOKEN approach in SSIS by this feller, but neither fruitful.

The nested replace approaches got stuck on the money fields that are under 1,000 (like 0.00) and the SSIS TOKEN approach presumes all fields are quote delimited, which in my example they aren’t.

Advertisement

Answer

As you were told already, TSQL is the wrong tool for this. Nevertheless this can be done (at least for the set given). If this is a one-time action you might give it a try. If this is a re-occurring task in a real-life scenario I’d try to get the data in an appropriate format.

However, this would work for the given lines:

–Your data includes dates in a culture specific format (something really! bad)
–Better switch to ISO8601
–Setting the date format will help, but is NOT recommended

–the first cte will use APPLY together with a computed TOP()
–This will allow to get each single character, one by one.

–we continue with a recursive cte
–it will run through the string and find if we are within a quoted area or not

–this CTE performs a trick with TOP 1 WITH TIES together with ORDER BY a partitioned ROW_NUMBER()
–The result will include the final string of the recursion by ID

–The final SELECT uses a trick to split strings position- and type-safe

The result

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement