I have a task to create a table function that receives a string and sorts it into different columns. The string itself may vary, but the columns are supposed to be the same always.
The string is: '100^TEst¬200^TEst2¬300^Test3'
, but for example if I add “¬400^Test4” that should be in the result set as well.
Here is what I’ve managed to do so far.
if object_id('stringSplit1') is not null drop function stringSplit1 go create function stringSplit1(@input varchar(1000)) returns @outputtable table (ord varchar(1000), dta_1 varchar(1000), dta_2 varchar(1000)) as begin return end go select [1] as dta_1, [2] as dta_2 from ( select row_number() over(partition by id order by (select null)) as rn, value from (select value as id from string_split('100^TEst¬200^TEst2¬300^Test3', '¬')) as bo cross apply string_split(id, '^') as bk) t pivot( max([value]) for [rn] in ([1], [2]) ) as pvt
Advertisement
Answer
Firstly, although you can use STRING_SPLIT
the order of the value returned by it is not guaranteed. As you define the length of your parameter as 1000
in your function stringSplit1
this means we can safely use a function like DelimitedSplit8K_LEAD
which does guarantee the order by providing the ordinal position of each value. This means we can do something like this:
SELECT SSr.ItemNumber AS Ord, MAX(CASE SSc.ItemNumber WHEN 1 THEN SSc.Item END) AS dta_1, MAX(CASE SSc.ItemNumber WHEN 2 THEN SSc.Item END) AS dta_2 FROM (VALUES('100^TEst¬200^TEst2¬300^Test3'))V(YourColumn) CROSS APPLY dbo.DelimitedSplit8K_LEAD(V.YourColumn,'¬') SSr CROSS APPLY dbo.DelimitedSplit8K_LEAD(SSr.Item,'^') SSc GROUP BY SSr.ItemNumber;
If you want this as a functionm, use an inline table value function, not a multi-line TVF like you have done:
CREATE FUNCTION dbo.DatasetSplit (@Input varchar(8000)) RETURNS table AS RETURN SELECT SSr.ItemNumber AS Ord, MAX(CASE SSc.ItemNumber WHEN 1 THEN SSc.Item END) AS dta_1, MAX(CASE SSc.ItemNumber WHEN 2 THEN SSc.Item END) AS dta_2 FROM (VALUES(@Input))V(YourColumn) CROSS APPLY dbo.DelimitedSplit8K_LEAD(V.YourColumn,'¬') SSr CROSS APPLY dbo.DelimitedSplit8K_LEAD(SSr.Item,'^') SSc GROUP BY SSr.ItemNumber; GO SELECT * FROM dbo.DatasetSplit ('100^TEst¬200^TEst2¬300^Test3') DS;
If you want to parametrise your row and column delimiters, then you could do it like this:
ALTER FUNCTION dbo.DatasetSplit (@Input varchar(8000), @RowDelimiter char(1), @ColumnDelimiter char(1)) RETURNS table AS RETURN SELECT SSr.ItemNumber AS Ord, MAX(CASE SSc.ItemNumber WHEN 1 THEN SSc.Item END) AS dta_1, MAX(CASE SSc.ItemNumber WHEN 2 THEN SSc.Item END) AS dta_2 FROM (VALUES(@Input))V(YourColumn) CROSS APPLY dbo.DelimitedSplit8K_LEAD(V.YourColumn,@RowDelimiter) SSr CROSS APPLY dbo.DelimitedSplit8K_LEAD(SSr.Item,@ColumnDelimiter) SSc GROUP BY SSr.ItemNumber; GO SELECT * FROM dbo.DatasetSplit ('100^TEst¬200^TEst2¬300^Test3','¬','^') DS;
Definition of DelimitedSplit8K_LEAD
for completeness:
CREATE FUNCTION [dbo].[DelimitedSplit8K_LEAD] --===== Define I/O parameters (@pString VARCHAR(8000), @pDelimiter CHAR(1)) RETURNS TABLE WITH SCHEMABINDING AS RETURN --===== "Inline" CTE Driven "Tally Table” produces values from 0 up to 10,000... -- enough to cover VARCHAR(8000) WITH E1(N) AS ( SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 ), --10E+1 or 10 rows E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front -- for both a performance gain and prevention of accidental "overruns" SELECT 0 UNION ALL SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4 ), cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter) SELECT t.N+1 FROM cteTally t WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0) ) --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found. SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1), Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000)) FROM cteStart s; GO