Skip to content
Advertisement

How to use RegEx in the SQL function CHARINDEX to find the nth occurrence

The intent of the following algorithm is to extract certain set of fields from a key (in this example, extract first 2 fields), and it works. The fields are delimited by a colon:

declare @key nvarchar (max);
declare @pos int;
declare @fields nvarchar (max);

set @key = 'Field-1:Field-2:Field-3:Field-4:Field-5';
set @pos = charindex(':', @key, charindex (':', @key) + 1);
set @fields = left(@key, @pos - 1);

select @fields;

Result: Field-1:Field-2

Microsoft document says that the first parameter is an expression, but I think what they mean by that in the context of CHARINDEX is, that this expression should evaluate to a string literal; hence the following attempt to pass a RegEx to get the 2nd occurrence doesn’t work; obviously either it is not supported or I am using a bad syntax:

--match the second occurrence of the delimiter using RegEx
set @pos = charindex (':.*?(:)', @key);

In other words, is it possible to find the position of the nth occurrence of the delimiter in a given text using RegEx, so that I could avoid several nested CHARINDEX or a loop to parse? Keeping aside, if n is passed as a parameter, then I can’t even use static nesting anymore…

Thanks in advance for the help.

Environment: Microsoft SQL Server 2014 (SP3) Standard Edition (64-bit)

Advertisement

Answer

There is no easy way, only tricks, to extract nth substring of a string. Below is a set based, recursive CTE approach:

DECLARE @str NVARCHAR(MAX) = N'Field-1:Field-2:Field-3:Field-4:Field-5';
DECLARE @num INT = 4;

WITH rcte AS (
    SELECT str = @str
         , n = 1
         , p = CHARINDEX(':', @str, 1)
    UNION ALL
    SELECT str
         , n + 1
         , CHARINDEX(':', str, p + 1)
    FROM rcte
    WHERE n < @num AND p > 0
)
SELECT CASE WHEN p > 0 THEN SUBSTRING(str, 1, p - 1) ELSE str END
FROM rcte
WHERE n = @num;

If loop is an option then:

DECLARE @str NVARCHAR(MAX) = N'Field-1:Field-2:Field-3:Field-4:Field-5';
DECLARE @num INT = 4;
DECLARE @n INT = 0;
DECLARE @p INT = 0;

WHILE 1 = 1
BEGIN
    SET @n = @n + 1;
    SET @p = CHARINDEX(':', @str, @p + 1);
    IF @n = @num OR @p = 0 BREAK;
END;

SELECT CASE WHEN @p > 0 THEN SUBSTRING(@str, 1, @p - 1) ELSE @str END;

DB<>Fiddle

1 People found this is helpful
Advertisement