Oracle SQL regexp_substr number extraction behavior

Question

In a sense I've answered my own question, but I'm trying to understand the answer better: when using regexp_substr (in Oracle) to extract the first occurrence of a number (either single- or multi-digit), how and why do the quantifiers * and + affect the results? Why does + provide the behavior I'm looking for while * does not? * is my

Accepted Answer

So the regexp_count indicates there are FOUR substrings that match the \d* pattern. For 'W 123', the third of those is the '123'. The implication is that the first and second are derived from the W and the space: each one is a zero-length result that 'consumes' one character of the source string.

    select test,
           regexp_count(TEST, '\d*') Pattern2_c,
           regexp_substr(TEST, '\d*') Pattern2,
           regexp_substr(TEST, '\d*', 1, 1) Pattern2_1,
           regexp_substr(TEST, '\d*', 1, 2) Pattern2_2,
           regexp_substr(TEST, '\d*', 1, 3) Pattern2_3,
           regexp_substr(TEST, '\d*', 1, 4) Pattern2_4
    from (select '123 W' TEST from dual
          union
          select 'W 123' TEST from dual
         );

Oracle treats a zero-length string as NULL, so those zero-length matches come back as NULL rather than as an empty string.

The result doesn't "feel" right, but then if you ask a computer deep philosophical questions about how many zero-length substrings are contained in a string, I wouldn't bet on any answer.
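To tie this back to the original question, here is a minimal sketch that puts the two quantifiers side by side. The column aliases star_match and plus_match are made up for illustration, and the test strings are the same two used above. The point is that \d* is allowed to match zero digits, so it succeeds immediately at position 1 even when the string starts with a non-digit, whereas \d+ has to keep scanning until it finds at least one digit.

```sql
-- Illustrative comparison; only the quantifier differs between the two columns.
select test,
       regexp_substr(test, '\d*') as star_match,  -- zero or more digits: may match nothing
       regexp_substr(test, '\d+') as plus_match   -- one or more digits: needs a real digit
from (select '123 W' as test from dual
      union all
      select 'W 123' as test from dual);
-- For 'W 123', star_match is NULL (a zero-length match at position 1)
-- and plus_match is '123'. For '123 W', both columns return '123'.
```

This is why + gives the "first number in the string" behavior: it cannot be satisfied by an empty match, so the first match is always an actual run of digits.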

Answer