I am using AWS Athena, so functions are a bit limiting. But essentially I want to extract the first 5 consecutive and sequential numbers from a alphanumeric field.
From the first example, you can see it ignores the first 1 because there aren’t 4 trailing numbers. I want to find and extract the first 5 numbers that are given together from this field. The output field is what I am hoping to achieve.
Advertisement
Answer
This will find an exact sequence of 5 digits.
a sequence of less or more than 5 digits will be ignored.
^|D = Indication for the start of the text OR a non-digit character d{5} = 5 digits D|$ = A non-digit character OR indication for the end of the text
with t (Example) as (values ('Ex/l/10345/Pl'), ('Ex/23453PlWL'), ('ID09456//')) select Example, regexp_extract(Example, '(^|D)(d{5})(D|$)', 2) as Output from t
+---------------+--------+ | Example | Output | +---------------+--------+ | Ex/l/10345/Pl | 10345 | | Ex/23453PlWL | 23453 | | ID09456// | 09456 | +---------------+--------+