|
Consider the following REGEXP_INSTR to find strings that contain (say) a
contract number that should be 7 digits long, starting with a 3, and which
can have non-digit prefixes or suffixes.
with foo(bar) as (
values('22123453 '),
('32123456 '),
('#2212345@'),
(' A2212345'))
SELECT foo.bar
FROM
foo
WHERE
regexp_instr(foo.bar,'(?<!\d)2\d{6}(?!\d)') > 0
You will correctly get two out of the four records, with the first omitted
because of the trailing 3, and the second one omitted because of the
leading 3. Note especially that you get the last record even though the
number is right at the end (i.e. it's not followed by anything).
So the negative look-ahead and negative look-behind both work as expected:
"My 2nnnnnn number must not have a number in front of or behind it (and
that's still true if it has NOTHING in front of or behind it)."
However, if you attempt to extract the numbers, like so (adding only a
REGEXP_SUBSTR):
with foo(bar) as (
values('22123453 '),
('32123456 '),
('#2212345@'),
(' A2212345'))
SELECT foo.bar,
regexp_substr(foo.bar,'(?<!\d)2\d{6}(?!\d)')
FROM
foo
WHERE
regexp_instr(foo.bar,'(?<!\d)2\d{6}(?!\d)') > 0
You will get nothing.
If you take the negative look-ahead out, the REGEXP_SUBSTR will work, so
it's not a Show Stopper, but I'm puzzled. I'd like to understand why they
don't work the same way.
--
Arnie
As an Amazon Associate we earn from qualifying purchases.
This mailing list archive is Copyright 1997-2025 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.