× The internal search function is temporarily non-functional. The current search engine is no longer viable and we are researching alternatives.
As a stop gap measure, we are using Google's custom search engine service.
If you know of an easy to use, open source, search engine ... please contact support@midrange.com.



Hi again

I wasn't really happy with my "partial" solution, and probably some of you
weren't either. Were you losing sleep over it? Perhaps getting into fights
at work, or turning to drink or drugs; seeing your relationships sour;
losing your job and turning to crime ...?

I don't want to be responsible for that sort of suffering, so I kept
researching and experimenting. I wanted the same pattern expression for the
SUBSTR as for the INSTR, and now I've managed it.

What I was overlooking was the extra parameters - specifically the 'group'.
Turns out there are two gotchas to be aware of:

1. If you don't specify the group you get the 0th group (the entire
string) not the 1st by default.
2. Although they are surrounded by parentheses, the look-arounds aren't
counted as groups.

Probably there are none of you are stupid enough to make mistake #2, but I
mention it just in case there's someone else like me out there.

So, the upshot is that I can bring the look-arounds back in, as long as I
specify that the first group (only) is to be returned. The following
extracts from field $str the nth instance (set to 3rd) of any number in the
three ranges I cited in my earlier example, regardless of where they are in
the string (beginning/middle/end) or what non-numeric character may precede
or follow them:

select
regexp_substr($str,
'(?<!\d)(1\d{6}|[2-3]\d{5}|4\d{4})(?!\d)', 1, 3, 'i', 1 ),
regexp_instr($str,
'(?<!\d)(1\d{6}|[2-3]\d{5}|4\d{4})(?!\d)', 1, 3 )
from (values('41234BK,212345 & 212349,44321 43210')) x($str)

Obviously I don't need the extra parms for the INSTR as it's not returning
anything.

--
Arnie



On Sun, Aug 22, 2021 at 1:50 AM Arnie Flangehead <arnie.flangehead@xxxxxxxxx>
wrote:

I have a partial resolution, which I will share with you all,
because that's just the sort of big-hearted guy I am.

The solution will work with the original example, but in this post my
example changes to be more compact, and to search for several different
ranges at once (which is the actual requirement I have):

values regexp_substr('312345 41234 1123456',
'(?<!\d)(1\d{6}|[2-3]\d{5}|4\d{4})(?=\D|$)',1,3)

The relevant part is the change from a negative look-ahead for a digit:
(?!\d) in my original example, to a positive look-ahead for either a *non*-digit
or end of line: (?=\D|$)

So, the above statement means: Substring out:
- A seven-digit number starting with 1
- OR a six-digit number starting with 2 or 3
- OR a five-digit number starting with 4.
- And only if it's not preceded by a number or followed by a number.
- And don't return the preceding/following character being checked for
(the motivation for look-arounds).

In this particular example I'm searching for the third instance and I've
deliberately put the three numbers out of order in the string to make a
better test. So far it's handled anything I've thrown at it.

I think I just have to accept that the rules are different for
REGEXP_INSTR than they are for REGEXP_SUBSTR.
--
Arnie



As an Amazon Associate we earn from qualifying purchases.

This thread ...

Replies:

Follow On AppleNews
Return to Archive home page | Return to MIDRANGE.COM home page

This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].

Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.