Wait, it just occurred to me.  The documentation at

https://www.ibm.com/docs/en/i/7.2?topic=predicates-regexp-like-predicate#rbafzregexp_like__regexp_likecontrol

in "Table 4. Set Expressions (Character Classes)" lists

Example              Description
[A-M]                Range - match any character from A to M. The characters to include are determined by Unicode code point ordering.
[\u0000-\U0010ffff]  Range - match all characters.

and from https://chortle.ccsu.edu/FiniteAutomata/Section07/sect07_11.html,

"Rule 3.  Ranges of Characters

To show a range of characters, use square brackets and separate the starting character from the ending character with a hyphen. For example, [0-9] matches any digit. Several ranges can be put inside square brackets. For example, [A-CX-Z] matches 'A' or 'B' or 'C' or 'X' or 'Y' or 'Z'."

But apparently REGEXP_INSTR is treating [\x00-\x3f] as a list of 3 bytes: x'00', x'60' (the EBCDIC hyphen), and x'3f'.

From what I've read after a lot of googling, most regex implementations deal with strings of characters, not bytes, and therefore do not really support ranges of byte values.
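For comparison, here is a small Python sketch of how a regex engine that works on characters handles the same pattern. It is only an illustration, not a statement about how Db2 for i implements REGEXP_INSTR: Python's re module compares Unicode code points, the helper name first_match_position is made up for this example, and code page 37 (cp037) is an assumed stand-in for the job's EBCDIC CCSID.

```python
import re

# The pattern from the thread, as interpreted by Python's re module,
# which compares Unicode code points rather than EBCDIC byte values.
PATTERN = re.compile(r'[\x00-\x3f]')


def first_match_position(text):
    """Return the 1-based position of the first match, or 0 if there is none.

    Hypothetical helper that mimics the return convention of REGEXP_INSTR.
    """
    match = PATTERN.search(text)
    return match.start() + 1 if match else 0


# Assumption: the EBCDIC byte x'3f' is converted to a character using
# code page 37 before the regex engine ever sees it.
ebcdic_3f = b'\x3f'.decode('cp037')

# The two statements from the thread, with character strings as input.
print(first_match_position('abcdef-ghijk' + ebcdic_3f))  # 7
print(first_match_position('- - - - - - - - - - -'))     # 1

# The hyphen is byte x'60' in EBCDIC (cp037) but code point U+002D in Unicode.
print(b'\x60'.decode('cp037') == '-')  # True
print(hex(ord('-')))                   # 0x2d
```

In this engine the hyphen matches because U+002D falls inside the code point range U+0000 to U+003F, while the letters a through k (U+0061 and up) fall outside it. Whether the same thing is happening on IBM i would need to be confirmed there.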



On 8/15/2021 5:56 PM, John Yeung wrote:
> On Sun, Aug 15, 2021 at 6:33 PM Peter Dow <petercdow@xxxxxxxxx> wrote:
>> values regexp_instr('abcdef-ghijk' || x'3f', '[\x00-\x3f]') returns 7,
>> which is what I expected.
>>
>> values regexp_instr('- - - - - - - - - - -', '[\x00-\x3F]') returns 1,
>> which is NOT what I expected.
>
> Why did you expect to find the hyphen in the first example but not in
> the second?
>
> John Y.

