Adam Glauser wrote:
Not to put words in Aaron's mouth, but I suspect that when he's using \s
and \S he's looking for whitespace and non-whitespace respectively. Do
the C regex APIs support Extended Regular Expression syntax,
specifically the shorthand character classes? [1]
\s and \S do *not* mean "whitespace" and "non-whitespace" according to
POSIX standards. In POSIX (according to the link you provided, Adam)
the phrase '[:space:]' means whitespace, and there is no separate phrase
for non-whitespace; you have to negate the bracket expression instead,
as in [^[:space:]].
It says that \s and \S are Perl extensions.
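For what it's worth, here is a minimal sketch of how the POSIX spelling looks when passed to regcomp(). The helper name ere_matches() is made up for illustration. I have only reasoned about this against a POSIX-conforming regcomp(); whether IBM's implementation accepts [:space:] is exactly the open question in this thread, so treat it as something to try rather than something known to work on the i.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of any IBM or POSIX API).
 * Returns 1 if `text` matches the extended regex `pattern`,
 * 0 if it does not match, and -1 if regcomp() rejects the pattern. */
int ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* REG_EXTENDED selects ERE syntax; REG_NOSUB because we only
     * need a yes/no answer, not the match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

With that helper, "[[:space:]]" stands in for \s and "[^[:space:]]" stands in for \S. For example, ere_matches("^[^[:space:]]+$", "ab") should report a match and ere_matches("^[^[:space:]]+$", "a b") should not. If the character class names turn out not to be supported, an explicit bracket list such as "[ \t\n]" (the C compiler turns \t and \n into the real characters before regcomp() ever sees them) is the old-school fallback.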
The same page that you provided, Adam, also says this:
"The precise syntax for regular expressions varies among tools and with
context; more detail is given in the Syntax section."
Aaron: Please don't say things like "it works in Java and not on the
IBM i". That's a silly statement, and I'm sure you see why... first
of all, Java actually runs on the i! Secondly, so do Perl, PHP, etc.
If you downloaded and installed a PCRE library for the i, I have
no doubt that your regular expression would work. It's not a limitation
of the OPERATING SYSTEM... it's a limitation of the IBM-supplied
regcomp() API for ILE C.
Also, the original regular expression syntax on Unix systems (which is
where I learned regular expressions) ALSO doesn't give \S or \s any
special meaning, nor did it have the POSIX extension of [:space:]
-- though that was added later.
Perhaps IBM's biggest crime in this situation is its HORRIBLE
documentation. The docs say NOTHING about what is and is not supported
in their regcomp() API. The only thing I can find in the docs is this:
"The functions regcomp(), regerror(), regexec(), and regfree() use
regular expressions in a similar way to the UNIX® awk, ed, grep, and
egrep commands."
Gee, thanks. Not the same as those commands (which themselves vary
widely) but "similar to", without any further explanation, leaving us to
determine which regular expressions are and are not supported by
trial and error.
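If it comes down to trial and error, a small probe routine at least makes the trials quick. This is only a sketch: the name probe() and the output format are my own invention, and the results will depend on whichever regcomp() you link against.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical probe routine: try one pattern against one sample and
 * print what happened. Returns 1 on a match, 0 on no match, and -1
 * if regcomp() refuses to compile the pattern. */
int probe(const char *pattern, const char *sample)
{
    regex_t re;
    char msg[256];
    int rc;

    rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    if (rc != 0) {
        /* regerror() turns the error code into readable text. */
        regerror(rc, &re, msg, sizeof msg);
        printf("%-16s rejected: %s\n", pattern, msg);
        return -1;
    }

    rc = regexec(&re, sample, 0, NULL, 0);
    regfree(&re);

    printf("%-16s on \"%s\": %s\n", pattern, sample,
           rc == 0 ? "match" : "no match");
    return rc == 0 ? 1 : 0;
}
```

One thing to watch for: a pattern compiling without error does not prove it means what you hoped. An implementation that doesn't know \s may quietly treat it as a literal "s". So pick samples that tell the two readings apart, e.g. probe("\\s", " ") and probe("\\s", "s"), and compare those against probe("[[:space:]]", " ").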
But -- my guess is that they only support the original Basic and
Extended regexp syntax... not the POSIX, Perl, Java, etc. extensions.