REGEXP_COUNT uses International Components for Unicode (ICU). If you call
the ICU C API directly rather than going through the database, I would
expect that to be much faster.

On Sun, Nov 3, 2019 at 11:31 AM Tools/400 <thomas.raddatz@xxxxxxxxxxx>
wrote:

FYI: I cannot get regcomp() and regexec() to work with character
classes such as "\s". I tried various things without success. Using
REGEXP_COUNT works like a charm but is incredibly slow (200 times slower
than regexec()).

Therefore I posted the problem at the rpg400-l mailing list hoping to
get help there: "Regular expression (regcomp()) ccsid issue".

Thomas.

On 2 Nov 2019 at 11:26, Tools/400 wrote:
Craig,

Interesting stuff. Thank you for letting us know.

Because of the "\s" issue, I assume that it is a CCSID problem. That is
what needs to be debugged. I hope that I can do that today or tomorrow.

Regards,

Thomas.

On 1 Nov 2019 at 17:48, Craig Richards wrote:
A slightly more efficient version might be

dcl-f(?>\s+)filea

or in your case
dcl-f(?> +)filea
or
dcl-f(?>[ ]+)filea

(I'm very surprised the \s suggested by David did not work - that's
pretty standard stuff).

Essentially this wraps the one-or-more-whitespace match \s+ in (?>...),
which is called atomic grouping.

The \s+ is greedy, which is to say it will grab as many whitespace
characters as it can and then look at the next part of the expression to
carry on matching (in your case the filea). If that fails to match, it
will backtrack: if it grabbed 3 spaces, it will drop one and then try to
match again, and so on until it can't backtrack any more.
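
To make the greedy-then-backtrack behaviour concrete, here is a minimal
Python sketch that imitates by hand what a backtracking engine does for
dcl-f\s+filea. It is only an illustration, not how ICU or regcomp() is
actually implemented; the function name greedy_then_backtrack and the
sample lines are made up for this example.

```python
def greedy_then_backtrack(text, prefix, tail):
    """Return the whitespace run lengths tried, longest first.

    Hypothetical helper: mimics prefix + r"\s+" + tail anchored at the
    start of text. It stops as soon as the tail matches.
    """
    if not text.startswith(prefix):
        return []
    start = len(prefix)

    # Greedy step: consume as many whitespace characters as possible.
    end = start
    while end < len(text) and text[end].isspace():
        end += 1

    tried = []
    # Backtracking step: give back one character at a time, keeping at
    # least one because + means one or more, until the tail matches.
    for stop in range(end, start, -1):
        tried.append(stop - start)
        if text.startswith(tail, stop):
            break
    return tried


# The tail matches straight after the greedy grab: one attempt.
print(greedy_then_backtrack("dcl-f   filea", "dcl-f", "filea"))  # [3]
# The tail never matches: 3, then 2, then 1 spaces are all tried in vain.
print(greedy_then_backtrack("dcl-f   fileb", "dcl-f", "filea"))  # [3, 2, 1]
```

The second call shows the wasted work: every shorter run of spaces is
retried even though none of them can make the tail match.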

The atomic grouping stops that backtracking process - essentially, once
it gets past the closing parenthesis, it throws away all saved states so
it doesn't go back and try with, say, 2 spaces and then 1 space.
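
A quick way to see the difference is to run both forms side by side.
The sketch below uses Python's re module purely for illustration; it
says nothing about what regcomp() or REGEXP_COUNT support on IBM i.
Python only accepts (?>...) from version 3.11 on, so for older versions
the sketch falls back to the common lookahead-plus-backreference
emulation. The sample line and the second, contrived pattern are my own
assumptions.

```python
import re
import sys

# Atomic groups were added to Python's re module in 3.11. On older
# versions, emulate one with a lookahead plus a backreference: the
# lookahead is not re-entered once it has succeeded.
if sys.version_info >= (3, 11):
    ATOMIC_WS = r"(?>\s+)"
else:
    ATOMIC_WS = r"(?=(\s+))\1"

line = "dcl-f   filea"  # made-up sample line

# The pattern from the thread: both forms find the same match.
plain = re.search(r"dcl-f\s+filea", line)
atomic = re.search(r"dcl-f" + ATOMIC_WS + r"filea", line)
print(plain.group(0) == atomic.group(0))  # True

# A contrived pattern that needs one space handed back after the
# whitespace run. Greedy \s+ backtracks from 3 spaces to 2 and matches;
# the atomic group refuses to give anything back, so the match fails.
greedy_hit = re.search(r"dcl-f\s+ filea", line)
atomic_hit = re.search(r"dcl-f" + ATOMIC_WS + r" filea", line)
print(greedy_hit is not None)  # True
print(atomic_hit is None)      # True
```

So the atomic form is a safe swap only when, as in dcl-f\s+filea, the
part after the group can never start with a character the group itself
would consume.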

Maybe not an issue for you, and maybe not supported if not even \s is
supported, but it's a good performance thing to be aware of for the
situations where it's obvious that once you've done a greedy match and
the next bit has failed, there is no point in dropping the last
character of the greedy match and retrying the expression again.

regards,
Craig



--
This is the Rational Developer for IBM i / Websphere Development Studio
Client for System i & iSeries (WDSCI-L) mailing list
To post a message email: WDSCI-L@xxxxxxxxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: https://lists.midrange.com/mailman/listinfo/wdsci-l
or email: WDSCI-L-request@xxxxxxxxxxxxxxxxxx
Before posting, please take a moment to review the archives
at https://archive.midrange.com/wdsci-l.

This mailing list archive is Copyright 1997-2019 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page.