× The internal search function is temporarily non-functional. The current search engine is no longer viable and we are researching alternatives.
As a stop gap measure, we are using Google's custom search engine service.
If you know of an easy to use, open source, search engine ... please contact support@midrange.com.



I just wanted to post an update to let everyone know that implementing Tim's
idea of using both the like and regexp_like instead of just the regexp_like
cut the job's runtime from 7 hours to 7 minutes.

-----Original Message-----
From: MIDRANGE-L <midrange-l-bounces@xxxxxxxxxxxxxxxxxx> On Behalf Of
Timothy P Clark
Sent: Tuesday, June 4, 2019 12:43 PM
To: Midrange Systems Technical Discussion <midrange-l@xxxxxxxxxxxxxxxxxx>
Subject: Re: RE: Performance issue with SQL regexp_like?

Ack! Just realized my typo in that rewrite suggestion. It's probably obvious
(and will certainly be the first time you try it), but it should
say:
left join source on srcdta like '%'||trim(field)||'%' and
Regexp_like(srcdta,'[^A-Za-z0-9_@#$]'||trim(field)||'[^A-Za-z0-9_@#$]','i')

Hope that helps,

Tim Clark
Dept 45X - DB2 for IBM i


From: smith5646midrange@xxxxxxxxx
To: "'Midrange Systems Technical Discussion'" <midrange-
l@xxxxxxxxxxxxxxxxxx>,
Date: 06/04/2019 11:10 AM
Subject: [EXTERNAL] RE: Performance issue with SQL regexp_like?
Sent by: "MIDRANGE-L" <midrange-l-bounces@xxxxxxxxxxxxxxxxxx>

Interesting suggestion of combining the two statements (like and
regexp_like). If/when the current run of the program completes, I'll
give
that a try to see what it does.


-----Original Message-----
From: MIDRANGE-L <midrange-l-bounces@xxxxxxxxxxxxxxxxxx> On Behalf Of
Timothy P Clark
Sent: Tuesday, June 4, 2019 12:00 PM
To: Midrange Systems Technical Discussion
<midrange-l@xxxxxxxxxxxxxxxxxx>
Subject: Re: Performance issue with SQL regexp_like?

My recommendation is to always start with Visual Explain when you have
a question about the relative performance of two queries.

If you look at the two statements under VE, you'll see that the
Regexp_like
version requires a "Complicated" node. This tells you that Regexp_like
is
implemented as a (system-provided) UDF. There is always extra overhead
for
the query engine to run a UDF. On the other hand, the LIKE clause is
implemented "inline" in the query engine. That's why the first version
runs
slower than the second version.

One thing you could try is to combine the predicates:
left join source on srcdta like '%'||trim(field)||'%' and left join
source
on

Regexp_like(srcdta,'[^A-Za-z0-9_@#$]'||trim(field)||'[^A-Za-z0-9_@#$]','i')

The query engine will generally run the LIKE predicate before it runs
the
REGEXP_LIKE predicate and will (presumably) reduce the number of calls
required to the UDF. This is correct only in cases like yours where
the
rows
selected by the REGEXP_LIKE is a true subset of the rows selected by
the LIKE.

As an aside: I'm assuming here that your query really does want the
predicate on the left join specification (LEFT JOIN...ON...), as
opposed
to
in the WHERE clause, though your statement about filtering out rows in
RPG
leaves me wondering. Otherwise, depending on the complexity of the
rest
of
your query, it may perform somewhat better to move the predicate to
the WHERE clause, since the query engine then doesn't need to process
all of
the
exception rows. But this may give you different results than you want.

Hope that helps,

Tim Clark
Dept 45X - DB2 for IBM i


From: smith5646midrange@xxxxxxxxx
To: "'Midrange Systems Technical Discussion'" <midrange-
l@xxxxxxxxxxxxxxxxxx>,
Date: 06/04/2019 08:30 AM
Subject: [EXTERNAL] Performance issue with SQL regexp_like?
Sent by: "MIDRANGE-L" <midrange-l-bounces@xxxxxxxxxxxxxxxxxx>

I have a rather complex SQL statement using regexp_like that is
basically
searching for a field in a source file.



Omitting all of the other stuff above and below.



The regexp_like is (Please ignore any typos. The source is on a
machine that I can't reach with the laptop with this email account)



left join source on Regexp_like(srcdta,
'[^A-Za-z0-9_@#$]'||trim(field)||'[^A-Za-z0-9_@#$]','i')



When I try to run it via interactive SQL., it runs for about 5
minutes.



I tried changing it to this to see if the performance was any better.



left join source on srcdta like '%'||trim(field)||'%'



This version returns in seconds. That is a huge difference!!!
Unfortunately, it finds lines that contain the value but may not be
what
I
am after. For example, if I am searching for ACCT, this also finds
ACCT01.



Is there an issue with regexp_like? Should I use the second method
and
then
add a regexp in the RPG code to filter out the ACCT01 lines?



Any ideas where I go to from here? Please do not suggest packages
like Hawkeye. This is a client's machine and we can't add packages
to

their machine so it has to remain native code.




--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxxxxxxxx To subscribe,
unsubscribe, or change list options,
visit: https://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxxxxxxxx
Before posting, please take a moment to review the archives at
https://archive.midrange.com/midrange-l.

Please contact support@xxxxxxxxxxxx for any subscription related questions.

Help support midrange.com by shopping at amazon.com with our affiliate link:
https://amazon.midrange.com


As an Amazon Associate we earn from qualifying purchases.

This thread ...

Replies:

Follow On AppleNews
Return to Archive home page | Return to MIDRANGE.COM home page

This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].

Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.