I'll try to help with this - I worked as a contract employee in the
database performance lab at IBM in 2001, when SQE was first being
developed, so I think I have a little credibility in these matters.
Besides, Elvis said I was right, and he knows these things. LOL!
Anyway, your first point - yes, I suppose the goal, in general, is to
run SQL statements through the new engine, not the classic
engine. However, not everything has yet been brought into the new
engine. The first phase of functionality in the new engine was to
make complex queries from the TCP-H test suite run faster - get them
into the range of speed of doing the same benchmark in AIX. It was
found that the classic engine made some poor choices in the
optimization, so that orders of magnitude of IO were being done on
the iSeries compared to the pSeries.
One of the reasons that the access plans were not good was, the
opitimizer did not have enough information to make an informed
decision. So statistics were added, in addition to the file
properties contained in indexes (LFs) and the PFs themselves. Things
like histograms and frequent values were collected on tables. These
all give the optimizer better information.
In addition, the design of the new engine was made rather
object-oriented - now various steps in the access plan are done in
nodes, and these can be plugged in much more easily. And new nodes
can easily be inserted into the overall structure.
Look at this page
http://www-03.ibm.com/servers/eserver/iseries/db2/sqe.html where you
can find a lot of formation, including a link to an info APAR where
recent additions to the engine are supposed to be found - don't know
how current this is. But it should help give you some understanding -
I think you have already seen some of this stuff. There is also a PDF
presentation there that is excellent. One of the foils there says
that optimization time is increased by about 15% when it goes back up
to CQE - now is this significant? Maybe not - if the statement takes
5 hours to run and optimzation takes 1 minute instead of 50 seconds,
who cares? Optimization is an issue in its own arena only when total
processing time is short. What really matters is, what did the
optimizer choose to do to get the data? That is where there a
potential (IBM's word) for performance improvement in the SQE.
As to indexes, it really is not germane to where the statement is
processed - indexes have been used ever since SQL was put on the
AS/400 - go back to V3R2 database and query optimization manuals, and
you will see the same basic stuff as you see now - the optimizer
always needed appropriate indexes, in order to get the best access
plan. Sometimes indexes only provided distribution stats about the
data in a file, other times it is the sole IO, since it contains all
the fields requested in the field list, other times it is used for
bitmap processing. See the manuals for this information.
But SQE's use of indexes is not the thing - CQE also uses indexes -
all SQL optimizers for all DB products use indexes (SQL Server,
Oracle, MySQL, etc.).
The articles on good indexing strategy were written (or parts of them
were) before SQE came on the scene. Just think - Centerfield
Technology's SQL performance analysis products were around several
years before 2001 - Elvis can give you details - and the data they
used was captured by the CQE of the time - the only engine then.
Efficiencies vary, depending on the kind of statement you are running
- the system will make the best decision it can, I think, and, as you
say, more and more things will end up run in the new engine. This is
a good thing but not something to get overly bound up in, IMO.
So get those indexes in place - both engines will use them. Create
indexes with sort sequences to match the SRTSEQ option you want to
use in your code. These indexes will be a little bigger than simple
ones but will help the optimizer. But remember, too, every new index
means more maintenance when a record is inserted/updated/deleted -
every index over a table has to be updated. There are tradeoffs here.
As to user control of where a statement will be optimized, there is
less of it - used to be a trick to use parallel processing settings
to control some things - that may no longer be the case. There are a
couple things in the QAQQINI file to look at. I guess just keep an
eye on the memo to users for each new release, check out that info
APAR, and then make sure you have good indexing strategy -
Here is a chapter from a paper on star schema optimization in the new
engine - lots of good links here
http://www-03.ibm.com/servers/enable/site/education/wp/star/page_14.html
HTH
Vern
At 09:24 PM 5/6/2007, you wrote:
I'm still curious about the relationship between SQE & CQE
(http://archive.midrange.com/midrange-l/200705/msg00066.html). Can
someone please comment on the validity of the following perceptions (of
mine)?
* In all that I've read, I thought that the goal of SQL was to
have it run by the SQE and not the CQE.
* The reason for this is that the SQE uses indexes to make the
resolution of the SQL more efficient.
* However, at the moment there are a number of conditions
(reducing with every OS upgrade) under which the SQL will be routed to
the CQE.
The discussion in the earlier topic implied that there was no point in
getting hung up about CQE & SQE, implying that it didn't really matter.
That really confused me because of the amount of literature being
generated on SQL for the iSeries.
I guess I've been under the impression that larger efficiencies can be
made by using a good indexing strategy. But if the indexes can't be used
because of the despatch conditioning, then there's no point.
In other words, unless we can overcome the conditions that switch the
SQL to the CQE, there's no point in worrying about indexing.
Unless of course the CQE has some capacity to be managed by the
developer. Or is the CQE aware of logical file indexes?
Is this a reasonable 'newbie' summary?
Regards,
"This e-mail and any attachments to it (the "Communication") is,
unless otherwise stated, confidential, may contain copyright
material and is for the use only of the intended recipient. If you
receive the Communication in error, please notify the sender
immediately by return e-mail, delete the Communication and the
return e-mail, and do not read, copy, retransmit or otherwise deal
with it. Any views expressed in the Communication are those of the
individual sender only, unless expressly stated to be those of
Australia and New Zealand Banking Group Limited ABN 11 005 357 522,
or any of its related entities including ANZ National Bank Limited
(together "ANZ"). ANZ does not accept liability in connection with
the integrity of or errors in the Communication, computer virus,
data corruption, interference or delay arising from or in respect of
the Communication."
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.
As an Amazon Associate we earn from qualifying purchases.