Re: More SQE CQE Theory. -- MIDRANGE-L

I'll try to help with this - I worked as a contract employee in the database performance lab at IBM in 2001, when SQE was first being developed, so I think I have a little credibility in these matters. Besides, Elvis said I was right, and he knows these things. LOL!

Anyway, your first point - yes, I suppose the goal, in general, is to run SQL statements through the new engine, not the classic engine. However, not everything has yet been brought into the new engine. The first phase of functionality in the new engine was to make complex queries from the TCP-H test suite run faster - get them into the range of speed of doing the same benchmark in AIX. It was found that the classic engine made some poor choices in the optimization, so that orders of magnitude of IO were being done on the iSeries compared to the pSeries.

One of the reasons that the access plans were not good was, the opitimizer did not have enough information to make an informed decision. So statistics were added, in addition to the file properties contained in indexes (LFs) and the PFs themselves. Things like histograms and frequent values were collected on tables. These all give the optimizer better information.

In addition, the design of the new engine was made rather object-oriented - now various steps in the access plan are done in nodes, and these can be plugged in much more easily. And new nodes can easily be inserted into the overall structure.

Look at this page http://www-03.ibm.com/servers/eserver/iseries/db2/sqe.html where you can find a lot of formation, including a link to an info APAR where recent additions to the engine are supposed to be found - don't know how current this is. But it should help give you some understanding - I think you have already seen some of this stuff. There is also a PDF presentation there that is excellent. One of the foils there says that optimization time is increased by about 15% when it goes back up to CQE - now is this significant? Maybe not - if the statement takes 5 hours to run and optimzation takes 1 minute instead of 50 seconds, who cares? Optimization is an issue in its own arena only when total processing time is short. What really matters is, what did the optimizer choose to do to get the data? That is where there a potential (IBM's word) for performance improvement in the SQE.

As to indexes, it really is not germane to where the statement is processed - indexes have been used ever since SQL was put on the AS/400 - go back to V3R2 database and query optimization manuals, and you will see the same basic stuff as you see now - the optimizer always needed appropriate indexes, in order to get the best access plan. Sometimes indexes only provided distribution stats about the data in a file, other times it is the sole IO, since it contains all the fields requested in the field list, other times it is used for bitmap processing. See the manuals for this information.

But SQE's use of indexes is not the thing - CQE also uses indexes - all SQL optimizers for all DB products use indexes (SQL Server, Oracle, MySQL, etc.).

The articles on good indexing strategy were written (or parts of them were) before SQE came on the scene. Just think - Centerfield Technology's SQL performance analysis products were around several years before 2001 - Elvis can give you details - and the data they used was captured by the CQE of the time - the only engine then.

Efficiencies vary, depending on the kind of statement you are running - the system will make the best decision it can, I think, and, as you say, more and more things will end up run in the new engine. This is a good thing but not something to get overly bound up in, IMO.

So get those indexes in place - both engines will use them. Create indexes with sort sequences to match the SRTSEQ option you want to use in your code. These indexes will be a little bigger than simple ones but will help the optimizer. But remember, too, every new index means more maintenance when a record is inserted/updated/deleted - every index over a table has to be updated. There are tradeoffs here.

As to user control of where a statement will be optimized, there is less of it - used to be a trick to use parallel processing settings to control some things - that may no longer be the case. There are a couple things in the QAQQINI file to look at. I guess just keep an eye on the memo to users for each new release, check out that info APAR, and then make sure you have good indexing strategy -

Here is a chapter from a paper on star schema optimization in the new engine - lots of good links here

http://www-03.ibm.com/servers/enable/site/education/wp/star/page_14.html

HTH
Vern

At 09:24 PM 5/6/2007, you wrote:

I'm still curious about the relationship between SQE & CQE
(http://archive.midrange.com/midrange-l/200705/msg00066.html). Can
someone please comment on the validity of the following perceptions (of
mine)?

* In all that I've read, I thought that the goal of SQL was to
have it run by the SQE and not the CQE.
* The reason for this is that the SQE uses indexes to make the
resolution of the SQL more efficient.
* However, at the moment there are a number of conditions
(reducing with every OS upgrade) under which the SQL will be routed to
the CQE.

The discussion in the earlier topic implied that there was no point in
getting hung up about CQE & SQE, implying that it didn't really matter.
That really confused me because of the amount of literature being
generated on SQL for the iSeries.

I guess I've been under the impression that larger efficiencies can be
made by using a good indexing strategy. But if the indexes can't be used
because of the despatch conditioning, then there's no point.
In other words, unless we can overcome the conditions that switch the
SQL to the CQE, there's no point in worrying about indexing.

Unless of course the CQE has some capacity to be managed by the
developer. Or is the CQE aware of logical file indexes?

Is this a reasonable 'newbie' summary?

Regards,

"This e-mail and any attachments to it (the "Communication") is, unless otherwise stated, confidential, may contain copyright material and is for the use only of the intended recipient. If you receive the Communication in error, please notify the sender immediately by return e-mail, delete the Communication and the return e-mail, and do not read, copy, retransmit or otherwise deal with it. Any views expressed in the Communication are those of the individual sender only, unless expressly stated to be those of Australia and New Zealand Banking Group Limited ABN 11 005 357 522, or any of its related entities including ANZ National Bank Limited (together "ANZ"). ANZ does not accept liability in connection with the integrity of or errors in the Communication, computer virus, data corruption, interference or delay arising from or in respect of the Communication."
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.