Rick
I realize I didn't respond with what you were looking for, but I couldn't
resist expressing my surprise that IBM are touting SQL as a silver bullet
for a batch performance problem, especially an overnight long-running batch
process.
I'm probably going out on a limb here, but SQL is just about the last thing
I would expect to help the situation you described. Of course 20 years of
operations and tuning experience has probably blinded me to the benefits of
SQL as a performance aid ;)
All of the benefits of SQL I've seen touted so far in this thread - with
the possible exception of RI being removed from the program logic (and I'm
unconvinced on that as well) - may be nice to have, but in all
likelihood will not solve your problem, at least in my opinion.
One advantage I saw mentioned - automatic journalling - would be more
likely to hurt performance as far as I can see.
The automatic blocking is a non-issue in my mind - this is one of the first
things I would have considered in this situation, especially given the file
sizes you are talking about. Does SQL also take advantage of SETOBJACC to
store critical small files in memory?
The advantages of RI and all the updates being in one place - as with the
other advantages mentioned - seem to me to be a two-edged sword: there are
fewer levers to pull when tuning. If it isn't efficient enough, all your eggs
are in one basket and you are forced to upgrade hardware to solve the problem.
My experience has been more along the lines that breaking large processes
down into components, using server jobs, data queues, file blocking,
parallel job streams etc. etc. is far more effective.
Large overnight batch jobs tend to make me suspicious regarding either the
application or the database design anyway. I like to see multiple small
real-time server driven processes as opposed to single threaded huge
updates wherever possible.
Of course I know nothing about your situation so it may well be that I am
talking through my southern region :)
While it is true that converting to SQL might make the problem go away, in
reality this would be a side-effect of re-engineering the application
rather than a direct benefit of SQL itself. It is also possible that
re-engineering the application without converting to an SQL approach might
achieve the same result.
Your simple performance measurement of the different approaches would
appear to indicate that SQL itself will not magically outperform an RPG
program. Given that you did not use blocking to enhance the RPG program
(did I get that right?) it would appear that it may have been able to
outperform SQL if tweaked. I guess you were just comparing the basic
vanilla output of the two approaches.
Sadly, without testing and measuring both approaches (and who gets that
choice!) we are never going to know for sure what the real situation is;
we'll just be left with our gut feel and experience to guide us, at least
for the time being.
I can't help but think that whoever provided the performance recommendation
is not an "AS/400 guy". I would expect that when the expected performance
gains don't materialise you will be encouraged to move to DB2 on either
pSeries or xSeries as the next stage of "the solution".
I would even go as far as to say that the recommendations you received are
the kind of advice that is killing off the AS/400 by trying to solve a
performance problem the same way it is attacked on completely different
platforms - a stupid approach in my opinion, as it does not leverage the
AS/400's job management strengths.
I hope it all works as IBM have indicated it will. Incidentally, was it
Global Services or the iSeries technical guys that came up with this
recommendation?
regards
Evan Harris
The recommendations from IBM are as follows:
* SQL would be a better choice for long-running batch processes because it
automatically uses blocking operations for read-only cursors, and index and
table scans are parallel-enabled through the use of the Symmetric
Multiprocessing (SMP) feature of OS/400.
* Using SQL for mass file updates minimizes I/O since only rows matching
the local selection criteria are read for update. In addition, DB2 UDB for
iSeries is capable of generating an updateable join.
* The SQL blocked FETCH and INSERT support provides an additional layer of
blocking over and above the physical and logical layers. This is
application-layer blocking. SQL automatically blocks read and write
operations at the maximum blocking factor available for that release
(currently using a 128 K buffer vs. 4 - 8 K for DDS). The programmer need
not be concerned about OVRDBF calculations. In essence, one FETCH or INSERT
statement can read or write a maximum of 32,767 rows in one operation from
the program to the Open Data Path (ODP). Native programs must issue multiple
reads or writes to move data to and from the ODP.
The long-range proposal is to move to a three-tiered approach where the
database is decoupled from the application and the database is accessed
through a set of I/O modules. Various aspects of the existing database
would be duplicated using views and indexes as I understand it.
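As far as I can tell, "duplicated using views and indexes" just means
defining SQL views over the existing physical files for the I/O modules to
read, with SQL indexes providing the access paths much as keyed logical
files do today. Something along these lines, where CUSTMAST and the column
names are made up for illustration:

    -- A view exposing only the rows and columns the I/O module needs
    CREATE VIEW CUSTACTV AS
        SELECT CUSNO, CUSNAM, CRDLMT
          FROM CUSTMAST
         WHERE CUSSTS = 'A'

    -- An SQL index to support the access path, roughly the equivalent
    -- of a keyed logical file
    CREATE INDEX CUSTMAST_IX1
        ON CUSTMAST (CUSNO, CUSSTS)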
Out of curiosity I created two programs to read through a physical file
performing a math calculation and date conversion on each record read. The
first program was written using the READ op-code inside a DO loop. The
second program was written using a blocked SQL fetch inside a DO loop.
Running on an 890 under V5R2 I could see no significant difference in speed
after processing over 200 million records from a physical file.
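For anyone who wants to see the shape of the second test, the blocked fetch
was essentially a cursor opened once with a multiple-row FETCH inside the
loop. Stripped of the RPG precompiler wrappers, and with the file, field and
host-structure names simplified for illustration, it amounts to:

    -- Read-only cursor over the test physical file
    DECLARE C1 CURSOR FOR
        SELECT FLD1, FLD2, TRNDAT
          FROM TESTPF
           FOR READ ONLY

    OPEN C1

    -- Inside the DO loop: one FETCH returns a block of rows (1000 here
    -- is just an example figure) into a host structure array in a single
    -- trip to the ODP; SQLERRD(3) in the SQLCA reports how many rows
    -- actually came back.
    FETCH NEXT FROM C1 FOR 1000 ROWS INTO :ROWS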
Right now I am just trying to gather information and evaluate some of the
suggestions. I would really appreciate hearing of anyone else's
experiences.
Rick