Rick
I realize I didn't respond with what you were looking for, but I couldn't
resist expressing my surprise that IBM are touting SQL as a silver bullet
for a batch performance problem, especially an overnight long-running batch
process.
I'm probably going out on a limb here, but SQL is just about the last thing
I would expect to help the situation you described. Of course 20 years of
operations and tuning experience has probably blinded me to the benefits of
SQL as a performance aid ;)
All of the benefits of SQL I've seen touted so far in this thread - with
the possible exception of RI being removed from the program logic (and I'm
unconvinced on that as well) - may be nice to have, but in all
likelihood will not solve your problem, at least in my opinion.
One advantage I saw mentioned - automatic journalling - would be more
likely to hurt performance as far as I can see.
The automatic blocking is a non-issue in my mind - this is one of the first
things I would have considered in this situation, especially given the file
sizes you are talking about. Does SQL also take advantage of SETOBJACC to
store critical small files in memory?
The advantages of RI and all the updates being in one place - as with the
other advantages mentioned - seem to me to be a two-edged sword: there are
fewer levers to pull when tuning. If it isn't efficient enough, all your eggs
are in one basket and you are forced to upgrade hardware to solve the problem.
My experience has been more along the lines that breaking large processes
down into components, using server jobs, data queues, file blocking,
parallel job streams etc. etc. is far more effective.
Large overnight batch jobs tend to make me suspicious regarding either the
application or the database design anyway. I like to see multiple small
real-time server driven processes as opposed to single threaded huge
updates wherever possible.
Of course I know nothing about your situation so it may well be that I am
talking through my southern region :)
While it is true that converting to SQL might make the problem go away, in
reality this would be a side-effect of re-engineering the application
rather than a direct benefit of SQL itself. It is also possible that
re-engineering the application without converting to an SQL approach might
achieve the same result.
Your simple performance measurement of the different approaches would
appear to indicate that SQL itself will not magically outperform an RPG
program. Given that you did not use blocking to enhance the RPG program
(did I get that right?) it would appear that it may have been able to
outperform SQL if tweaked. I guess you were just comparing the basic
vanilla output of the two approaches.
Sadly, without testing and measuring both approaches (and who gets that
choice!) we are never going to know for sure what the real situation is;
we'll just be left with our gut feel and experience to guide us, at least
for the time being.
I can't help but think that whoever provided the performance recommendation
is not an "AS/400 guy". I would expect that when the expected performance
gains don't materialise you will be encouraged to move to DB2 on either
pSeries or xSeries as the next stage of "the solution".
I would even go as far as to say that the recommendations you received are
the kind of advice that is killing off the AS/400 by trying to solve a
performance problem the same way it is attacked on completely different
platforms - a stupid approach in my opinion, as it does not leverage the
AS/400's job management strengths.
I hope it all works as IBM have indicated it will. Incidentally, was it
Global Services or the iSeries technical guys that came up with this
recommendation?
regards
Evan Harris
The recommendations from IBM are as follows:
* SQL would be a better choice for long-running batch processes because it
automatically uses blocking operations for read-only cursors, and index and
table scans are parallel-enabled through the use of the Symmetric
Multiprocessing (SMP) feature of OS/400.
* Using SQL for mass file updates minimizes I/O since only rows matching
the local selection criteria are read for update. In addition, DB2 UDB for
iSeries is capable of generating an updateable join.
* The SQL blocked FETCH and INSERT support provides an additional layer of
blocking over and above the physical and logical layers. This is
application-layer blocking. SQL automatically blocks read and write
operations at the maximum blocking factor available for that release
(currently using a 128 K buffer vs. 4 - 8 K for DDS). The programmer need
not be concerned about OVRDBF calculations. In essence, one FETCH or INSERT
statement can read or write a maximum of 32,767 rows in one operation from
the program to the Open Data Path (ODP). Native programs must issue multiple
reads or writes to move data to and from the ODP.
The long-range proposal is to move to a three-tiered approach where the
database is decoupled from the application and the database is accessed
through a set of I/O modules. Various aspects of the existing database
would be duplicated using views and indexes as I understand it.
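As far as I can tell, "duplicated using views and indexes" just means
defining SQL views over the existing physical files for the I/O modules to
read, with SQL indexes providing the access paths much as keyed logical
files do today. Something along these lines, where CUSTMAST and the column
names are made up for illustration:

    -- A view exposing only the rows and columns the I/O module needs
    CREATE VIEW CUSTACTV AS
        SELECT CUSNO, CUSNAM, CRDLMT
          FROM CUSTMAST
         WHERE CUSSTS = 'A'

    -- An SQL index to support the access path, roughly the equivalent
    -- of a keyed logical file
    CREATE INDEX CUSTMAST_IX1
        ON CUSTMAST (CUSNO, CUSSTS)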
Out of curiosity I created two programs to read through a physical file
performing a math calculation and date conversion on each record read. The
first program was written using the READ op-code inside a DO loop. The
second program was written using a blocked SQL fetch inside a DO loop.
Running on an 890 under V5R2 I could see no significant difference in speed
after processing over 200 million records from a physical file.
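For anyone who wants to see the shape of the second test, the blocked fetch
was essentially a cursor opened once with a multiple-row FETCH inside the
loop. Stripped of the RPG precompiler wrappers, and with the file, field and
host-structure names simplified for illustration, it amounts to:

    -- Read-only cursor over the test physical file
    DECLARE C1 CURSOR FOR
        SELECT FLD1, FLD2, TRNDAT
          FROM TESTPF
           FOR READ ONLY

    OPEN C1

    -- Inside the DO loop: one FETCH returns a block of rows (1000 here
    -- is just an example figure) into a host structure array in a single
    -- trip to the ODP; SQLERRD(3) in the SQLCA reports how many rows
    -- actually came back.
    FETCH NEXT FROM C1 FOR 1000 ROWS INTO :ROWS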
Right now I am just trying to gather information and evaluate some of the
suggestions. I would really appreciate hearing of anyone else's
experiences.
Rick