From: Alan Campin

> Actually the opposite is true.
>
> 1. The new SQL I/O system is 5 to 7 times faster than DDS file I/O.
My benchmarks showed the opposite: SQL INSERT was 10 to 100 times slower than a native WRITE. You will see some performance improvement using prepared statements, but the native WRITE is still faster. Please provide an example program and your benchmarks.
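A test along these lines is easy to put together. Here's a minimal sketch in modern free-form RPG; TESTPF, its record format TESTR, and the field FLD1 are all made up for the example:

**free
// Benchmark sketch: native WRITE vs. embedded SQL INSERT.
// TESTPF, format TESTR, and field FLD1 are hypothetical; error
// handling and commitment control are omitted for brevity.
dcl-f TESTPF usage(*output);

dcl-s start timestamp;
dcl-s nativeMs int(20);
dcl-s sqlMs int(20);
dcl-s i int(10);

// Native record-level WRITE of 100,000 rows
start = %timestamp();
for i = 1 to 100000;
  FLD1 = i;
  write TESTR;
endfor;
nativeMs = %diff(%timestamp() : start : *mseconds) / 1000;

// Embedded SQL INSERT of the same 100,000 rows
start = %timestamp();
for i = 1 to 100000;
  exec sql insert into TESTPF (FLD1) values(:i);
endfor;
sqlMs = %diff(%timestamp() : start : *mseconds) / 1000;

dsply ('WRITE: ' + %char(nativeMs) + 'ms, INSERT: ' + %char(sqlMs) + 'ms');
return;

Blocking and journaling settings will move the numbers, so run it on your own box before taking either side's word for it.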
> 2. New SQL index system is 2 to 3 times faster than DDS indexes.
"New index system"? Other than the concept of an EVI, I know of no changes to the basic index concept. Again, please provide examples.
> 3. SQL has a new query optimizer that is a lot more efficient.
Yup. SQL is great for queries.
> 4. Reading in only the data you need is always going to be faster than reading the entire record. There are a lot of reasons for this, primarily having to do with buffers. IBM optimizes the moves, but byte moves are just not going to be significant. Record I/O is the most expensive thing in the system.
This is also simply untrue in my experience. The record-level access is the same: the data must still be read from disk, and in all but the most extreme examples you will read in all of the data from a record the vast majority of the time. Reading directly into a data structure, as you can in recent versions of RPG, makes this even faster. SQL adds overhead, which is only negated when you reduce the number of trips below the MI layer, as you do in set-based queries. As always, please provide examples. Create a file and CHAIN to a specific record, then use an SQL statement (either a SELECT INTO or an OPEN CURSOR, FETCH, CLOSE) to retrieve, say, half the fields. Put both in a loop and time them. Please let us know when SQL wins.
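If anyone wants to run that test, here's roughly what I mean: a sketch in free-form RPG, with ITEMMAST, its format ITEMR, and all the field names invented for the example.

**free
// Single-record retrieval: CHAIN vs. embedded SQL SELECT INTO.
// ITEMMAST, format ITEMR, and the IM* fields are hypothetical;
// a real test would vary the key and check %found/SQLSTATE.
dcl-f ITEMMAST usage(*input) keyed;

dcl-s itemNo packed(7 : 0) inz(1);
dcl-s desc char(30);
dcl-s price packed(9 : 2);
dcl-s start timestamp;
dcl-s i int(10);

// Native: CHAIN brings back the whole record in one shot
start = %timestamp();
for i = 1 to 100000;
  chain (itemNo) ITEMR;
endfor;
dsply ('CHAIN: ' + %char(%diff(%timestamp() : start : *mseconds) / 1000) + 'ms');

// SQL: retrieve only two of the columns
start = %timestamp();
for i = 1 to 100000;
  exec sql select IMDESC, IMPRICE into :desc, :price
             from ITEMMAST
            where IMITEM = :itemNo;
endfor;
dsply ('SELECT: ' + %char(%diff(%timestamp() : start : *mseconds) / 1000) + 'ms');
return;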
> 5. The other thing with SQL is that you can read multiple records into an array mapped to a user space. You cannot do that with record I/O. If you are reading multiple tables with selects and reading multiple records into a user space, the performance is going to be way faster. All the work is going to happen at the machine level. I gave an example in a previous post.
Yes, if you're reading a lot of records from a file sequentially, you will get performance improvements with SQL. However, I estimate that less than 10% of my programming involves reading data sequentially. I may process by partial key: for example, all the order detail records for a single order header. But to do that in SQL I would need a header/detail join, and reading that into a table would duplicate the header information for every detail record, making LOTS of extra moves. At some point SQL would still outperform native I/O, but my guess is that the crossover number is rather large (and depends a lot on the number of levels of data). If I didn't use a JOIN and instead used a separate SQL statement to read the detail, there would be fewer wasted moves, although opening and closing the cursor on the detail file for every header record would quickly negate that benefit.
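To be fair to Alan's point 5, the blocked fetch really is the right tool for the sequential case. Here's a sketch, with ORDDTL and its columns invented:

**free
// Blocked FETCH: up to 1000 detail rows in one trip below the MI layer.
// ORDDTL and the DT* columns are hypothetical.
dcl-ds detail qualified dim(1000);
  ordNo packed(7 : 0);
  line  packed(3 : 0);
  item  char(15);
  qty   packed(7 : 0);
end-ds;

dcl-s ordNo packed(7 : 0) inz(12345);
dcl-s rows  int(10);

exec sql declare c1 cursor for
  select DTORD, DTLINE, DTITEM, DTQTY
    from ORDDTL
   where DTORD = :ordNo
   order by DTLINE;

exec sql open c1;
exec sql fetch next from c1 for 1000 rows into :detail;
rows = SQLERRD(3);   // rows actually fetched, from the SQLCA
exec sql close c1;

One open, one fetch, and one close for the whole set; that's where SQL earns its keep. A single-record CHAIN replacement is where it doesn't.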
> In short, what used to be true is just not true anymore, and it is only going to get better. SQL is the present and the future.
No, what's true about computers is STILL true. When reading a single record, nothing is faster than a simple indexed access. SQL adds overhead that can only be justified when it is amortized by economies of scale. Optimize your reads, read only the records you need, and don't duplicate data. One of my biggest problems with SQL is the fact that header/detail JOINs duplicate the header data, a fact which is ignored by most SQL advocates.
> The only reason I see to continue a lot of record I/O is severely unnormalized databases.
ISAM access will be with us for a long time, because it's the best way to do certain things. SQL has its place, but not as a replacement for standard database-driven business logic. Finally, there are just lots of things SQL can't do, or at least not easily.

For example, suppose I have an order header with three different types of children (order detail, shipments, and invoices), and some of those in turn have their own detail. There is no clean way to get all that information in a JOIN; you need a separate JOIN for each sibling relationship, and the numbers grow exponentially as you add relationships. Whereas with native I/O, I can simply move back and forth between relationships, position the files as I need them, and retrieve records from sub-relationships, all at my discretion.

As a second example, SQL just falls apart when data is stored in different places for different situations. Take a simple example: pricing. When there is a list price in the item master, a possible override at the customer level, and a third check in a date-sensitive deals-and-promotions file, the logic required to get that data is easily coded in an RPG program using native I/O, while it's nearly impossible to do in SQL. You either use SQL to do single-record fetches, at which point your program simply falls over under the massive overhead, or you try to program the gigantic CASE expression required (much less debug it!), and even then it may perform poorly.

Joe
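P.S. Since I brought up the pricing example, here's roughly what that lookup looks like with native I/O. It's just a sketch; every file, format, and field name is made up.

**free
// Pricing lookup: list price, then customer override, then
// date-sensitive deal. ITEMMAST/ITEMR, CUSTPRC/CUSTPRCR, and
// DEALS/DEALSR are hypothetical keyed files and their formats.
dcl-f ITEMMAST usage(*input) keyed;   // key: item
dcl-f CUSTPRC  usage(*input) keyed;   // key: customer, item
dcl-f DEALS    usage(*input) keyed;   // key: item

dcl-s itemNo char(15);
dcl-s custNo packed(7 : 0);
dcl-s today  date inz(*sys);
dcl-s price  packed(9 : 2);

// 1. Start with the list price from the item master
chain (itemNo) ITEMR;
if %found(ITEMMAST);
  price = IMPRICE;
endif;

// 2. Customer-level override, if one exists
chain (custNo : itemNo) CUSTPRCR;
if %found(CUSTPRC);
  price = CPPRICE;
endif;

// 3. A date-sensitive deal beats both
setll (itemNo) DEALSR;
reade (itemNo) DEALSR;
dow not %eof(DEALS);
  if today >= DLFROM and today <= DLTO;
    price = DLPRICE;
    leave;
  endif;
  reade (itemNo) DEALSR;
enddo;

Three keyed reads in the common case, each one trivially debuggable. Try expressing that precedence in a single SQL statement and see what the CASE looks like.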