Re: Array efficiency (was:Dynamic Arrays) -- RPG400-L

Are you through?  Thank you sir, may I have another?

----- Original Message -----
From: "Scott Klement" <klemscot@klements.com>
To: <rpg400-l@midrange.com>
Sent: Wednesday, October 16, 2002 4:23 PM
Subject: Re: Array efficiency (was:Dynamic Arrays)




On Wed, 16 Oct 2002, Steve Landess wrote:
>
> Sure, files have overhead.  But so do arrays when using indexed elements
> in an RPG program.

Everything has overhead.   But arrays usually have less, depending on the
circumstance, of course.

For example, if you need to retrieve the 100th record of a file, compared
with retrieving the 100th element of an array...

The array is clearly more efficient.   All the array has to do is add a
number to a pointer, and then access the memory in question.   For the
file, it has to calculate the position, which should take about the same
amount of CPU as the array did, but then it has to translate that to a
sector or multiple sectors on disk, position the read head and read the
data into memory.   Once it's done that, you still have to access the
memory, just like you did for the array.

Okay, so you're going to argue "But, the record may already be paged into
memory!".   Okay, great.  That means you don't have to wait for the slow
speed of the hard drive.   But you still have to copy the data from the
paged-in memory to your program's local storage.

Even in a best-case scenario, the file will be slower than an array, when
both are accessed by "record number".

The speed of accessing a record by key is a different matter.  In this
case, you'd have to search the array, element by element, until you found
the correct entry.   This is slower than keyed file access because the
ALGORITHM of index searching is faster than sequential searching.

If you used a binary search (such as that provided by the new %lookup BIF)
the array access still might end up being faster.   If you actually built
an index over the memory in the array, the array access would CERTAINLY be
faster.
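
To put rough numbers on the sequential vs. binary difference, here's a small
sketch.  It's in Python rather than RPG, purely as an illustration, and it
does not claim to show how %lookup or a keyed access path is implemented.
The function names and the sample key list are made up for the example.

```python
def sequential_search(keys, target):
    """Scan element by element. Returns (position, comparisons)."""
    for comparisons, key in enumerate(keys, start=1):
        if key == target:
            return comparisons - 1, comparisons
    return None, len(keys)


def binary_search(sorted_keys, target):
    """Halve the search range on each probe. Returns (position, probes)."""
    lo, hi = 0, len(sorted_keys) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, probes


# Hypothetical data: 1000 sorted keys (0, 2, 4, ... 1998)
keys = list(range(0, 2000, 2))
```

With those 1000 keys, finding the last one takes 1000 comparisons with the
sequential scan and 10 probes with the binary search.  The binary search
only works if the array is kept in key order.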

But, there the problem would be the time it takes to build the index.  And
that's the big advantage of the disk... it's non-volatile.  You can build
the index ahead of time, and as long as you keep it up-to-date, it's still
there later when you need to access the data.  You don't have to wait for
it to be built each time.
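
As a rough sketch of that trade-off (Python, not RPG, and only an
illustration): building an in-memory index costs one full pass over the
array every time the program starts, after which each keyed lookup is
cheap.  Note that this uses a hash table (a Python dict), which is not the
same structure as a keyed access path on disk; the record layout and names
here are made up.

```python
def build_index(records, key_field=0):
    """One full pass over the array: maps key -> position of first match."""
    index = {}
    for pos, rec in enumerate(records):
        index.setdefault(rec[key_field], pos)
    return index


# Hypothetical in-memory "array" of (key, description) records
records = [("C300", "Widget"), ("A100", "Gadget"), ("B200", "Gizmo")]

# This pass has to be repeated every time the program is loaded,
# because the index lives in volatile memory.
index = build_index(records)
```

After that, `records[index["B200"]]` finds the record without scanning, but
the build step is the cost you pay on every run.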

>
> And, in the final analysis, the file or user index approach DOES give
> the ability to dynamically allocate the space for the list of data that
> you would otherwise have in an array.
>

True.  But we can't do a final analysis, since nobody has bothered to
define what we're using this array and/or file for.  We don't know if the
poster was planning to do keyed lookups on the data, or if it's more
important to keep memory usage low vs. keeping speed up.

> In terms of array performance, I'm talking about resolving references in
> the code to indexed array elements...when you use a reference to ARRAY,X
> in the program, this has to be resolved (every time) to the address of
> the array element.

First of all, it's ARRAY(X) not ARRAY,X.  Welcome to RPG IV!
The computer has the address of X.  It has to copy the contents of that
memory into a register.  It then has to multiply that by the size of
each array element, and add the result to the address of ARRAY.  Yes,
that takes more time than accessing a normal variable.
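
The arithmetic involved looks roughly like this.  It's a Python sketch over
a flat byte buffer, not what the RPG compiler actually generates.  The
10-byte element size and the sample data are assumptions for the example,
and since RPG arrays are 1-based, the sketch subtracts 1 from the index
before multiplying.

```python
ELEM_SIZE = 10  # assumption: each element is a 10-byte character field


def element_offset(index, elem_size=ELEM_SIZE):
    """Byte offset of a 1-based element from the start of the array."""
    return (index - 1) * elem_size


def get_element(storage, index, elem_size=ELEM_SIZE):
    """Fetch element `index` (1-based) from a flat byte buffer."""
    start = element_offset(index, elem_size)
    return storage[start:start + elem_size]


# Hypothetical array storage: 100 elements laid out back to back
storage = b"".join(
    f"ITEM{n:03d}".ljust(ELEM_SIZE).encode("ascii") for n in range(1, 101)
)
```

Here `element_offset(100)` is 990: one subtract, one multiply, and then the
add to the base address, and you're looking at the data.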

But, in comparison to accessing a file from disk, it's an insignificant
amount of time.  Doing disk access, you still have X, and you still have
to load its contents into a register, and multiply that register by
the record size to get the byte position of the start of the record.  You
then have to convert that byte position into a sector on disk, and an
offset from the start of that sector.  You still have to seek the disk to
that sector, read it, and copy the appropriate part to memory.  Then move
on to the next sector, etc until the entire record has been read.  And
that's without the overhead of indexes!
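
Here's that extra bookkeeping as a simplified sketch (Python, illustration
only).  The 512-byte sector and 200-byte record are made-up numbers, and
the real system's storage management does far more than this; the point is
just that the file starts with the same multiply as the array and then
keeps going.

```python
SECTOR_SIZE = 512  # assumption: bytes per disk sector
RECORD_SIZE = 200  # assumption: fixed record length in bytes


def locate_record(rrn, record_size=RECORD_SIZE, sector_size=SECTOR_SIZE):
    """Map a 1-based record number to (first_sector, offset, sectors_touched)."""
    # Same multiply the array needed
    byte_pos = (rrn - 1) * record_size
    # Extra step: convert the byte position to a sector and an offset in it
    first_sector, offset = divmod(byte_pos, sector_size)
    # A record can straddle a sector boundary, so more than one read may be needed
    last_sector = (byte_pos + record_size - 1) // sector_size
    return first_sector, offset, last_sector - first_sector + 1
```

With those numbers, `locate_record(100)` gives sector 38, offset 344, and
2 sectors to read, all before any data has been copied into the program.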

Even if that record is already paged into memory, you still have to do the
same calculations to access it from memory.   The read time will be much
less because the memory is faster than disk, but you still have to find it
and copy it into your program's memory.   It's still much faster to access
an array than a file on disk.

>
> From the Performance Management Redbook, section 10.8.8.2:
> http://www.redbooks.ibm.com/redbooks/GG243723/css/GG243723_275.html
>

And again, this is referring to the calculations involved in accessing an
array element, vs. a variable directly.   It's also referring to the
difference between a literal and a variable.   Sure, it's going to be
faster to access a variable directly than to calculate a position in an
array.

But, an array is still much faster than a file....

> For the sake of discussion, let's say that I'm using a work file INSTEAD of
> an array in my program.
>
> What is more efficient to determine if the value is NOT in the list:  KEY
> SETLL WKFILE and using %FOUND or an = indicator, OR ' ' LOKUP ARRAY,1 (or
> %LOOKUP) to find an unused element in the array?

Just as I said before, the advantage of using a file is that you can build
the index ahead of time and then use it when you need it.   That's where
you get the speed increase.
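
As a loose analogy for a "not in the list" check against a prebuilt, sorted
index, here's a Python sketch (not RPG, and not a description of how SETLL
works internally).  It positions at the first key greater than or equal to
the argument and then tests for an exact match.  The function name and key
list are made up for the example.

```python
from bisect import bisect_left


def set_lower_limit(sorted_keys, target):
    """Return (position, equal): first position with key >= target,
    and whether the key at that position matches exactly."""
    pos = bisect_left(sorted_keys, target)
    equal = pos < len(sorted_keys) and sorted_keys[pos] == target
    return pos, equal


# Hypothetical index that was built ahead of time and kept in key order
sorted_keys = ["A100", "B200", "C300"]
```

The value is NOT in the list whenever `equal` comes back false.  The check
is only cheap because the sorted index already exists when you need it.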

>
> Since SETLL doesn't actually perform an I/O, I would guess that it is
> quite efficient.  I may be totally wrong...but, having grown up on a farm,
> my mom always said that I would argue with a fence post!

SETLL doesn't perform I/O?   Huh?   How does it locate a record on disk
without reading from disk?  (reading being the "i" in "i/o")   Does it
have psychic powers?

