RE: array handling -- MIDRANGE-L

The way I have been handling large arrays is to use DESCEND and load from
the start to the end by reading backwards thru the file.  (*hival setll
file) (readp file)  When I do the look up, I specify eq and low indicators.
Only looks thru until either condition is met.  If low comes on, I know I
did not get a hit.

Now if the system does a binary search, will specifying EQ/LOW cause a
sequential search?  I always thought it was sequential, there for I wanted
to stop when reaching a value less than the search value.  Would I improve
performance by removing the LOW?

This thread has opened my eyes even more.

TIA,

Christopher K. Bipes      mailto:ChrisB@Cross-Check.com
Operations & Network Mgr  mailto:Chris_Bipes@Yahoo.com
CrossCheck, Inc.                  http://www.cross-check.com
6119 State Farm Drive     Phone: 707 586-0551 x 1102
Rohnert Park CA  94928    Fax: 707 586-1884

-----Original Message-----
From: jt [mailto:jt@ee.net]
Sent: Sunday, December 02, 2001 3:40 PM
To: midrange-l@midrange.com
Subject: RE: array handling


Joe,

I thought I knew a thing about array handling...!  That loading backward is
a new one on me...  (Don't let it go to your head, though, because I find
out something new most everyday...;-)

But you said ASCEND causes a binary search...?!?  Sheesh...  When did THAT
happen...?!?  (I'd always thought it still used sequential search, and
ASCEND just allowed for *LT or *GT type lookups.)  That'd be two things,
today...:-)


I like the technique of not initialize the index.  Particularly useful when
you're reading data in semi-ordered.  For example, I've needed a primary
subtotal on FieldA and a secondary subtotal on FieldB.  But I needed some
data, on each record, from the FieldB master file (of about 200 records).
Array worked real well, in this scenario.

But worked even better when I left the starting index the same, because it
generally hit on the next FieldB in the array.  I just had to add one extra
lookup, to starting back at the beginning of the array, on the rare occasion
the first lookup didn't find a hit.  (I knew 99+% of the values in FieldB
were valid on the FieldB master.)  I don't know if this technique has a
name, but I thought of it as kind-of a round-robin lookup.  Given the data
patterns, probably quicker than a binary search.

(It still seems like a topic for the RPG400-L, but I'm not on that one
although, obviously, but I should be...:-)


jt

| -----Original Message-----
| From: midrange-l-admin@midrange.com
| [mailto:midrange-l-admin@midrange.com]On Behalf Of Joe Pluta
| Sent: Sunday, December 02, 2001 7:17 AM
| To: midrange-l@midrange.com
| Subject: RE: array handling
| Importance: High
|
|
| No, you're right on-base here, Rich.  You can significantly reduce your
| overhead this way; I use caching a lot for small tables.  Since the number
| of entries is so small, I don't think we'll be able to
| significantly reduce
| your processing time by the backloading technique, but let me
| explain it and
| see if you're interested in implementing it.
|
| A LOOKUP operation has to do "N" compare operations.  The worst case is an
| unsorted array where there is no hit; in this case, "N" is the number of
| entries in the array.  (Of course, if you're getting REALLY pedantic, it's
| actually doing twice that many compares, along with a number of increments
| equal to the size of the array, but I think you get the picture.)
|
| As your array gets larger, this can add up to a lot of cycles.  There are
| two ways to reduce the number of compares.  First is to order the array -
| that is, make sure the array is in sequence.  Knowing that, the
| compiler can
| generate a much more efficient LOOKUP algorithm (using a binary
| scan), where
| the maximum number of compares is actually ((number of entries)
| log 2).  For
| a 20 element array, that's five compares, or a potential savings
| of 75%.  On
| larger arrays, the savings is even more significant.  The only downside is
| making sure that all the unused entries at the end of your array
| have "high
| values", such as X'FF'.
|
| Now you run into the problem of unused elements.  Let's say you make your
| array 1000 elements long, but you only use 20.  In that case, most of the
| time is spent chceking unused elements.  Even with an ascending array, you
| still have to do 10 compares, twice the number required to handle 32
| entries.  One of the nice things about the LOOKUP operation in RPG is that
| you can specify a starting position in the array; it will only scan the
| elements from that index forward.  So, what you do is you load the array
| BACKWARD, starting with the last position in the array.  Then,
| whenever you
| do a LOOKUP, you start with the last position that you loaded.
| Depending on
| the number of entries and the size of your array, you can
| probably see that
| this can significantly reduce your processing.
|
| So what's the best for you?  In your case, since the number of elements
| doesn't fluctuate very much, you should be able to tune your array size to

| be close to the number of entries you need.  You may occasionally have to
| resize your array, and you MAY want to put a check in your maintenance
| program to notify you if someone tries to add more records to the
| file than
| your cache program expects.  But the backloading technique is probably
| overkill - it's more relevant when you have to define a really BIG array
| that may only have a small number of elements in it.
|
| On the other hand, since you ARE doing this for performance
| reasons, making
| your array ascending may be a smart move.  Regardless of the
| size, ascending
| arrays allows the compiler to generate the far more efficient
| binary search
| algorithm, which should reduce processing even in your case.
| Simply add the
| ASCEND keyword to the D-spec defining the array.  Then, before
| you load your
| array, do a MOVEA *HIVAL to preload the array with X'FF':
|
| D ARY1            S              2    DIM(20) ASCEND
| (...)
| C                   MOVEA     *HIVAL        ARY1
|
| Hope this helps a little.
|
| Joe Pluta
| www.plutabrothers.com
|

_______________________________________________
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@midrange.com
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/cgi-bin/listinfo/midrange-l
or email: MIDRANGE-L-request@midrange.com
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.