RE: Limiting SQL results returned -- MIDRANGE-L

-----Original Message-----
From: midrange-l-bounces@xxxxxxxxxxxx
[mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of CRPence
Sent: Wednesday, August 01, 2007 11:13 AM
To: midrange-l@xxxxxxxxxxxx
Subject: Re: Limiting SQL results returned

Yes, that should work, but for those to whom it might not
be obvious, there is the _caveat_ that T1 generates a temporary
table of all rows in the original file [of the named fields &
expression]. From the original description that would not seem
to be an issue [given there might not even be five rows for
any one status], but the number of rows might make such an
implementation prohibitive for some tables and/or environments.

Yep, ordering by rand() could be time-consuming.

For very large result sets, an alternate method might be
to use an external table function which returns a given
number of rows from that table for a specific key value.
That program could use row-level access to decide what rows
to return. Some index might be available and conducive to
generating a relatively quick sampling, although /random/ sampling is
probably not nearly as easy as with rand() and ordering. If
the program is in its own activation group and can leave the
file open for the repeated calls for each status, the file
would be opened only for that named activation, thus avoiding
open overhead.

My initial thought for random sampling from a large data set would be a user-defined table
function (UDTF) that returns a set of RRNs, where the RRNs returned are calculated via
for x = 1 to #randomRecs
  rrnset(x) = int(rand() * max(rrn(myfile))) + 1
endfor

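The only issue here would be deleted records: a generated RRN could point at a deleted row, so those
picks would have to be skipped or redrawn.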
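
A minimal Python sketch of the RRN idea, with one possible way to cope with deleted records (draw again when the pick lands on one). The file is simulated as a list in which None marks a deleted slot; pick_random_rrns and the sample data are hypothetical names for illustration, not anything provided by the database.

```python
import random

def pick_random_rrns(slots, how_many, rng=random):
    """Return `how_many` distinct RRNs (1-based) of live records.

    `slots` simulates the file: slots[i] is the record at RRN i+1,
    or None if that record has been deleted.
    """
    max_rrn = len(slots)
    live = sum(1 for s in slots if s is not None)
    if how_many > live:
        raise ValueError("not enough live records")
    picked = set()
    while len(picked) < how_many:
        rrn = int(rng.random() * max_rrn) + 1  # 1..max_rrn
        if slots[rrn - 1] is None:             # deleted record: draw again
            continue
        picked.add(rrn)                        # the set drops repeated picks
    return sorted(picked)

# Hypothetical data: RRNs 3 and 6 are deleted.
slots = ["a", "b", None, "d", "e", None, "g", "h"]
rrns = pick_random_rrns(slots, 4, random.Random(1))
```

If a large share of the file is deleted records, the redraw loop wastes more picks, so this approach suits files that are reorganized reasonably often.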

But the OP wanted random recs within status, which adds some complexity. It mostly depends on your
definition of random.

1) 5 random records from within each set of records with status = 'X'

2) keep picking random records till you get (at least) 5 of each status.

Option 1 would be more difficult. rand() * countOfRecsWithStatus gives you a record to pick but,
unlike the RRN above, it isn't directly usable to access the record. I don't think you'd want to READE
through a file to get to the nth record with status 'X'. Then again, maybe it wouldn't be so bad.
However, I think I'd consider building a work file ordered by status. You would know where each
status starts, so you could programmatically compute a direct key.
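
The work-file idea for option 1 might look roughly like the following Python sketch. It assumes the records fit in a sorted list standing in for the work file; sample_per_status and the (status, payload) record layout are hypothetical, chosen only for illustration.

```python
import random
from itertools import groupby

def sample_per_status(records, per_status, rng=random):
    """records: iterable of (status, payload) pairs.

    Returns {status: [payload, ...]} with up to `per_status` random
    payloads for each status.
    """
    # The "work file": every record, ordered by status.
    work = sorted(records, key=lambda r: r[0])
    # Note where each status starts and how many rows it has.
    ranges = {}
    pos = 0
    for status, group in groupby(work, key=lambda r: r[0]):
        count = sum(1 for _ in group)
        ranges[status] = (pos, count)
        pos += count
    result = {}
    for status, (start, count) in ranges.items():
        # Distinct random offsets inside this status; a status with
        # fewer rows than requested simply returns all of them.
        offsets = rng.sample(range(count), min(per_status, count))
        # start + offset is the "direct key" into the work file.
        result[status] = [work[start + off][1] for off in offsets]
    return result

# Hypothetical data: three statuses of very different sizes.
records = ([("A", i) for i in range(20)]
           + [("B", i) for i in range(3)]
           + [("C", i) for i in range(50)])
picked = sample_per_status(records, 5, random.Random(7))
```

The cost is one pass to build and order the work file; after that, each pick is a direct access rather than a READE loop.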

Option 2 would be pretty easy. The downside is that it could take lots of random picks until you have 5 of
each status.
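
A rough Python sketch of option 2, under the assumption that the per-status counts are known up front so the loop can stop when a status has fewer than 5 rows. sample_until_enough is a hypothetical name, and this variant keeps only the first 5 of each status rather than "at least" 5.

```python
import random

def sample_until_enough(records, per_status, rng=random):
    """records: list of (status, payload) pairs.

    Returns ({status: [payload, ...]}, number_of_picks).
    """
    # Target per status, capped at what exists so the loop can end.
    totals = {}
    for status, _ in records:
        totals[status] = totals.get(status, 0) + 1
    need = {s: min(per_status, n) for s, n in totals.items()}
    result = {s: [] for s in need}
    seen = set()   # indexes already drawn, so no record is used twice
    picks = 0
    while any(len(result[s]) < need[s] for s in need):
        i = rng.randrange(len(records))
        picks += 1
        if i in seen:
            continue
        seen.add(i)
        status, payload = records[i]
        if len(result[status]) < need[status]:
            result[status].append(payload)
    return result, picks

# Hypothetical data: status "B" is rare, which drives up the pick count.
records = [("A", i) for i in range(95)] + [("B", i) for i in range(5)]
result, picks = sample_until_enough(records, 5, random.Random(3))
```

With a rare status like "B" here, picks ends up far larger than the 10 records actually kept, which is the downside described above.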

Just some thoughts...

Charles
