MIDRANGE dot COM Mailing List Archive




Re: how to sort a file in place?




On 23 Jan 2013 10:42, Dan Kimmel wrote:
What does RGZPFM actually accomplish? We know only two things.

1) it eliminates deleted records

2) it changes the RRN to match the order of the index given in
KEYFILE.

We don't know that it actually arranges the records of the table
in anything resembling sequential order, though it does give that
appearance when using DSPPFM.

While I normally would agree and argue similarly, that the implementation may not match the effect, in this case the actual implementation is unlikely to change. And because this is a LIC implementation detail rather than an OS implementation detail, the question is at least somewhat irrelevant: user code remains blissfully unaware of it and cannot incorrectly code a dependency upon an implementation detail that might change.

I know that the data is arranged sequentially, in physical order, within the data segments of the dataspace, and that the segments of the dataspaces are logically ordered sequentially even though they are physically scattered on DASD.

I know that DSPPFM obtains the data using the arrival access path. For some time I /owned/ QNFBROWS, the code that implements the feature.

We assume that it reorganizes data into some efficient structure.
We know this only because some sequential accesses have improved
performance. How that is actually accomplished probably has no
relationship to any kind of sequential organization of the physical
records.

I actually know, rather than merely assume, how the database has always organized the data; I am lucky that way :-) but anyone with STRSST D/A/D capability can dump a dataspace to see the effects. And just as the RGZPFM help text suggests, the reorganization of the data [if requested, beyond just compressing out deleted records] is accomplished by physically sequencing\collating the rows to match the Access Path from the KEYFILE specification. The LIC-implemented reorganize feature [i.e. using ALWCANCEL(*NO) on RGZPFM] effectively copies all of the data and creates new segments filled with the ordered data. If the request is not canceled, then the existing dataspace is assigned that new ordered collection of segments, and the LIC frees the storage by destroying the old data segments.

I know that for the sequential access method the database will generally not fault database pages, because retrieval via the arrival access path knows the next segment in [logical] sequence and can page it in without faulting. So the reorganized data enables better performance for sequential access specifically because the data is actually arranged in physical order within and across logically ordered segments that match the arrival sequence access path.

CRPence on Wednesday, January 23, 2013 12:30 PM wrote:

On 23 Jan 2013 07:23, Stone, Joel wrote:
I would like to sort WORKFILE1 in place, preferably using SQL.

Something to consider is "What is the reason for wanting to do
so?"; and having shared that, contributors might offer
alternatives.

Is this possible?

In effect, yes, but not with SQL. The RGZPFM CL command provides a
means to reorganize the physical data in a database physical file
member.

Physical order of data is inconsequential to the relational model,
in which data are unordered sets. And so, with SQL as the language
that provides data access for the RDBMS, there is effectively
nothing that SQL provides to physically order the data within a
TABLE.
Collation is an attribute of the request to extract data [the
run-time SELECT], rather than an attribute of the physical data.
That is because there can be many different possible collations for
the same set of data; most notably, DESCending vs ASCending, but
also according to data encoding and language\locale preferences,
and across any variety of expressions\columns.

Or must I create another table with SQL CREATE and then CPYF it
back to the original file?

That is one of a number of ways to effect a desired physical
order. But the value from doing so is limited, except for a static
TABLE and for the arrival sequence access method.

The above should say "sequential access method", which uses the "arrival sequence access path".
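
For comparison, a copy-based variant might look like the sketch below. It is only one assumption-laden way to do it, and not necessarily what the original poster had in mind: instead of SQL CREATE it duplicates the file with CRTDUPOBJ, and it relies on CPYF reading a keyed logical file in key order when neither FROMRCD nor TORCD is specified. MYLIB, WORKFILE2, and WORKFILE1L are hypothetical names, and WORKFILE1L is assumed to be an existing keyed logical file over WORKFILE1 that shares its record format.

```cl
/* Sketch only: MYLIB, WORKFILE2 and WORKFILE1L are hypothetical.    */

/* Create an empty duplicate of the file (members, but no data).     */
CRTDUPOBJ OBJ(WORKFILE1) FROMLIB(MYLIB) OBJTYPE(*FILE) +
          TOLIB(MYLIB) NEWOBJ(WORKFILE2) DATA(*NO)

/* Read through the keyed logical file so the rows arrive in the     */
/* duplicate in key order.                                            */
CPYF FROMFILE(MYLIB/WORKFILE1L) TOFILE(MYLIB/WORKFILE2) +
     MBROPT(*REPLACE)

/* Replace the original data with the ordered copy, then clean up.   */
CPYF FROMFILE(MYLIB/WORKFILE2) TOFILE(MYLIB/WORKFILE1) +
     MBROPT(*REPLACE)
DLTF FILE(MYLIB/WORKFILE2)
```

This copies the data twice and needs exclusive use of WORKFILE1 for the final replace, which is part of why the single-command RGZPFM route is usually simpler.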

My preference is to use the RGZPFM with the KEYFILE parameter
naming an access path that defines my desired collation.
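
A minimal sketch of that approach follows. Only the file name WORKFILE1 comes from the original question; the library MYLIB and the keyed logical file WORKFILE1L (with a member of the same name) are hypothetical, and the logical file is assumed to already exist with the desired key.

```cl
/* Reorder the rows of WORKFILE1 to match the keyed access path of   */
/* the logical file WORKFILE1L; deleted records are removed as well. */
/* MYLIB and WORKFILE1L (file and member) are hypothetical names.    */
RGZPFM FILE(MYLIB/WORKFILE1) MBR(*FIRST) +
       KEYFILE(MYLIB/WORKFILE1L WORKFILE1L)

/* Browse the member in arrival sequence to check the new order.     */
DSPPFM FILE(MYLIB/WORKFILE1)
```

Rows added or changed afterward are not kept in that order, so the benefit applies mainly to a static file read by the sequential access method.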





