MIDRANGE dot COM Mailing List Archive



Home » MIDRANGE-L » February 2013

Re: how to retrieve records and make sure they arrive in RRN sequence



fixed

On 13 Feb 2013 11:48, Stone, Joel wrote:
Is UNION ALL any different than READE on the MEMO # as the only key?

The SQL operates as effective set theory. A union of two sets just has to include all tuples from both sets. The set {(1),(2)} is indistinguishable from the set {(2),(1)}; they are equivalent sets. Either of those sets could be the set returned by SQL when no ORDER BY was specified. A key has no relevance to UNION ALL, although a UNION might take advantage of an existing key to implement its distinct processing.... but that still does not imply the order of the final result set is predictable; i.e. the database still offers no specific collation *unless* the ORDER BY is specified.

I mean would SQL know to do anything differently than read by arrival
seq?

Yes. The SQL can use any path to the data. That statement alone confirms that the SQL can and does do other than retrieve with the sequential access method [thus other than producing order in arrival access].

An unordered SQL request is not limited to sequential access, and although unlikely [except index-only], the query could use a keyed access path to retrieve the data. The path chosen to create an unordered set could even make the final set appear to be ordered, but closer inspection may find that the presumption about the appearance was wrong; e.g. UNION DISTINCT with no ORDER BY. AFaIK there is no way to force the optimizer to choose and use only a path that is going to preserve arrival access order. The OPNQRYF does provide an interface like that, via KEYFLD(*FILE) when the single file being queried is not keyed, but that is not the SQL.

The database also can use multiple tasks [effective threads] to access the data. A query that gets all data from a file member could be split across multiple tasks [I/O parallel processing], where each task processes a specific number and group of sequential rows, from which the final set may have any unpredictable order. The CHGQRYA DEGREE(*NONE) I believe prevents [versus merely tempers the query from using] this type of parallel processing, but that has to be implemented as a prerequisite for each use of that query; as a failure to do so would could see the ordering change... but as I noted, I believe that is still possible from whatever access method is chosen by the optimizer. This had been the most typical origin I had seen for change from arrival, since parallel data access was enabled and used in the queries that had previously been consistently performing using arrival.

There is also Symmetric Multiprocessing (SMP), if that product is installed. The effects of that are easily prevented by CHGQRYA DEGREE(*NONE).

There is also effects from use of DB2 MultiSystem, if that product is installed.

I don't know much about the inner workings of SQL on DB2/400, but it
seems unlikely that the SQL engine would start in the middle of a
file and read backwards & forwards. Also unlikely that it would
scramble the sequence from arrival.

Even if there had never been anything seen other than arrival until parallel was activated, AFaIK there was and is nothing preventing the optimizer from choosing a path to the data, that in the end, effects something other than arrival sequence. A division of work into two, would do well to start in the middle, with one reading forward and one reading backwards... if there was similar optimization for reading backward as there is for reading forward, because the first access by both would likely hit the same page; more likely however, is that any such division would be one starting at the first row reading forward to the middle and the other starting in the middle+1row reading forward toward the end. But any division of work does not imply that the ordering of the output from each would be consistent.

The easiest thing to do, is to concede that without an ORDER BY the collation of rows from a query is *unpredictable* and thus the arrival sequence effect should never be assumed so as to prevent encountering an instance of a failed assumption. To assume that some undesirable outcome is *unlikely* versus accepting that the desired outcome can not be ensured, seems to me like folly.

Yes this could occur in RPG READE also (without a second key to
guarantee proper ordering of the records). But it doesn't fail -
rather it works well.

The ISAM provides a specification for ordering of duplicates. I believe the specifications are FCFO, LIFO, and FIFO, with the latter being the default, for which the appearance of rows [after ordering by the key the row remains a duplicate] is arrival.

Would SQL retrieval be any more likely to scramble the sequence than
RPG READE or Cobol or any other access method?

Yes, SQL is much more likely to return data in something other than arrival sequence because ISAM rules for duplicates do not apply to the SQL access. And the SQL has no requirement to effect any particular retrieval method, whereas the ISAM must follow the rules for ISAM. So the Sequential Access Method does exactly as defined for the request being made, while the SQL has no definition\rules with regard to what row is next if the SQL does not request an ORDER BY.






Return to Archive home page | Return to MIDRANGE.COM home page

This mailing list archive is Copyright 1997-2014 by MIDRANGE dot COM and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available here. If you have questions about this, please contact