On 19-Apr-2015 08:01 -0500, Darryl Freinkel wrote:
Yesterday, I reran the extracts and splits. The code was identical,
just a new run. This time everything worked correctly.
This machine is a pretty busy machine and my suspicion is that
something is going wrong when the system is busy.
I would not suggest there is no defect, but I will state that I could
not find any defects when searching on the most probable keywords, RRN
and INCORROUT. I would also expect both that the HIPers are being
regularly applied as maintenance to the system experiencing the
problem, and that any already-identified and corrected defect involving
INCORROUT for RRN would have been included as a HIPer APAR correction.
IMO the still-likely origin of the issue, per the mention of use of the
SQL DELETE statement [though IIRC, referred to as the /DELETE command/,
noted as being used to /clear/ the_big_file], is that on the occasions
the failure is detected, the_big_file had not actually been cleared,
and thus there were many deleted rows; and as repeated in at least two
followups, the_big_file is noted to have REUSEDLT(*YES).
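For illustration only [the library name MYLIB below is a placeholder,
as is naming the_big_file as a file THE_BIG_FILE; neither is from the
thread], the distinction is visible on the system; DSPFD shows the
reuse attribute, and a mass SQL DELETE that did not qualify for a
fast-delete leaves deleted rows behind, whereas CLRPFM does not:

  DSPFD  FILE(MYLIB/THE_BIG_FILE) TYPE(*ATR) /* shows "Reuse deleted
                                                records":  REUSEDLT     */
  DELETE FROM mylib.the_big_file             -- SQL; deleted rows may remain
  CLRPFM FILE(MYLIB/THE_BIG_FILE)            /* member truly cleared; no
                                                deleted rows remain     */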
I have seen this in the past with the buffering. I would then change
the force write ratio to 1 and force the data to disk every
write/update.
A common misconception of the Force Write Ratio is that there is some
relationship to the visibility of the data by the DB; there is none,
with regard to the data /in the file/. The data is visible in the file
per the Single-Level-Store, irrespective of the data being /forced/ to
disk. Forcing the data to disk is simply an assurance that the data
remains available after a crash in which main storage is lost, by
forcibly ensuring the data did not remain only in memory; but under
SLS, in effect, the memory is the disk and the disk is the memory, at
least while the system is active.
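For completeness, the quoted suggestion would be effected with
something like the following [placeholder names again]; either form
only forces the writes to disk, and has no bearing on the visibility
of the data within the file:

  CHGPF  FILE(MYLIB/THE_BIG_FILE) FRCRATIO(1)           /* permanent file
                                                           attribute    */
  OVRDBF FILE(THE_BIG_FILE) FRCRATIO(1) OVRSCOPE(*JOB)  /* or a per-job
                                                           override     */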
The visibility issue is a programming issue involving buffered data
that has never even been sent to the database, thus is outside the
purview of the database; the Data Management for the application may
have a buffer of data that was not yet sent to the database and thus is
not visible to the database, neither in memory nor on disk [which from
the SLS perspective, are the same]. A smaller force-write ratio,
however, can effect an override to the buffering characteristics of the
DM, such that [in effect] the DM could reduce the internal buffer size
to match the force-write ratio. That might be done having inferred
that the lower number must be more appropriate for that open; i.e. if
the program was coded [explicitly, or implicitly by the HLL compiler
and run-time decisions] to buffer up to 60 records, but the opened
member will force every 10 records due to a smaller FRCRATIO()
specification, then a logical inference by the DM is that the user has
made allowance to lose only up to 10 records from memory, so presumably
they would not want to allow the loss of up to 60 records from the DM
buffer, and so the buffer might be similarly restricted to allow for
the loss of only 10 records.
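A concrete [and hypothetical] illustration of that inference, with
placeholder names:

  OVRDBF FILE(THE_BIG_FILE) FRCRATIO(10)
  /* A program whose open would otherwise block, say, 60 records
     [whether coded explicitly or chosen by the HLL compiler and
     run-time] might have its DM buffer reduced toward 10 records,
     to match the force-write ratio, per the inference above.        */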
Yet that same effect of making the DM buffer smaller, whereby the data
becomes directly visible to\within the database [rather than
languishing in a buffer], is more appropriately established with the
/Sequential Only/ feature [and\or the RPG BLOCK keyword for an F-spec].
That is because reducing the buffer via the FRCRATIO() has a direct
[negative] impact, as an increase in disk I\O operations; sure, with
the smaller buffer there would still be more logical I\O requests for
the same number of records, but the number of actual disk operations by
which the records are forced to disk would be throttled according to
the system rather than arbitrarily being performed after some small and
constant number of records had been written.
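That is, to shrink the buffer without also demanding forced writes,
the blocking itself can be specified; a sketch, with a placeholder file
name and an arbitrary blocking factor:

  OVRDBF FILE(THE_BIG_FILE) SEQONLY(*YES 10)  /* block of 10 records for
                                                 sequential-only opens;
                                                 the system still decides
                                                 when pages go to disk  */

The RPG BLOCK(*YES) keyword similarly leaves the choice of blocking to
the compiler and run-time, rather than tying it to a force ratio.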
This seems to be a similar issue.
Seems quite a stretch to infer anything similar betwixt. In the case
of the_big_file being completely constructed, and completely populated
post-/extracts/, all of the /splits/ would see the data whether or not
any of the data was paged-out\forced to disk or still remained only in
main storage (memory).
What IMO more likely *could* be an issue, is either that the /splits/
started running concurrently with the /extracts/, or that the
pre-/extracts/ attempt to _clear_ had instead performed a standard
DELETE [for which a fast-delete had not effected the same as CLRPFM],
such that deleted records remained and the /extracts/ would not
necessarily fill-the-gaps; or both.
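A simple way to test that theory the next time the failure is detected
[placeholder names once more] is to capture the member statistics
before anything is re-run, and to eliminate any deleted rows before the
/extracts/ repopulate the file:

  DSPFD  FILE(MYLIB/THE_BIG_FILE) TYPE(*MBR)  /* note "Current number of
                                                 deleted records"       */
  RGZPFM FILE(MYLIB/THE_BIG_FILE)             /* or CLRPFM, to remove
                                                 the deleted rows       */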
As the process worked this weekend, I am not going to use CPYF.
The currently implemented process has [AFaIK, in this thread] never
been well-documented. Perhaps describing the processing that takes
place, effectively as a script of the actions performed, would help
clarify the scenario; for any scripted requests that run concurrently,
showing them as /submitted/ work would make the asynchronous effect
conspicuous; i.e. rather than presenting those as just the next
sequential request, from which the likely inference is that the request
is the next synchronous request.
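For example, a hypothetical outline [program names invented here,
purely to show the distinction] makes the concurrency conspicuous at a
glance:

  CLRPFM FILE(MYLIB/THE_BIG_FILE)                   /* the clear        */
  CALL   PGM(MYLIB/EXTRACTS)                        /* synchronous      */
  SBMJOB CMD(CALL PGM(MYLIB/SPLIT01)) JOB(SPLIT01)  /* asynchronous     */
  SBMJOB CMD(CALL PGM(MYLIB/SPLIT02)) JOB(SPLIT02)  /* asynchronous     */

Whether the /splits/ appear as CALL or as SBMJOB in such a description
would immediately answer whether they could ever overlap the
/extracts/.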