On 19-Apr-2015 08:01 -0500, Darryl Freinkel wrote:
Yesterday, I reran the extracts and splits. The code was identical,
just a new run. This time everything worked correctly.
This machine is a pretty busy machine and my suspicion is that
something is going wrong when the system is busy.
I would not suggest there is no defect, but I will state that I could
not find any defects when searching on the most probable keywords, RRN
and INCORROUT. I would also expect both that the HIPers are being
regularly applied as maintenance to the system experiencing the
problem, and that any already-identified and corrected defect involving
INCORROUT for RRN would have been included as a HIPer APAR correction.
IMO the still-likely origin of the issue, per the mention of use of the
SQL DELETE statement [though IIRC, referred to as the /DELETE command/,
noted as being used to /clear/ the_big_file], is that on the occasions
the failure is detected, the_big_file had not actually been cleared,
and thus there were many deleted rows; and as repeated in at least two
followups, the_big_file is noted to have REUSEDLT(*YES).
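For illustration only [the library name MYLIB below is a placeholder,
as is naming the_big_file as a file THE_BIG_FILE; neither is from the
thread], the distinction is visible on the system; DSPFD shows the
reuse attribute, and a mass SQL DELETE that did not qualify for a
fast-delete leaves deleted rows behind, whereas CLRPFM does not:

  DSPFD  FILE(MYLIB/THE_BIG_FILE) TYPE(*ATR) /* shows "Reuse deleted
                                                records":  REUSEDLT     */
  DELETE FROM mylib.the_big_file             -- SQL; deleted rows may remain
  CLRPFM FILE(MYLIB/THE_BIG_FILE)            /* member truly cleared; no
                                                deleted rows remain     */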
I have seen this in the past with the buffering. I would then change
the force write ratio to 1 and force the data to disk every
write/update.
A common misconception of the Force Write Ratio is that there is some
relationship to the visibility of the data by the DB; there is none,
with regard to the data /in the file/. The data is visible in the file
per the Single-Level-Store, irrespective of the data being /forced/ to
disk. Forcing the data to disk is simply an assurance that the data
remains available after a crash in which main storage is lost, by
forcibly ensuring the data did not remain only in memory; but under
SLS, in effect, the memory is the disk and the disk is the memory, at
least while the system is active.
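For completeness, the quoted suggestion would be effected with
something like the following [placeholder names again]; either form
only forces the writes to disk, and has no bearing on the visibility
of the data within the file:

  CHGPF  FILE(MYLIB/THE_BIG_FILE) FRCRATIO(1)           /* permanent file
                                                           attribute    */
  OVRDBF FILE(THE_BIG_FILE) FRCRATIO(1) OVRSCOPE(*JOB)  /* or a per-job
                                                           override     */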
The visibility issue is a programming issue involving buffered data
that has never even been sent to the database, thus is outside the
purview of the database; the Data Management for the application may
have a buffer of data that was not yet sent to the database and thus is
not visible to the database, neither in memory nor on disk [which from
the SLS perspective, are the same]. A smaller force-write ratio,
however, can effect an override to the buffering characteristics of the
DM, such that [in effect] the DM could reduce the internal buffer size
to match the force-write ratio. That might be done having inferred
that the lower number must be more appropriate for that open; i.e. if
the program was coded [explicitly, or implicitly by the HLL compiler
and run-time decisions] to buffer up to 60 records, but the opened
member will force every 10 records due to a smaller FRCRATIO()
specification, then a logical inference by the DM is that the user has
made allowance to lose only up to 10 records from memory, so presumably
they would not want to allow the loss of up to 60 records from the DM
buffer, and so the buffer might be similarly restricted to allow for
the loss of only 10 records.
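A concrete [and hypothetical] illustration of that inference, with
placeholder names:

  OVRDBF FILE(THE_BIG_FILE) FRCRATIO(10)
  /* A program whose open would otherwise block, say, 60 records
     [whether coded explicitly or chosen by the HLL compiler and
     run-time] might have its DM buffer reduced toward 10 records,
     to match the force-write ratio, per the inference above.        */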
Yet that same effect of making the DM buffer smaller, whereby the data
becomes directly visible to\within the database [rather than
languishing in a buffer], is more appropriately established with the
/Sequential Only/ feature [and\or the RPG BLOCK keyword for an F-spec].
That is because reducing the buffer via the FRCRATIO() has a direct
[negative] impact, as an increase in disk I\O operations; sure, with
the smaller buffer there would still be more logical I\O requests for
the same number of records, but the number of actual disk operations by
which the records are forced to disk would be throttled according to
the system rather than arbitrarily being performed after some small and
constant number of records had been written.
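That is, to shrink the buffer without also demanding forced writes,
the blocking itself can be specified; a sketch, with a placeholder file
name and an arbitrary blocking factor:

  OVRDBF FILE(THE_BIG_FILE) SEQONLY(*YES 10)  /* block of 10 records for
                                                 sequential-only opens;
                                                 the system still decides
                                                 when pages go to disk  */

The RPG BLOCK(*YES) keyword similarly leaves the choice of blocking to
the compiler and run-time, rather than tying it to a force ratio.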
This seems to be a similar issue.
Seems quite a stretch to infer anything similar betwixt. In the case
of the_big_file being completely constructed, and completely populated
post-/extracts/, all of the /splits/ would see the data whether or not
any of the data was paged-out\forced to disk or still remained only in
main storage (memory).
What IMO more likely *could* be an issue, is either that the /splits/
started running concurrently with the /extracts/, or that the
pre-/extracts/ attempt to _clear_ had instead performed a standard
DELETE [for which a fast-delete had not effected the same as CLRPFM],
such that deleted records remained and the /extracts/ would not
necessarily fill-the-gaps; or both.
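A simple way to test that theory the next time the failure is detected
[placeholder names once more] is to capture the member statistics
before anything is re-run, and to eliminate any deleted rows before the
/extracts/ repopulate the file:

  DSPFD  FILE(MYLIB/THE_BIG_FILE) TYPE(*MBR)  /* note "Current number of
                                                 deleted records"       */
  RGZPFM FILE(MYLIB/THE_BIG_FILE)             /* or CLRPFM, to remove
                                                 the deleted rows       */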
As the process worked this weekend, I am not going to use CPYF.
The currently implemented process has [AFaIK, in this thread] never
been well-documented. Perhaps describing the processing that takes
place, effectively as a script of the actions performed, would help
clarify the scenario; for any scripted requests that run concurrently,
showing them as /submitted/ work would make the asynchronous effect
conspicuous; i.e. rather than presenting those as just the next
sequential request, from which the likely inference is that the request
is the next synchronous request.
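For example, a hypothetical outline [program names invented here,
purely to show the distinction] makes the concurrency conspicuous at a
glance:

  CLRPFM FILE(MYLIB/THE_BIG_FILE)                   /* the clear        */
  CALL   PGM(MYLIB/EXTRACTS)                        /* synchronous      */
  SBMJOB CMD(CALL PGM(MYLIB/SPLIT01)) JOB(SPLIT01)  /* asynchronous     */
  SBMJOB CMD(CALL PGM(MYLIB/SPLIT02)) JOB(SPLIT02)  /* asynchronous     */

Whether the /splits/ appear as CALL or as SBMJOB in such a description
would immediately answer whether they could ever overlap the
/extracts/.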