On 12-Aug-2014 17:45 -0500, Vinay Gavankar wrote:
<<SNIP>>
This is the full scenario:
File A is unique by Id, and has a trigger program for *INSERT and
*UPDATE (records are not being deleted from the file). The trigger
program writes the Id and the Before and After Images to a data queue.
File A is being updated by multiple batch as well as interactive
programs, and it is "possible" that the same Id gets updated in rapid
succession (batch programs process hundreds of records per second).
The program I am working on reads the data queue and I want to
capture an audit trail by writing the Id, Seq# and the Before Image
of the record. There are multiple instances of this program running,
reading from the same data queue.
<<SNIP>>
It is also important that the records are written in the sequence
they were updated; that is why I wanted to avoid trapping for the
error on the Write and looping, in the first place.
Adding this logic in the trigger program would of course solve the
problem, but that is out of the question, as it would slow down all
of the batch programs.
This new information, both about how the data being written to the file
was derived [read from a data queue, concurrently] and about how the data
must be written to that file in chronological order, complicates the
scenario. Prior suggestions [mine, and as I recall those of others also]
are effectively nullified, without either some additional work to effect
this newly identified serialization requirement or instead a change to the
design\implementation toward the same ends.
As I now understand the scenario, the file to which the data is being
written serves as an effective journal receiver, logging the /sequential/
I/O operations performed against a particular file with a trigger; the
file with the trigger will be called the audited-file, and the file to
which the effective audit-records are written will be called the log-file.
The /sequence/ of the update activity performed against any particular Id
value in the audited-file, stored as rows written to the log-file, must
reflect the order of the [triggered] I/O operations performed against the
[particular Id value of the] audited-file; each successive update [to a
particular Id value] should have [however seemingly illogical] a
decremented sequence number.
Serializing the writes merely to prevent duplicates and to effect a
decremented sequence is quite simply accomplished using techniques already
described in prior responses. Serializing the writes to *also* reflect a
chronology is not so simple. Although obtained serially and stored as
such, the data in the described scenario can be read concurrently, and
thus lacks any predictable order when that data is then written to the
log-file *unless* each writer is serialized [reflecting the same order in
which each message\entry was removed from the data queue].
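As a sketch only of that simpler duplicate-prevention\decrement technique
[embedded-SQL style, reusing the hypothetical AUDITLOG table from above;
this does nothing to guarantee chronology], each writer could serialize
on an exclusive table lock, derive the next decremented sequence under
that lock, and release the lock at the commit:

  LOCK TABLE auditlog IN EXCLUSIVE MODE;
  INSERT INTO auditlog (id, seqnbr, before_img)
    SELECT :id, COALESCE(MIN(seqnbr), 0) - 1, :before_img
      FROM auditlog
     WHERE id = :id;
  COMMIT;   -- ends the unit of work and frees the lock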
The problem of course is the asynchronous nature of the concurrent reads
from the data queue, giving rise to an unpredictable order of insertions
into the log-file. Intuitively, the invocation doing the insert operation
cannot be responsible for determining the next sequence number.
A conspicuous means to resolve this would seem to involve a timestamp
[along with the default and presumed-valid assumption that time will only
ever move forward], taken either when the I/O occurred or when the data
was removed from the queue; at either point in the process, the data is
known to be properly ordered. Each invocation performing the insertion of
the audit-row would then be required not only to add the new sequence
number, but also to ensure that the sequence reflects the information from
the timestamp. However, that is not easily accomplished, because just as
one process does not know where the entry it dequeued stood on the queue
relative to any other [that another process dequeued], each process is
also unaware of the timestamp obtained by any other [irrespective of
whether the timestamp was set when the entry was enqueued or when the
entry was dequeued].
To relax the requirement that the timestamp be resolved into a sequence
at the time of logging [writing to the log-file], the writer could simply
store the timestamp instead of a sequence number; the sequencing would be
deferred to read-time, for example using an OLAP query to establish a
ROW_NUMBER() with an ORDER over the timestamp column.
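A read-time sequencing sketch, assuming each writer stores the dequeued
timestamp in a column I will call ENTRY_TS [again, a hypothetical name];
the DESC ordering assigns 1 to the most recent update of each Id, so each
successive update reports a lower number than its predecessor, mimicking
the decrementing scheme without any writer coordinating with another:

  SELECT id, before_img, entry_ts,
         ROW_NUMBER() OVER (PARTITION BY id
                            ORDER BY entry_ts DESC) AS seqnbr
    FROM auditlog
   ORDER BY id, entry_ts;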
I am not familiar with using SQL, but are you all saying that if
this file were to be defined in DDL, then the system will take care
of inserting the records in the correct sequence, even if more than
one additional request was made while the first one was still being
completed? If so, will that command also return the sequence number
just inserted, so that I can use it for some other purpose?
The SQL can define a file that will automatically populate a sequence
value, the Identity column, and the value of the just-inserted identity
is available to the program [that just inserted the row] in a couple of
different ways. However, that is moot, as the described scenario cannot
benefit directly from only that feature, as discussed above.
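For reference, a sketch of both retrieval methods [the table and host
variable names remain hypothetical]; IDENTITY_VAL_LOCAL() and FROM FINAL
TABLE are both standard DB2 for i SQL:

  CREATE TABLE auditlog2
  ( seqnbr      INTEGER GENERATED ALWAYS AS IDENTITY
  , id          CHAR(10)  NOT NULL
  , before_img  CHAR(512) NOT NULL
  );

  -- way 1: ask for the most recent identity assigned in this process
  INSERT INTO auditlog2 (id, before_img) VALUES (:id, :bimg);
  VALUES IDENTITY_VAL_LOCAL() INTO :seq;

  -- way 2: select the generated value directly from the insert
  SELECT seqnbr INTO :seq
    FROM FINAL TABLE
         ( INSERT INTO auditlog2 (id, before_img)
           VALUES (:id, :bimg) );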
--
Regards, Chuck