MIDRANGE dot COM Mailing List Archive



Home » MIDRANGE-L » January 2014

Re: Record locking question




Vinay:

You might get better answers on the COBOL400-L list at midrange.com ...

Reading your text description of the problem: why allow two separate jobs to execute this same program at the same time? Most of the "transaction-based" systems I am aware of, that use a "transaction file" and then run some batch program(s) to read that file and update the "master" file(s), restrict this process to allow only one job at a time to run the update program(s).

You can accomplish that in several ways:

1. ensure that these update jobs are always submitted to a
single-threaded job queue, and either:
2. have the program issue an ALCOBJ to obtain an *EXCLusive lock on the
"transaction file" for the duration of the program / job, or
3. have the program issue an ALCOBJ to obtain an *EXCLusive lock on the
*PGM object itself for the duration of the program invocation,
releasing the lock via DLCOBJ just before the program exits.
4. etc.
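For illustration, options 1 and 2 combined might look something like this in CL. This is only a sketch: TRANSLIB, TRANSFILE and UPDPGM are made-up names, and CPF1002 is the escape message ALCOBJ sends when the allocation fails.

    PGM
       /* Try for 60 seconds to get an exclusive lock on the      */
       /* transaction file; if another job already holds it, send */
       /* a message and quit instead of running concurrently.     */
       ALCOBJ     OBJ((TRANSLIB/TRANSFILE *FILE *EXCL)) WAIT(60)
       MONMSG     MSGID(CPF1002) EXEC(DO)
          SNDPGMMSG  MSG('Update already active - job ended')
          RETURN
       ENDDO
       CALL       PGM(TRANSLIB/UPDPGM)
       /* Release the lock when the update program finishes.      */
       DLCOBJ     OBJ((TRANSLIB/TRANSFILE *FILE *EXCL))
    ENDPGM

With the job queue single-threaded as well, a second submission simply waits in the queue rather than ever reaching the ALCOBJ.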

Hope that helps,

Mark S. Waterbury

> On 1/12/2014 1:53 PM, Vinay Gavankar wrote:
I am trying to figure out why the system behaved in 2 different ways when
it encountered a locked record. I was hoping for some insight from the
knowledgeable people in the group.

I apologize in advance for a rather long description and if it is in the
wrong forum.

Here is the information which I think is pertinent:

Two jobs were running the same COBOL program, which reads records from a
transaction file member, does a bunch of updates on some other files, and
then deletes the record from the transaction file.

One job was running for some time when the second job started and due to
some issue (not relevant to the problem at hand), opened and started
processing the same member of the transaction file as the first job.

After a few minutes, both jobs went into a loop, each waiting on a record
that the other job held locked, in two different files. Let us say the two
files were FILEA (which was the transaction file) and FILEB (which was
another file being updated).

Now both jobs have been ended and I am trying to figure out what happened
based on the job log.

Here are the relevant sections of the program:

    SELECT FILEA ASSIGN TO DATABASE-FILEA
        ORGANIZATION IS INDEXED
        ACCESS MODE IS DYNAMIC
        RECORD KEY IS EXTERNALLY-DESCRIBED-KEY
            WITH DUPLICATES
        FILE STATUS IS WS-FILE-STATUS.

    SELECT FILEB ASSIGN TO DATABASE-FILEB
        ORGANIZATION IS INDEXED
        ACCESS MODE IS DYNAMIC
        RECORD KEY IS EXTERNALLY-DESCRIBED-KEY
        FILE STATUS IS WS-FILE-STATUS.

    PROCEDURE DIVISION.
    ..
    ..
        PERFORM PROCA.
    ..
    ..

    PROCA.
    ..
    ..
        START FILEA KEY IS NOT <
                EXTERNALLY-DESCRIBED-KEY
            NOT INVALID KEY
                PERFORM PROCB THRU PROCB-EXIT
        END-START.

    PROCB.
        READ FILEA NEXT RECORD
            AT END
                SET C-EOF TO TRUE
                GO TO PROCB-EXIT
        END-READ.
    ..
        PERFORM PROCC.
    ..
        DELETE FILEA.
    ..
        CALL 'XYZ'   (This program does not access FILEA or FILEB)
        GO TO PROCB.

    PROCB-EXIT.
        EXIT.

    PROCC.
    ..
    ..
        START FILEB
                KEY IS NOT LESS THAN EXTERNALLY-DESCRIBED-KEY
            INVALID KEY
                SET NO-RECORD TO TRUE
            NOT INVALID KEY
                READ FILEB NEXT RECORD AT END
                    SET EOF TO TRUE
                END-READ
        END-START
        PERFORM PROCD UNTIL EOF OR
                Field1 > Field2
    ..
    ..
        WRITE FILEB-RECORD.

    PROCD.
        READ FILEB NEXT RECORD AT END
            SET EOF TO TRUE
        END-READ.


In the above code, the program locks a record from FILEB in PROCD and keeps
it locked (an issue I will be fixing) until the next record from FILEA
starts processing and PROCC is called again. Based on the data in the file,
FILEB never reaches the AT END condition; the PERFORM of PROCD ends based
on the values of Field1 and Field2.

This is what I deduced from the job logs when both jobs were looping:

Job 1 had completed processing record number (say) 20000 from FILEA and
deleted it (this was not actually in the job log, but was deduced). The
job log had only one message saying "Record 20002 was not available as it
was allocated to Job 2". The job was calling program 'XYZ' over and over
again, but the I/O count of FILEA was not changing, and no more messages
were being written to the job log.

Record 20001 was also processed and deleted, presumably by Job 2.

Job 2's job log had a message saying "Record nnn from FILEB was not
available, as it was allocated to Job 1". This message was being written to
the job log every minute.

As per my thinking, the only way Job 1 could loop and call XYZ over and
over is by jumping to the PROCB tag, where it would try to read FILEA. So
why wouldn't the I/O count change?

Job 2 was looping over PERFORM PROCD, writing the "record locked" message
to its job log every minute, and I would expect the same behavior from
Job 1.

Another thought I had was that once READ NEXT fails (due to a locked
record), all subsequent READ NEXTs should fail, as the file position would
be lost. In which case, how did Job 2 keep looking for the same record
every time it looped?
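As I understand it, on the IBM i a READ that times out on a locked record does not raise AT END; it fails with file status '9D' (record locked), and the file position is not advanced, so the next READ NEXT retries the same record, which would match Job 2 retrying every minute. A sketch of testing for that status explicitly, using the WS-FILE-STATUS field already declared in the SELECTs (HANDLE-LOCKED-RECORD is a made-up paragraph name for whatever retry or abort logic is wanted):

        READ FILEA NEXT RECORD
            AT END
                SET C-EOF TO TRUE
        END-READ
   *    '9D' = record locked by another job; the cursor did not
   *    move, so simply looping back will retry the same record.
        IF WS-FILE-STATUS = '9D'
            PERFORM HANDLE-LOCKED-RECORD
        END-IF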

Any ideas why it did what it did?

Even after Job 2 was killed, Job 1 continued looping with no more messages
to the job log.





