Re: Embedded SQL Ignoring OVRDBF -- MIDRANGE-L

Chuck,

Appreciate the time you’ve taken in providing your responses.

The code is inherited with precious little documentation and it rarely has any comment about “Why” it operates the way it does. Unfortunately, CLMAIN is also a critical application that is executed many times a day, but typically only by 3 users. All the executions are interactive and typically the user stays signed on for the entire shift but they run program other than CLMAIN. So far problem has occurred with only 1 of those users, but when it breaks people tend to get excited.

The downstream stack from the failure is extremely complicated. I’ve tried to scan most code for OVRDBF and DLTOVR but found very little and nothing obvious. Most of the downstream code is RPG and there appear to be few overrides, but since the code in CLMAIN does not loop back, I don’t see how the downstream code can impact anything.

I reviewed the user’s job log from the last successful execution of CLMAIN, and checked out the other programs that were executed before the CLMAIN that incurred the failure, and again I have found nothing obvious.

At the start of the failing program I’m going to put in an OPEN of A_FILE and check using the file info DS that A_FILEX is really being opened. If it isn’t I’ll do a DSPOVR and produce a DUMP in an attempt to distinguish between a SQL problem and an OVRDBF problem. Per your suggestion I’ll also review the SQL communication area and see if it shows the file opened in the cursor—if it does I’ll probably switch to checking that instead of doing an RPG open of the file. Or maybe I'll check both.

I think I’ll also do some scanning for OVRSCOPE(*JOB) in programs that could have run <before> CLMAIN, but I don’t think that has been used much. A_FILEX is a temporary copy the file and updated only in CLMAIN. A_FILE is the real file and is used in most code without any override, so I’m not hopeful, but you never know.

The PTF issue is out of my scope, but I think I can get a commitment to do it more often than we currently do…

Sam

On 5/28/2011 6:22 PM, CRPence wrote:

On 25-May-2011 16:59 , Sam_L wrote:

Has anyone experienced embedded SQL ignoring an OVRDBF and updating
the wrong file?

No. However the "fast delete" feature [using clear or alter-like
interface] in v5r4 and earlier used to implement high-percentage of row
DELETE activity did have an issue about honoring overrides that was
corrected by a PTF.

We have a CL program CLMAIN that contains conceptual code like this:

CPYF A_FILE to A_FILEX *REPLACE
OVRDBF A_FILE to A_FILEX
CALL RPG_A
CALL RPG_B
CALL RPG_C
Etc.
DLTOVR *ALL
RETURN

Not sure of call levels there; do any of the CALL RPG_x call the other,
or are all called successively from CLMAIN so that each is at the same
call level? If all at the same level, any could set but then fail to
delete a job-scoped override, thus accidentally affecting others just as
with increasing call levels.

Program RPG_C is SQLRPGLE. It opens a cursor over A_FILE for UPDATE
OF a couple of fields.

FWiW I typically would use bogus file names to override from, so if an
override is for some reason not properly enabled [e.g. the program was
invoked from a command line versus from the "controlling" CLP], then any
request that depends on the override would typically fail for a "file
not found" condition. So for example in RPG_C the cursor might be
defined over BOGUS FOR UPDATE OF instead of over A_FILE FOR UPDATE OF.

It iterates over the cursor conditionally doing “Update A_FILE set
field = value WHERE CURRENT OF …”. This should update A_FILEX
because of the OVRDBF.

Note that the override processing should be limited only to the OPEN
processing in that scenario, not the UPDATE. The override should be moot
for the noted UPDATE WHERE CURRENT OF Cursor_Name because that UPDATE
request applies to an already existing ODP for which no override
applies; i.e. the cursor is named and no new open should occur per use
of a SET to a value vs SET to an expression. Thus if the wrong file is
being updated in this case, almost certainly the wrong file was opened
for update by the OPEN cursor request, not because the UPDATE WHERE
CURRENT OF looked for and located the wrong file due to an override
being overlooked for a new ODP.

<<Occasionally>> an execution of program RPG_C updates A_FILE instead
of updating A_FILEX. I have determined this by analyzing the journal
records. So far I have failed to recreate the situation in a test
environment--A_FILEX gets updated as expected.

Good reason to try the BOGUS file name approach, and issue both a DSPJOB
and a DSPOVR request for the -204 on the OPEN; probably for the UPDATE
requests too, just in case they are implemented with a new open for some
reason, for example if an operand makes an expression versus just the
"set field = value" as implied... or the SQL just makes a poor decision
to open again when that is not supposed to be required [and then
speculation about the UPDATE perhaps "missing the OVRDBF" seems more
likely]. Calling a program that makes the job wait on a specific reply
to an inquiry message enables activating service\debug against the job,
beyond just the DSPJOB and DSPOVR, might be a nice approach.

I’m clutching at straws here, wondering if the SQL runtime is
occasionally missing the OVRDBF.

The information about what file was opened should be available just
after the OPEN, prior to any of UPDATE requests; in the SQLERRMC data I
believe.? Assuming the issue is not with the UPDATE, then testing if the
file name that was actually opened is not the value A_FILEX as expected,
then the program could react as described above for the sql0204, by
logging the active overrides, sending an inquiry, and await the specific
reply value. The same data in SQLERRMC for the OPEN may also be given
for the UPDATE.?

I don’t see how the OVRDBF could be removed, except by the DLTOVR at
the end of the program. The programs in CLMAIN run in the default
activation group so the OVRDBF is scoped to the call level. Some of
the programs use SQL, often with a mix of native IO. Some call CL
programs that may do overrides, but those overrides would be at a
different call level than CLMAIN and I can’t see how they could
effect it.

Any code requesting DLTOVR A_FILE LVL(*) at the same call level that the
OVRDBF was issued can remove that override, just as a new override at
the same level would replace the existing override. From the description
of this scenario, the CLMAIN seems to be the only place that can occur
except changing scope to *JOB, however...

Review for [mis]use of SECURE(*YES) which might exist in some code path
which occurs only when the failure is experienced; these can be at a
higher call level than CLMAIN. Similarly look for any use of
OVRSCOPE(*JOB) at any call level. Unless there is a DLTOVR LVL(*JOB),
then the job-scoped override would remain in the job after the incorrect
updating transpired, however from the description of the scenario, the
error seems to go undetected until later when possibly the job is
already done.?

Auditing command strings to record the use of the OVRDBF and DLTOVR
might be useful for a review after an occurrence. Starting a trace and
issuing DSPJOB before calling CLMAIN, then ending\spooling the trace
upon return, requested for a user who has experienced the occurrence
multiple times already might also be an approach for reviewing the
failure afterward; presuming their activity might again show the error,
or less desirable but more inclusive to unconditionally start, end, and
then spool trace for the call to CLMAIN.

CLMAIN is invoked from an RPG III menu program, probably via
QCMDEXC. CLMAIN can be run again and again from the menu, or may be
executed interspersed with execution of other menu programs.

This suggests the menu program could establish a conflicting override,
perhaps in preparation for the invocation of some program other than
CLMAIN? Having failed to specify the TOFILE(), a later override which
tries to just update other parameters will be superseded by the original
override which could have been TOFILE(*FILE).

We are on V6R1, but not up to date on PTFs.

Hopefully with plans to resolve. Since there were enhancements to the
UPDATE WHERE CURRENT OF support in the DB2 for IBM i 6.1 SQL there would
seem to be IMO, an increased possibility of a defect as compared to if
there had been no enhancements specific to that support. Hopefully
"catching" an incident of the incorrect behavior would either point to,
or assist to eliminate, a usage problem as origin.

Regards, Chuck