On 30-Sep-2015 10:10 -0600, John McKee wrote:
On Wed, Sep 30, 2015 at 10:51 AM, Steinmetz, Paul wrote:
So are we stating that a non BRMS save still writes to the BRMS
flight recorders? This doesn't appear correct.
I looked at the flight recorder after the backup completed. There
are indeed records written to it.
Seems odd to me as well.
AIUI, because MSE and BRMS are both installed, the Tape Exit gets
invoked per the former, and the former is obliged to invoke BRMS because
the latter was registered as part of the installation. That holds even
if the operation against the tape is non-BRMS.
So essentially the BRMS is placed\registered-as in control of tape
inventory [because those two features have been installed] and therefore
BRMS is\must-be informed of the use of the tape device(s) [or MLB
devices] wherever\whenever the Tape Exit runs.
Chuck had an initial suggestion to remove
BRMS and then reinstall. Likely would have worked.
The re-install after the save activity would serve only to reset the
system to the original configuration; i.e. prior to the next save, the
delete\remove action would again be necessary as a preventive. I
believe I had noted that, given BRMS was not being utilized, seemingly
the LPP could be removed entirely; though saved prior to the DLTLICPGM,
to allow a restore if\when deemed required. Restoring from that backup
would avoid re-applying maintenance [because the PTFed product was
saved\restored instead of the un-PTFed version of the LPP].
Changing the attribute on the flight recorder files to not save is
simpler and maybe safer.
Just as the aforementioned disabling technique, as a [presumed]
preventive, would need to be repeated for each full-system-save *if* the
product were re-installed afterward, the same [maybe] holds true for the
technique with the FR files; i.e. the FR files might need to be
modified prior to each full-system-save. That is, before each GO SAVE
Opt-21, repeat the Change Attribute (CHGATR) request to set the *ALWSAV
attribute to *NO; or at least verify the AlwSav attribute\setting for
each is still set to *NO. That ensures, at least prior to the
system-save, that these files still exist with that attribute, despite
possibly having been deleted\re-created by the BRMS in some previous
activity. If the BRMS might delete and re-create the file, but not
maintain that AlwSav=No attribute\setting, then the problem could recur,
despite the attempt at preventive; I do not know how or when the BRMS
feature /switches/ between its _backup_ and primary FR files, nor what
effect that might have on the attributes of those files.
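
A minimal sketch of that pre-save step, assuming the FR files are the
/tmp/brms/flightrec and /tmp/brms/flightrec.bku stream files named in
this thread. Whether those two requests alone suffice, and whether BRMS
might re-create either file between this step and the save, are
assumptions that remain unverified:

```cl
/* Sketch only: issue before each GO SAVE Opt-21.                  */
/* The paths are assumed from this thread; verify them locally.    */
CHGATR OBJ('/tmp/brms/flightrec') ATR(*ALWSAV) VALUE(*NO)
CHGATR OBJ('/tmp/brms/flightrec.bku') ATR(*ALWSAV) VALUE(*NO)
```

The setting should then be visible with WRKLNK '/tmp/brms/*' and
option 8 (Display attributes) against each file.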
Begs the question if allow save is default on 5.4 or newer and was
changed, or if default was later changed. Or, if some PTF at any
level changed some behavior to prevent the lock condition from
causing the backup to fail.
I wonder(ed) the same. I thought I would have found a PTF that
mentioned something if IBM had seen this problem [reported to them], or
someone noting [in a blog or discussion somewhere] the CHGATR could be
used as a presumed-preventive action; but I found nothing in searches of
the web. I could ask a [former] developer of the product, but I have
chosen not to bother any of them, especially for an inquiry about an
unsupported release. I figure if any of them ever read\follow this
list\NG, then they could respond themselves.
I wonder if instead of an explicit fix, the potential effect was at
least partially mitigated by /tmp being marked as AlwSav=No in more
recent releases? Or perhaps, for some time now, /tmp/brms has been
created as AlwSav=No? I have no systems to check how /tmp is created
upon installation [or at run-time, after deletion]; nor /tmp/brms or the
individual files. I do know that the public v5r3 system that I use has
both BRMS and MSE installed, and has AlwSav=Yes for all of /tmp,
/tmp/brms/, /tmp/brms/flightrec, and /tmp/brms/flightrec.bku, so I
suspect that system has the same potential for the issue. But again,
given the nuances of the layout of the data saved prior to the SAV, and
similarly for the SAV itself, identical [code levels of the] software
can just as easily encounter the difficulty as not; all depends [so I
infer] on the FR being included in the descriptor already being dumped
and the Tape Exit then calling BRMS to write to the FR concurrent with
the preparation of the next to-be-dumped descriptor. Seemingly that
could explain why the problem could be[come] persistent\pervasive on a
system though never seen prior, and why other systems may never see the
problem even while having the potential to encounter the issue.
Bad thing about this fix is that, if needed, flight recorder just
isn't going to be available to restore.
While the FR feature is for logging current activity and for the
ability [for the dev mostly] to refer to past [BRMS] activity, the code
performing future activity [such as more saves or restores] cares not
about [and in no way depends on] what was\is in the log files. Those files
could be deleted, and the code should not care, simply effecting the
creation on open\write; note: for this specific scenario however,
deleting the files is *not* desirable, as they need to exist so the
attribute can be set to AlwSav=No. That these files are not important
to the feature for effecting any work [other than logging] should be
somewhat conspicuous, given the files are stored in the /tmp directory,
and that can be [and often is] cleared upon each IPL [on many systems];
and especially conspicuous in that they definitely will not exist in any
Disaster Recovery (DR) scenario [at least not until the /tmp directory
gets restored or, probably in most cases where BRMS is in use, until the
BRMS feature first runs and creates them].