Re: RCLSTG on IPL? -- MIDRANGE-L

On 22-Nov-2013 17:14 -0800, Steinmetz, Paul wrote:

I had an issue with some damaged objects in 3rd party software, ran
RCLSTG, still had some issues.

Likely because Reclaim Storage cannot correct damage. "Some issues" is quite nebulous, though; that explanation applies only if what remained was 'some issues with /damaged objects/.' The reason is that the only correction for the "physical damage" types, hard\full (object damage) and soft\partial (data damage) [to be clear, not /logical/ damage], is to delete the damaged object, although data-recovery actions are possible for soft damage.

Opened a PMR with IBM, they preferred customers no longer use
RCLSTG.

Except when recommended [by IBM] as part of a recovery action for an issue with an understood origin. That is, recommended after an issue for which the reclaim is _known_ to effect, or at least /should effect/, the required recovery or some otherwise desirable result. If the documented effects of the reclaim are then not the outcome, the reclaim storage feature has an apparent or obvious defect\deficiency.

Even while aiming to reduce customer usage of that feature [hardly new BTW], I used to suggest customers still should use reclaim *if* a message had explicitly directed them to run that request. But that was only with regard to the *DBXREF, and always with two warnings: that the QDBSRVXR* jobs first needed to be validated as functional, and that repeated incidents of that same messaging were a blatantly obvious indication of either some defect requiring a preventive fix or an abnormal issue requiring preventive recovery action. For example, after a CPF32A1, there was little choice but to reclaim the DBXREF data. Its likely origin [power loss with hard failure, for lack of a UPS] was something for which a full reclaim might find and correct some other issues, but those other issues might easily pend recovery indefinitely, whereas the errors with the cross-reference were likely to cause other failures very soon, if they had not already.

Do you have any info on this?

Primarily, because the Reclaim Storage is a long-running operation that requires dedicated\restricted-state operation and is intended for its specific recovery effects rather than for maintenance; i.e. it is not to be issued in the vague hope of a correction, simply for lack of knowing what else to do with\for a particular problem that was encountered. Secondarily, much of what can be accomplished with a full reclaim can be accomplished by other means. Because it is a recovery, the recovery effects can be more directly reactive or planned\scheduled, rather than effected with the heavy-handed and thorough but very lengthy processing of the full reclaim.

One issue IBM faced was that many customers were issuing the command as part of their /normal maintenance/ instead of using the feature for its intended purpose of recovery from certain types of abnormal failures. As such customers grew or changed their business computing in various ways, the RCLSTG would often no longer be possible within an SLA. Another issue was that on already-large systems where a legitimately encountered abnormal failure would greatly benefit from a reclaim, the outage required to effect that recovery was already too large an impact. And finally there was sometimes an impression that service\support [not just IBM] might have used a recommendation of reclaim as a convenient way to push back on an issue; e.g. the reclaim was suggested as /possibly/ fixing the issue and the reclaim was done, but the problems typically persisted, because of course the reclaim feature had nothing to do with the issue and no capability to correct it, so the customer was no closer to recovery and much further on in time. As a false panacea, the /reclaim storage/ in some ways paralleled the /reboot/ and /defragment/ activities on the PC, but with conspicuously harsher impact.

That the request would be recommended as, or presumed to be, preventive or corrective of that which it was not, and that customers would use the request so often, all while likely causing great impact with so often little gain relative to the cost, made the system look problematic, archaic, or whatever other negative term might describe taking offline their one scale-up system versus one of many scale-out systems. The only hope was to discourage its use, so...

Over time, great effort was made to reduce any need ever to use the command that imposes the long outage; i.e. specifically, the RCLSTG SELECT(*ALL) OMIT(*NONE). Some sub-processing, namely the /directory/ and the /database cross-reference/ portions, was made available as separate paths to run that function alone, and was made optional so that omitting it would reduce the overall time. Additionally there were Reclaim Objects by Owner (RCLOBJOWN), Reclaim DB Cross-Reference (RCLDBXREF), and Reclaim Object Links (RCLLNK). There may have been improvements to the Reclaim Library (RCLLIB) to do something more than the effectively-nothing it had done [best I could infer], although I doubt that... as it only just recently was corrected IIRC to handle a damaged *OIRSPC, although I recall no indication of what the new effect was.
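
As a rough sketch of those narrower paths next to the full reclaim, the invocations might look like the following. Only the first line is taken from the text above. The other parameter values are written as I recall them and should be prompted (F4) and verified on the release in use, and '/MyDir' is just a placeholder path. Note that any form of RCLSTG still wants restricted state, whereas RCLDBXREF and RCLLNK are, to my understanding, the alternatives that do not.

```cl
/* Full reclaim: restricted state, longest outage.                   */
RCLSTG SELECT(*ALL) OMIT(*NONE)

/* Only the database cross-reference portion of the reclaim.         */
RCLSTG SELECT(*DBXREF)

/* Everything except the cross-reference portion, to shorten the run. */
RCLSTG SELECT(*ALL) OMIT(*DBXREF)

/* Check the cross-reference data, then fix only what is in error.   */
RCLDBXREF OPTION(*CHECK) LIB(*ALL)
RCLDBXREF OPTION(*FIX) LIB(*ERR)

/* Reclaim object links under one directory tree (placeholder path). */
RCLLNK OBJ('/MyDir') SUBTREE(*ALL)
```

The idea is to reach for the smallest request that targets the known problem, and leave the SELECT(*ALL) form for a recovery where its full effects are actually what is wanted.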