On Wed, 30 Aug 2000, Shaw, David wrote:

> >Yes, my other machines (AIX and linux) do run well with very nearly full
> >disks.
>
> Don't those OS's have their virtual memory swap spaces pre-allocated on
> disk?  That makes a machine much more tolerant of having nearly full
> disks, since some of the "fullness" is actually empty space waiting to be
> used.  Undersize your swap space, though, and you have the same kinds of
> problems when it fills up.  The /400's virtual memory model is a lot more
> flexible, but does need space to work in, and doesn't provide a way to
> "hide" it in a pre-allocated space.

So does the need to IPL come from an out-of-memory situation?  Even an OS
that uses pre-allocated swap space can use up all available memory, yet a
graceful (non-rebooting) solution should exist.  Not that a graceful
solution necessarily exists on the above platforms (my linux box has died
once in an OOM situation; I'm not sure about AIX or newer linux kernels).

> >the reasons people shut them down.  It surprises me that an IPL was a
> >requirement to simply get some disk space back.
>
> It actually isn't a requirement.  I think the reason that the machine
> does it on its own when it maxes out is because the designers found it to
> be the simplest way to:
>
> 1) stop the (perhaps unknown) process(es) filling the space, and
> 2) get back space from temporary objects, QTEMP libraries, and virtual
>    memory so that the machine can run normally again.
>
> I caught a system once at 99.98%, and managed to stop the job filling it
> up and recover the disk space without an IPL (the job was doing saves to
> save files in QTEMP - without compression).  It was at the point where no
> one could sign on - attempts to do so would hang.  Fortunately I knew
> which job it had to be and was able to kill it at the console, which we
> left signed on to QSYSOPR at all times.

This is a perfect example of where I think a solution is needed.  Why
should one runaway job be able to bring down the system?
Why doesn't the OS issue a "no space left on device" error to the offending
job?  And why does the OS give up on swap space that it had already
allocated and still needs?  If the system was able to procure x amount of
swap, it shouldn't give up that swap until it is no longer needed.

As for applications writing to disk, shouldn't there be logic in the
program that does something like:

  [write something]
  [check for error (like disk full)]
  [write something else]
  [check for error]
  [if no error]
    [commit]
  [endif]

For the example you gave above, shouldn't the OS issue an error to the save
job, at which point the save job would die (hopefully cleaning up after
itself), leaving the system usable?

James Rich
james@dansfoods.com

+---
| This is the Midrange System Mailing List!
| To submit a new message, send your mail to MIDRANGE-L@midrange.com.
| To subscribe to this list send email to MIDRANGE-L-SUB@midrange.com.
| To unsubscribe from this list send email to MIDRANGE-L-UNSUB@midrange.com.
| Questions should be directed to the list owner/operator: david@midrange.com
+---