|
Hello Jim,
after some time to think, I feel the urge to reply (again).
On 30.03.2023 at 22:18, Jim Oberholtzer <midrangel@xxxxxxxxxxxxxxxxx> wrote:
> The issue of disk fragmentation was in the way back S/36 days (S/34 and
> earlier). Single level storage with the advent of S/38 almost eliminated
> that problem
Almost, but not entirely, apparently: see Mark W.'s comment about reloading and Rob B.'s comment that "it helped".
Not sure what SLS has to do with fragmentation on disk. SLS, as I understand it, means that data primarily lives on disk and is paged into and out of RAM as needed. "Needed" means: the CPU must be able to access data (and code) fast enough. RAM = cache, mainly. Yes?
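For illustration only, here is a tiny sketch of that "RAM = cache" reading: a least-recently-used page cache sitting in front of a disk-resident address space. The page size and cache size are arbitrary, and this is of course not the SLIC's actual paging logic:

# Toy demand-paging sketch: RAM as an LRU cache over a large,
# disk-resident address space. Purely illustrative.
from collections import OrderedDict

PAGE = 4096        # assumed page size
RAM_PAGES = 4      # deliberately tiny so evictions become visible

class Ram(OrderedDict):
    def touch(self, page_no):
        if page_no in self:            # hit: data already in RAM
            self.move_to_end(page_no)
            return "hit"
        if len(self) >= RAM_PAGES:     # RAM full: evict least recently used
            self.popitem(last=False)   # (conceptually, back to disk)
        self[page_no] = True           # fault: page it in from disk
        return "fault"

ram = Ram()
for addr in (0, 5000, 9000, 0, 20000, 40000, 5000):
    print("addr %6d -> page %d: %s" % (addr, addr // PAGE, ram.touch(addr // PAGE)))

The last access faults again although page 1 was in RAM earlier: the cache was too small to keep it. That is the sense in which RAM acts as a cache over disk.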
> and now that we have fast access disk units it’s simply not a concern at all.
I doubt that claim. Faster (and more) disks also get burdened with more data, which more or less counters the speed increases. This is no different for IBM i on Power than for commodity systems.
> The storage management code in LIC deals with it, and that’s way smarter about storage management than we will ever be.
That code has been written by humans. Thus those humans must have been even smarter than the code can ever be, to predict how it copes with the many corner cases that inevitably happen.
This is my understanding of how things work:
Fragmentation of data on disk is a kind of natural entropy that happens to every disk-based computer platform as more data is added over time. The basic mechanism is the artificial chopping of contiguous data (such as a database table, as a logical entity) into blocks of storage on disk. Storage is handled in blocks because handling blocks is more efficient than handling the minimal logical allocation unit (one byte, or one record of a table).
Thus, a "block" of space is allocated from disk private to an object. At first, it's not filled completely, thus introducing overhead. When the object's block on disk is full, another block must be allocated. Naturally, other blocks have been allocated physically adjacent to the original block, so there is some distance involved until the next block can be allocated from free space. The result is that the logical entity is "scattered" on the physical space provided by the disks.
There are ways to counteract this inevitable fragmentation (a toy comparison follows the list):
- wasting space by preallocating the complete file to an assumed maximum size,
- deliberately leaving holes of free space between newly allocated objects, allowing older objects some growth without introducing fragmentation,
- using many disks ("RAID 0") to spread the load evenly,
- completely rewriting an object on disk when updates are required: copy the original data into a new, contiguous allocation of space and discard the old allocation. I don't know of any OS which does that, because it would introduce
-- higher response times, if this is done synchronously while the user waits,
-- a higher disk workload, increasing storage latency and thus wait times for all users,
-- a higher risk of data loss when such consolidation activity runs in the background while the system is idle and a crash happens through hardware or power failure.
All of these methods might be combined.
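To make the first mitigation concrete, here is a toy comparison of grow-on-demand allocation versus preallocating an assumed maximum size. The block counts and the first-fit layout are invented for illustration; no real storage manager is this simple:

# Grow-on-demand vs. preallocation, measured by fragment count.
def fragments(disk, owner):
    # Count runs of contiguous blocks belonging to `owner`.
    runs, prev = 0, None
    for b in disk:
        if b == owner and prev != owner:
            runs += 1
        prev = b
    return runs

# Grow-on-demand: A and B take turns asking for one more block.
on_demand = []
for _ in range(4):
    on_demand += ["A", "B"]

# Preallocation: A claims its assumed maximum (4 blocks) at creation
# time, wasting space until it fills up, but staying contiguous.
prealloc = ["A"] * 4 + ["B"] * 4

print("on demand:", on_demand, "-> A in", fragments(on_demand, "A"), "fragments")
print("prealloc :", prealloc,  "-> A in", fragments(prealloc, "A"), "fragment")

The trade-off shows directly: four fragments versus one, paid for with space that sits empty until the preallocated file actually grows into it.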
Now, storage management is opaque. IBM released only enough information to allow for a basic understanding of what's going on, some optimization techniques, and a lot of guesswork and assumptions. Fortress Rochester. :-) Secrets foster word of mouth.
But on the other hand, even IBM can only "cook with water", as the German saying goes: they have no magic ingredients either.
> The point behind reuse deleted records is when storage management is
> writing new stuff to disk, it needs a continuous 4k block to write to. So
> if it can’t find one, it moves things around to create it.
Do you have proof that this is actually happening? See above.
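For what it's worth, the documented part of "reuse deleted records" (the REUSEDLT attribute of a physical file) only implies slot reuse; the moving-things-around claim is exactly the part I'd like to see evidence for. A rough sketch of the slot-reuse part, with invented names and a first-fit policy of my own choosing:

# Conceptual sketch: an insert lands in a previously deleted record
# slot instead of extending the file. This shows only slot reuse, not
# any claimed shuffling of data to create contiguous 4K blocks.
DELETED = None

def insert(slots, record, reuse_deleted):
    if reuse_deleted:
        for i, slot in enumerate(slots):   # first-fit over deleted slots
            if slot is DELETED:
                slots[i] = record
                return i
    slots.append(record)                   # otherwise: extend the file
    return len(slots) - 1

table = ["rec1", DELETED, "rec3", DELETED]          # two records deleted
print(insert(table, "new-a", reuse_deleted=True))   # -> 1 (reuses a slot)
print(insert(list(table), "new-b", reuse_deleted=False))  # -> 4 (appends)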
:wq! PoC