Colleagues, there are a lot of good points brought up all over.
We often tout IBM i because of the ability to break stuff down into
subsystems to separate load and risk. Risk? For example, I may have
Domino partition DOMINO01 running in subsystem DOMINO01 serving up email,
while DOMINO02 controls LDAP. If I do something that requires that I
bounce DOMINO02 then I do so, without disrupting numerous email users.
This is nice. But that all runs in one lpar. I even have utilities by
IBM to handle hanging shared memory and semaphores, if I need them (like
DLTDOMSMEM and DLTDOMSEM). However, there are times you still may need to
IPL or run SEVERELY crippled. Have you ever done something incredibly
stupid, like trying to end the subsystem because you couldn't stop/start a
printer writer? Only to find out that instead of just one writer being hung
you now have them all hung, and you cannot restart the subsystem until that
one writer ends, and it's not going to end, even with ENDJOBABN? Technically,
I don't have to IPL. I just cannot print. Which makes you about as
popular as a turd in the punchbowl. So segregation by whole lpar has its
place. Then you get down to critical hardware upgrades. Like, I want to
upgrade firmware so I can use card such and such which requires that level
of firmware. (Bad example, many firmware upgrades are not disruptive.) Or
I need to upgrade the OS on my SAN. So there might be advantages to not
only separate lpars, but separate boxes.
The problem with too much separation is it really increases the
complexity. It also increases your number of places that something can go
wrong. And any weak link can break the whole chain. (Did you test
bringing APPLIANCE1 down while SERVER1 stayed up?) Another problem is
thinking we have so much redundancy that we don't need to give each link
of the chain due diligence. And that's what fails, or gets hacked, or ...
Duplication can really increase cost. 27 boxes, all running Mimix to
other boxes can be a much higher cost than one bigger box running Mimix to
one other box.
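
To put a rough number on the weak-link point: if every link has to be up
for the service to be up, and the links fail independently, the chain's
availability is the product of the individual availabilities. Here is a
minimal sketch. The 99.9% figure and the link counts are made up purely
for illustration, and the independence assumption is mine, not something
measured on any real box.

```python
from math import prod


def chain_availability(link_availabilities):
    """Availability of a chain in which every link must be up.

    Assumes the links fail independently of one another.
    """
    return prod(link_availabilities)


# Hypothetical figures for illustration only: each link is up 99.9% of the time.
one_link = chain_availability([0.999])
five_links = chain_availability([0.999] * 5)

print(f"1 link : {one_link:.4%}")
print(f"5 links: {five_links:.4%}")
```

Every link you add to the chain multiplies in another factor below 1, so
the total can only go down unless each link gets its own due diligence.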
We had multiple divisions, in different time zones, running their own
AS/400s (it's been a while). We kept them up on PTFs and OS and it was a bear.
Users loved it because they could run period end when they wanted to and
didn't have to get permission from other divisions. However, one
division's controller was a tight-fisted <expletive deleted>. He let the
machine fall behind on hardware and performance got to be a dog. When he ran
remotely on our machine it was actually faster. Another example, one
controller didn't want the downtime of OS upgrades. He finally consented
when we said, "We have your source, and we can no longer compile down to
your level of the OS. Upgrade or die." All of these controllers are gone.
Everyone is on 7.1 and got there within 6 months of 7.1 GA.