MIDRANGE dot COM Mailing List Archive



Home » MIDRANGE-L » December 2012

Power outage unfun




All,

We had a power outage at work Fri 3:45am that lasted till 1pm. The outage
was discovered by the Plant Manager when he arrived at 4:30am. (The night
crew had finished loading trucks and left at 3:15am.) He got everything
opened, the drivers arrived, and hit the streets to deliver. All in the
dark.

The first thing that was really scary about this was the timing. We have a
UPS that will power all the servers for 20-30 minutes. Software on the
System i monitors the UPS and does an orderly shutdown in an outage. But
that software isn't running during the backup, which requires a dedicated
system. That backup starts at 3am and is finished at 4-4:05am. With 20+
minutes on the battery we should be OK, but . . . So we had no idea what
state the System i would be in.
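
To put numbers on that timing, here is a rough sketch in Python (it is
not anything that runs on the System i). The function names are made up
for illustration. The times come from the paragraphs above, and it
assumes the battery clock starts the moment power drops at 3:45am and
that the 20-30 minute runtime estimate holds.

```python
def to_minutes(hhmm):
    """Convert an 'H:MM' clock time to minutes after midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def battery_margin(outage_start, backup_end, runtime_min):
    """Minutes of battery left when the backup ends.

    A negative result means the battery ran out before the backup finished.
    """
    return to_minutes(outage_start) + runtime_min - to_minutes(backup_end)


# Outage began at 3:45am; backup ends between 4:00 and 4:05am;
# the UPS is good for somewhere between 20 and 30 minutes.
for backup_end in ("4:00", "4:05"):
    for runtime in (20, 30):
        margin = battery_margin("3:45", backup_end, runtime)
        print(f"backup ends {backup_end}, {runtime} min battery: {margin} min to spare")
```

The worst case (backup ends at 4:05am, 20 minutes of battery) leaves zero
minutes to spare, which is why the state of the System i was anyone's
guess.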

When I&M (Indiana Michigan Power) "repaired" the blown fuse on a pole down
the street, restoring power, they didn't look for the root cause of the
outage. Never came into our building to check things or talk to anyone.
So when power came back on, things were really strange. Brownout, lights
were flashing like strobe lights. In rhythm. Like people who have outside
Christmas lights timed to music. Never seen anything like it. At least 1
freezer compressor started to smoke. At $20,000 per freezer compressor,
that was scary. We didn't start any computers at all for fear of frying
them.

We even thought we were going to lose the UPS. It kept cycling on/off
battery with some not so nifty panel messages about power module failures.
I finally made a call to the UPS people. They had me look at the voltages
the UPS was receiving. We were getting 80-160 volts instead of 240 volts.
That explained a lot. The UPS tech arrived an hour or so later and said
he'd never seen a unit act like that. He was first afraid the main section
of the UPS was going because of the buzzing sound it made. Thought he
would have to have one couriered in overnight. After some thought, his
hunch was the low voltage was causing the issues at the UPS, so we had to
wait on the power company. We called in the electrician who does our work
and he confirmed the low voltage was an I&M issue.

I&M was no help over the phone on this. They couldn't give us any idea
when power would be restored. When another I&M line truck finally arrived
at 6pm (one guy in the truck), he found the real problem. The wires
between 2 poles across the street were too loose. The high winds that day
were causing them to touch, sending backfeeds into our building. That one
guy gave us more customer service than all the other I&M people put
together. He explained exactly what the problem was, what he was going to
do to fix it, and what the next step would be if that didn't resolve the
issue.

After he fixed that, all was well. He came back to our building to
confirm. (And he was none too happy with the crew that had been out
earlier.) We were getting 250 volts and the UPS went back to normal. We
brought up all the Windows servers and SAN and all was well there. Our
System i maintenance is with ServIT. That tech had already been there
twice during the day and was now at home. I called him on his cell and he
volunteered to come in right then, Friday night. I said let's wait until
Saturday morning when we'll both be fresh.

Saturday we brought the System i up in manual mode and everything looked
good. Brought it up normally and everything looked good. I checked the
BRMS backup log and the very last thing recorded was "Save of list *LINK
complete". I looked at the log from the previous morning's backup and the
very next thing in the log was the saving of media information. So, in
effect, we had a complete backup of the system. Hey hey. We had all the
sales people send in their Monday orders and staged them on the System i.

Sunday, management and several office people came in. We ran all of
Monday's business, did the A/R, and deposited Friday's collections from
the drivers, all in about 2 hours. We have some really good people. More people
volunteered to come in than we needed. All hourly people who came in are
getting paid for more hours than they actually worked, plus we have
another surprise for them in the next week or 2.

It's Monday and there are no computer issues that I have found. One
freezer compressor won't come on. The newer A/C unit in the computer room
won't work, so we're using the old A/C unit for now. Glad it's not
summertime. Electrician on the way.

A couple of years back, the CEO had me get a quote for an automatic backup
generator. Due to the expense, we didn't do it at that time. We may very
well do it now. With that generator running the office and the computers,
we could have worked a normal day Friday and saved a lot of grief. The CEO
says he is going to sell it to the board.

One question: Is there any way whatsoever that we can get a complete,
full, don't-miss-anything backup and still have that UPS software running?

All in all, last Friday was the most harrowing day I've spent here in 38+
years.







This mailing list archive is Copyright 1997-2014 by MIDRANGE dot COM and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited.