We have active and retired electricians here- how would you like to be



Curt Harms
12-31-2015, 10:17 AM
in this sparky's shoes? Not completely the electrician's fault, but we know who probably got blamed.




.................................................. ...........
The existing facility sported a trio of 220KvA UPSes, but at the new bit barn it was decided to run with just two.
“The decision was made to save costs by relocating one of them to the new building rather than buying all new equipment.”
JF says he “begged the business to call a complete shutdown to remove the UPS. They asked me what the odds of something going wrong, and I made the error of trying to provide an accurate estimate of the risk by saying there was about a one in 100 chance of problems.”

JF thought a one per cent risk of power failure across 25,000 square feet packed full of server racks, live, in production, would scare off the bean counters.
He was wrong.
“The business figured that was a perfectly acceptable risk,” he recalls. So JF decided to come in on the Saturday of the move, just in case.
“I was sitting at a desk in the middle of the data center floor that weekend when the electricians began the delicate work of removing a 220KvA UPS unit from the mains,” JF wrote. “They put the system in bypass mode without a problem. They then cut the output breakers for the units to be removed. No problems. Then they wanted to isolate the inputs for the UPS units.”
Bad idea, because “they cut the input breaker for the master electrical panel, not the outputs that went to the UPS systems. That master panel also supplied the circuits to the bypass feed, which meant we had no power to anything at all.”
“25,000 square feet suddenly went silent. I ran into the electrical room expecting to find bits of dead electrician all over the place, but they were just calmly disconnecting wires.”
“I yelled, 'We're down!' and they said, 'No, we're in bypass mode.' I repeated, 'Noooo. We are down.' They paused for 10 seconds and then their eyes got really wide.”
.................................................. ........



http://www.theregister.co.uk/2015/12/11/electrician_cuts_the_wrong_wire_and_brings_down_25000_square_feet_of_data_centre/

Scott T Smith
12-31-2015, 11:43 AM
IMO it is clearly, 100% the electrician's fault for shutting down the master breakers instead of the UPS input breakers.

In the event that there was no way to isolate the UPS w/o turning off the master breakers, it is still the electrician's fault for not informing management that there would be a 100% chance of failure if they relocated the UPS.

My 2 cents....

Dan Hintz
12-31-2015, 3:15 PM
IMO it is clearly, 100% the electrician's fault for shutting down the master breakers instead of the UPS input breakers.

In the event that there was no way to isolate the UPS w/o turning off the master breakers, it is still the electrician's fault for not informing management that there would be a 100% chance of failure if they relocated the UPS.

My 2 cents....

Agreed, 100%. When you work on a mission-critical room, you don't guess at what's going to happen, you KNOW beforehand.

Kev Williams
12-31-2015, 3:46 PM
... Just kidding! :)


http://www.engraver1.com/erase2/kidding.jpg

Myk Rian
12-31-2015, 5:37 PM
Similar instance at Ford Credit in Dearborn. They disconnected the UPS in the South room for maintenance, which also incorporated some welding.
Well, they forgot to put the smoke/fire alarm in bypass. The welding smoke shut the entire 75,000 sf server room down. It's a good thing I was in there, as nobody noticed it. When I told them all the servers died, they looked like a bunch of scared rats running around trying to get things back up and running.

Chris Hachet
01-01-2016, 10:55 AM
One hundred percent the electrician's fault.

glenn bradley
01-01-2016, 11:08 AM
If they deviated from the change script (MOP, CCD, whatever your term) approved by the data center manager, then it is the electrician's fault. If they followed the approved script, it is the data center manager's responsibility. The DC manager doesn't have to be an electrician, but he/she should vet any recommended process with at least two qualified sources. If you are not diligent, you pay the price. Welcome to today's version of data centers: generally sloppy, poorly managed and undeservedly immodest. But that's just one darn fool's opinion :).

Jason Roehl
01-02-2016, 8:46 AM
Hmm...that makes my recent (mis)adventure of shutting off a breaker that powered a UPS in a courtroom--to which the court recorder's computer was attached--seem pretty benign. The UPS kept the computer up just long enough for the hearing to begin... Thankfully the judge seems to have gotten over it, but the look on her face definitely went downhill the longer it took for that computer to boot back up.

Ryan Mooney
01-02-2016, 11:51 AM
The purpose of post-mortems is not to assign blame; the purpose is to ascertain which part of the process was lacking and how to fix it. Assigning blame leads to scapegoating, which is a poor remediation strategy (having said that, while one oops is education, perhaps well paid for, repeated oopses of the same variety and willfully bypassing MOPs are grounds for goodbye).

Having said that, I half agree with Glenn: if they followed the MOP (/CCD/whatever), the MOP was defective. However, imho, even when the MOP is defective, people are still expected to have good sense, recognize deficiencies in the MOP, and take on-site corrective action. Otherwise you end up with the same problem as truckers following GPSes past "one way, too narrow" signs. People blindly following MOPs without thinking about the actual situation frustrate me to no end (yes, I know the MOP said that, you're still an idiot).

Brian Elfert
01-03-2016, 5:08 PM
The big centralized UPS my employer used to have failed one day and automatically went into bypass mode. One of our in-house electricians looked at it and decided to switch power back to the UPS even though nothing had been fixed on the UPS. As soon as he switched back to UPS power, the entire data center went dark because the UPS was not supplying any power due to an internal failure.

The trend today is to use multiple smaller UPS units instead of one huge one like we used to have. Most data centers will feed everything with power from two UPS systems for further redundancy.