Power cut affecting two racks at FFM2, Tuesday 5th October 2021 22:37:02


Two racks at our FFM2 location are currently without power.

Power has been fully restored since around 01:00 am (CEST). We are still checking all ~60 affected servers.

We are still working on restoring all services; most affected servers are back online. At 10:00:45 pm (CEST) our monitoring reported connection issues for two bladecenters located in different racks. These racks are neighbors and share one Juniper EX 48-Port switch, which provides connectivity for around 20 rack servers (iLO + network) as well as for the management modules of both bladecenters in these two racks. Our monitoring reported both bladecenters as unreachable.

Following the monitoring alert, our on-call staff started investigating at 10:07 pm (CEST). They misinterpreted the monitoring alerts, so the incident was first announced as an outage of two racks. In fact, only one rack had lost power, because at least one fuse had tripped. At 10:27 pm (CEST) a technician of the datacenter operator started investigating the issue, and the first servers started back up at 11:15 pm (CEST). At 11:33 pm (CEST), we were alerted to another disruption in the neighbor rack, which had previously been reported as offline in this incident. The neighbor rack hosts a colocation customer as well as BLC06. Both lost power from 11:33 pm (CEST) until 11:51 pm (CEST); the cause is still being investigated.

Our on-call staff as well as remote staff are currently checking all affected servers.

The issue was caused by a PDU failure on the A-Feed, which caused an overload situation on the B-Feed. The A-Feed is currently still unavailable, and the onsite datacenter technician is working on provisioning a new power feed. Our own on-call staff is on its way to the datacenter to support the onsite technician.