On December 29, 2018, the Louisiana-based company CenturyLink experienced a nationwide outage that affected areas from New York to California. Minnesota customers complained that their landlines were down. In Idaho, about 30 percent of state offices reportedly had issues, libraries lost internet access, some ATMs went down, and the Idaho Lottery tweeted that it could not sell tickets or pay claims on them. The problem extended to medical centers in Colorado that could not bring up patient records and to technical difficulties at the county water authority's facilities in Albuquerque, New Mexico.
But the biggest concern was that customers were unable to reach 911 emergency services across the 35 states that CenturyLink serves.
Outage hurts 911 calls
Other providers, including Verizon, Comcast, and Bluegrass Cellular, were also affected by the outage because CenturyLink helps handle their data traffic. For some, the outage lasted only half a day, but for most it lasted almost two days as the company's engineers worked to find and fix the problems. At the time, Federal Communications Commission (FCC) Chairman Ajit Pai issued a statement saying that consumers' inability to reach 911 emergency services was completely unacceptable.
He said an investigation would be carried out into the causes of CenturyLink's technical difficulties and their impact on consumers.
Widespread outage investigated
The FCC released its report yesterday detailing the cause and impact of the CenturyLink outage in December 2018. According to Ars Technica, it showed that the outage lasted 37 hours and was caused by a switching module failure that stemmed from a network configuration error.
It affected 22 million customers in 39 states and resulted in 886 calls to 911 going unanswered.
The first call came from a customer in the New Orleans area around 3:56 a.m. While CenturyLink was handling the complaint, alarms were received from Infinera Intelligent Transport Networks (Infinera) indicating an issue with their control modules.
This is when CenturyLink realized it had a widespread outage on its hands and began investigating. Unable to connect to the nodes remotely, engineers concluded that the nodes were overloaded. The investigation revealed that a switching module in the company's Denver, Colorado node had, without prompting, generated four malformed management packets. Why this happened remains unknown despite a deep investigation by CenturyLink and Infinera.
Working to set things right
CenturyLink restored service by replacing the faulty switching module, which it then sent to Infinera for further evaluation. The FCC's report notes that CenturyLink took steps to improve its monitoring and audits of equipment, as well as its consumer notification system. The company is also working to improve its nodes' Ethernet controls so that they identify and terminate invalid packets more quickly. The work is expected to be finished by fall of this year.