UPDATE: Fastly blames software bug for major global internet outage

Fastly Inc, the company behind a major global internet outage this week, said on Wednesday the incident was caused by a bug in its software that was triggered when one of its customers changed their settings.

Tuesday’s outage raised questions about the reliance of the internet on a few infrastructure companies. Fastly’s issue knocked out high-traffic sites including news providers such as The Guardian and New York Times, as well as British government sites, Reddit and Amazon.com.

“This outage was broad and severe, and we’re truly sorry for the impact to our customers and everyone who relies on them,” the company said in a blog post authored by Nick Rockwell, its senior engineering and infrastructure executive.

He said the problem should have been anticipated.

Fastly operates a group of servers strategically placed around the world to help customers move and store content close to their end-users quickly and safely.

The company’s post gave a timeline of events and promised to examine and explain why Fastly had failed to detect the software bug during its own testing process.

Fastly said the bug was in a software update shipped to customers on May 12 but lay dormant until one unidentified customer changed their settings, triggering the problem “which caused 85% of our network to return errors.”

Fastly noticed the outage within a minute of it occurring at 0947 GMT, and engineers worked out the cause at 1027 GMT. Once they disabled the settings that triggered the problem, most of the company’s network quickly recovered.

“Within 49 minutes, 95% of our network was operating as normal,” the company said.

Its networks were fully recovered at 1235 GMT and it began rolling out a permanent software fix at 1725 GMT, Fastly said.

 

————————————————————————————————————————————————————

PREVIOUS UPDATE: Websites back online after Internet glitch

WASHINGTON: Thousands of government, news and social media websites across the globe were coming back online yesterday after getting hit by a widespread hour-long outage linked to US-based cloud company Fastly.

High-traffic sites including Reddit, Amazon, CNN, Paypal, Spotify, Al Jazeera Media Network and the New York Times went down, according to outage tracking website Downdetector.com. They came back up after outages that ranged from a few minutes to around an hour.

“Our global network is coming back online,” Fastly said.

One of the world’s most widely used cloud-based content delivery network providers, the company earlier attributed the disruption to a “service configuration” but did not elaborate.

Fragmented

“Incidents like this underline the fragility of the Internet and its dependence on a patchwork of fragmented technology. Ironically, this also underlines its inherent strength and how quickly it can recover,” Ben Wood, chief analyst at CCS Insight, said.

“The fact that an outage like this can grab headlines around the world shows how rare it is.”

Fastly, which went public in 2019 and has a market capitalisation of under $6 billion, is far smaller than peers like Amazon’s AWS. The company’s content delivery network (CDN) helps websites move content using less-congested routes, enabling them to reach consumers faster.

“In the grand scheme of things, we actually think that this is a little bit of a positive for other CDNs and also just shows how difficult managing a CDN can be,” James Fish, analyst at Piper Sandler & Co, said.

Apart from Fastly, other main CDN providers include Akamai Technologies, Cloudflare and AWS.