Facebook, along with Instagram and WhatsApp, is back online after being offline for several hours late yesterday.
The company blamed the problem on a “faulty configuration change” within its network infrastructure, which had a “cascading effect” that brought the company’s platforms down.
Here is a closer look at the incident.
Just before 5pm, people started noticing that they couldn’t access Facebook or the other services it owns and operates, such as Instagram and WhatsApp.
It would be more than five hours before service began to return.
Outages on major platforms are not uncommon, but one of this length is unusual, and it became clear that Facebook was struggling to fix the problem.
Meanwhile, other platforms such as Twitter and messaging app Signal saw huge spikes in traffic as people turned to them to get back online, with some Twitter users even reporting issues at one point as the platform strained under the weight of the sudden influx of additional users.
By late evening, access to Facebook and Instagram had returned for most users, while WhatsApp said it was back up and running “at 100%” as of 3.30am this morning.
What caused the problem?
In a statement, Facebook said the problem was caused by a configuration change on the backbone routers that coordinate traffic between the company’s data centers. This had a cascading effect that brought the company’s various services to a halt.
The company has not yet provided any further detail on what exactly caused the problem or how it was fixed.
However, web infrastructure and security firm Cloudflare gave a detailed breakdown of the incident as it saw it unfold, and said it revolved around two key mechanisms that make the Internet work: the Domain Name System (DNS) and the Border Gateway Protocol (BGP).
In essence, DNS is the address book of the Internet and BGP is its roadmap: together they help people navigate the vast network of connected networks that makes up the Internet, first finding the website they want and then the fastest route to it.
Cloudflare said that, through a series of updates on Monday that appear to have been accidental, Facebook told the BGP system that the routes to everything Facebook runs no longer existed, meaning that people could no longer find a way to reach the social network.
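The address-book and roadmap analogy can be illustrated with a small Python sketch. This is a deliberately simplified toy, not how real DNS or BGP software works and not a reconstruction of Facebook's network: the hostname `social.example`, the next-hop label and the addresses (taken from the range reserved for documentation) are all made up for illustration. It shows only the idea that once a route is withdrawn, an address can still be looked up but no longer reached.

```python
import ipaddress

# Toy "address book" (DNS): name -> IP address.
# Hypothetical name and a documentation-range address, not real Facebook data.
DNS_RECORDS = {"social.example": "192.0.2.10"}

# Toy "roadmap" (BGP): advertised prefix -> next-hop label (hypothetical).
routes = {ipaddress.ip_network("192.0.2.0/24"): "via-provider-A"}


def find_route(address, table):
    """Return the next hop of the most specific prefix covering address, or None."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in table if ip in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]


def reach(name, dns, table):
    """Look the name up in the address book, then look for a route to it."""
    address = dns.get(name)
    if address is None:
        return "name not found"
    hop = find_route(address, table)
    if hop is None:
        return "no route to " + address
    return "reachable " + hop


before = reach("social.example", DNS_RECORDS, routes)

# Simulate the withdrawal: the prefix is no longer advertised.
routes.pop(ipaddress.ip_network("192.0.2.0/24"))
after = reach("social.example", DNS_RECORDS, routes)

print(before)
print(after)
```

In this toy model the first lookup succeeds and the second fails, even though the address-book entry is unchanged: the site still exists, but the map no longer shows any road leading to it.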
Experts said this was likely caused by a software bug in the updates or by human error, although some have noted that Facebook’s statement does not rule out other causes; however, there is currently no evidence to suggest that this is the case.
Why did it take so long to fix it?
The problem appears to have not only taken down the social media platforms but also disrupted everything Facebook runs, including its internal systems, with reports of employees being locked out of offices because internet-connected keycard entry systems had stopped working, and of staff being unable to access their internal communications platform.
As a result, it was difficult at first for staff to diagnose the problem and coordinate a fix.
There have been reports in the US of Facebook having to send a team to one of its data centers to manually reset servers to fix the problem.
One expert noted that ongoing social distancing measures due to the pandemic, along with remote working, may also have played a role.
Software testing expert Adam Leon Smith of BCS, The Chartered Institute for IT, said: “It is unlikely that the problems were caused by people working from home, but it is very likely that it took a long time to restore service because of reduced staffing within the data center.
“This would have exacerbated the problem because the nature of the failure meant that remote access to the data center was also not available.”
Can anything be done to prevent this from happening again?
This latest incident, following major outages linked to Cloudflare in 2020 and Fastly earlier this year, will once again highlight the potential problems of having large parts of the internet reliant on only a few large companies, where one small problem can bring down huge segments of online services.
There are currently no clear solutions to this, but this latest outage is likely to reignite the debate over Internet infrastructure.
For many individuals and businesses, the incident also showed how much they rely on Facebook and its services, not only to communicate but also to log into other platforms.
In response, people have been encouraged to consider using credentials other than their Facebook login details to access other online services.