My TechDecisions

IT Infrastructure, Network Security, Physical Security

What Happened With Facebook’s Outage?

A faulty configuration change led to outages across Facebook, Instagram and WhatsApp for at least six hours on Monday, Oct. 4.

October 6, 2021 Alyssa Borelli


A faulty configuration change led Facebook, Instagram and WhatsApp to all go down for at least six hours on Monday, October 4. Those trying to reach the social media platforms were met with browsers and apps displaying DNS errors on connection attempts.

Facebook's routing prefixes suddenly disappeared from the internet's Border Gateway Protocol (BGP) routing tables. BGP is the routing protocol that lets networks tell each other which IP address ranges they can reach, which is what makes it possible for devices around the world to communicate with each other.

Because Facebook's DNS servers are hosted on the company's own routing prefixes, withdrawing those BGP prefixes meant no one could resolve Facebook's domain names or connect to the IP addresses and services running on top of them.
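The dependency between route announcements and DNS can be illustrated with a toy model. The sketch below is not how BGP or Facebook's network is implemented; the `RoutingTable` class, the `resolve` helper, the `service.example` name, and the `192.0.2.0/24` documentation prefix are all hypothetical. It only shows that once a prefix is withdrawn, a DNS server living inside that prefix can no longer be reached, so name lookups fail even though the records still exist.

```python
import ipaddress


class RoutingTable:
    """Toy routing table: maps announced prefixes to a next-hop label."""

    def __init__(self):
        self.routes = {}

    def announce(self, prefix, next_hop):
        # Record that `prefix` is reachable via `next_hop`.
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        # Remove the prefix; ignore it if it was never announced.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        # Longest-prefix match: pick the most specific covering route.
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


def resolve(table, dns_server_ip, records, name):
    """Hypothetical resolver: needs a route to the DNS server first."""
    if table.lookup(dns_server_ip) is None:
        raise LookupError("no route to DNS server " + dns_server_ip)
    return records[name]


if __name__ == "__main__":
    table = RoutingTable()
    table.announce("192.0.2.0/24", "example-edge")
    records = {"service.example": "192.0.2.80"}

    print(resolve(table, "192.0.2.53", records, "service.example"))

    # Withdrawing the prefix makes the DNS server itself unreachable.
    table.withdraw("192.0.2.0/24")
    try:
        resolve(table, "192.0.2.53", records, "service.example")
    except LookupError as err:
        print("lookup failed:", err)
```

In this model the DNS records never change; only the route to the server holding them disappears, which matches the DNS errors users saw.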

During a routine maintenance job, a command was issued to assess the availability of global backbone capacity. It unintentionally took down all the connections in Facebook's backbone network, effectively disconnecting Facebook's data centers globally and causing the widespread outage.


According to a Facebook blog post, its systems are designed to audit commands like this to prevent mistakes, but a bug in the audit tool prevented it from stopping the command.
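The article does not describe how Facebook's audit tool works, so the following is only a minimal sketch of the general idea of a pre-execution audit, under stated assumptions. The function names `audit_command` and `run_with_audit`, the link names, and the `min_links_remaining` threshold are all hypothetical. The check refuses any change that would leave fewer than a minimum number of backbone links up.

```python
def audit_command(links_up, links_to_disable, min_links_remaining=1):
    """Return True if the change leaves enough links up (hypothetical rule)."""
    unknown = set(links_to_disable) - set(links_up)
    if unknown:
        # Refuse to reason about links the audit does not know about.
        raise ValueError("unknown links: " + ", ".join(sorted(unknown)))
    remaining = set(links_up) - set(links_to_disable)
    return len(remaining) >= min_links_remaining


def run_with_audit(links_up, links_to_disable, min_links_remaining=1):
    """Apply the change only if the audit passes; return the links left up."""
    if not audit_command(links_up, links_to_disable, min_links_remaining):
        raise PermissionError("audit rejected command: too few links would remain")
    return set(links_up) - set(links_to_disable)


if __name__ == "__main__":
    backbone = {"link-a", "link-b", "link-c"}

    # Disabling one link passes the audit.
    print(sorted(run_with_audit(backbone, {"link-a"})))

    # Disabling every link is blocked before anything is changed.
    try:
        run_with_audit(backbone, backbone)
    except PermissionError as err:
        print(err)
```

A guard like this only helps if the audit itself is correct. In Facebook's case, a bug in the audit tool let the command through.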

The total loss of connectivity made things worse for Facebook. Engineers trying to figure out what went wrong couldn't access the data centers through normal means because the networks were down, and the total loss of DNS broke many of the internal tools Facebook would normally use to investigate and resolve outages like this.

Engineers were sent onsite to the data centers to debug the issue and restart the systems. That took time, because data centers are designed with high levels of physical and system security in mind.

Once backbone network connectivity was restored, Facebook feared that turning everything back on at once would cause a surge in traffic. Data centers had reported dips in power usage during the outage, and suddenly reversing such a dip could have put everything from electrical systems to caches at risk. Services had to be brought back online gradually.
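Bringing services back gradually can be pictured as a staged ramp. The sketch below is an illustration only, not Facebook's recovery procedure; `ramp_schedule`, `admitted_load`, and the numbers used are hypothetical. It caps the traffic admitted at each stage to a growing fraction of capacity, so load rises in steps instead of all at once.

```python
def ramp_schedule(steps):
    """Return the fraction of capacity to enable at each stage, ending at 1.0."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [stage / steps for stage in range(1, steps + 1)]


def admitted_load(demand, capacity, fraction):
    """Admit demand only up to the currently enabled share of capacity."""
    return min(demand, capacity * fraction)


if __name__ == "__main__":
    demand = 1000    # hypothetical pent-up requests per second
    capacity = 800   # hypothetical full capacity in requests per second

    for fraction in ramp_schedule(4):
        load = admitted_load(demand, capacity, fraction)
        print(f"enabled {fraction:.0%} of capacity, admitting {load:.0f} req/s")
```

The design choice here is to limit admitted load rather than demand: users may all return at once, but the system decides how quickly to accept them.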

While Facebook continually stress tests its systems in exercises it calls storm drills, the company had never tested its global backbone being taken offline. “In the end, our services came back up relatively quickly without any further systemwide failures. And while we’ve never previously run a storm that simulated our global backbone being taken offline, we’ll certainly be looking for ways to simulate events like this moving forward,” said Santosh Janardhan, VP of Infrastructure at Facebook, in a blog post.

“We’ve done extensive work hardening our systems to prevent unauthorized access, and it was interesting to see how that hardening slowed us down as we tried to recover from an outage caused not by malicious activity, but an error of our own making,” he says.

“I believe a tradeoff like this is worth it: greatly increased day-to-day security vs. a slower recovery from a hopefully rare event like this. From here on out, our job is to strengthen our testing, drills, and overall resilience to make sure events like this happen as rarely as possible,” said Janardhan.

Tagged With: border gateway protocol, Data Center, DNS, Facebook, infrastructure, storm drills, stress testing
