• Home
  • About 404TS
  • Contact

404 Tech Support

Where IT Help is Found

  • Articles
    • Code
    • Entertainment
    • Going Green
    • Hardware, Gadgets, and Products
    • Management
    • Network
    • News
    • Operating Systems
    • Security and Privacy
    • Software
    • System Administration
    • Talking Points
    • Tech Solutions
    • Web
    • Webmaster
  • Reviews
  • Media
    • Infographics
    • Videos
  • Tech Events
  • Tools
    • How do I find my IP address?
    • Browser and plugin tests
  • Get a Technical Consultation
You are here: Home / Media / Facebook Suffers Worst Outage in Over Four Years Today

Facebook Suffers Worst Outage in Over Four Years Today

2010-09-23 by Jason

Facebook, the social networking site which had just recently surpassed Google as a web destination where people spend more time, suffered an outage today that made it extremely slow or unavailable to people for almost 2.5 hours. Facebook Engineering apologized for the downtime and explained in detail the cause of the problem.

The short version:

An automated validation of data was incorrectly triggering after Facebook Engineers implemented a new invalid status. Between the validation system and caching, the system looped into a locked up state of queries and the only way to stop the cycle was to shut down the site.

The long version:

Early today Facebook was down or unreachable for many of you for approximately 2.5 hours. This is the worst outage we’ve had in over four years, and we wanted to first of all apologize for it. We also wanted to provide much more technical detail on what happened and share one big lesson learned.

The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition. An automated system for verifying configuration values ended up causing much more damage than it fixed.

The intent of the automated system is to check for configuration values that are invalid in the cache and replace them with updated values from the persistent store. This works well for a transient problem with the cache, but it doesn’t work when the persistent store is invalid.

Today we made a change to the persistent copy of a configuration value that was interpreted as invalid. This meant that every single client saw the invalid value and attempted to fix it. Because the fix involves making a query to a cluster of databases, that cluster was quickly overwhelmed by hundreds of thousands of queries a second.

To make matters worse, every time a client got an error attempting to query one of the databases it interpreted it as an invalid value, and deleted the corresponding cache key. This meant that even after the original problem had been fixed, the stream of queries continued. As long as the databases failed to service some of the requests, they were causing even more requests to themselves. We had entered a feedback loop that didn’t allow the databases to recover.

The way to stop the feedback cycle was quite painful – we had to stop all traffic to this database cluster, which meant turning off the site. Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site.

This got the site back up and running today, and for now we’ve turned off the system that attempts to correct configuration values. We’re exploring new designs for this configuration system following design patterns of other systems at Facebook that deal more gracefully with feedback loops and transient spikes.

We apologize again for the site outage, and we want you to know that we take the performance and reliability of Facebook very seriously.

Google might have a shot at regaining the previously mentioned milestone with this downtime and the fact that thousands of people turned to Google as soon as Facebook failed to load. As a result of the outage, Facebook took up 11 spots in Google trends this evening as the outage wrapped up.

As of writing, Facebook is back up and operational.

Filed Under: Media, News

Trending

  • Social Networks and Language Usage
    In Infographics
  • 2011 Pwnie Award Winners Announced For Achievements and Failures In Security
    In News, Security and Privacy
  • Mozilla Gets Firefox 3.6.2 Out the Door Early
    In News, Security and Privacy, Software, System Administration

Latest Media Posts

Find Out Where To Download SNES ROMs

Find Out Where To Download SNES ROMs

Multifunctional Video Conversion Tools – Wondershare Video Converter

Multifunctional Video Conversion Tools – Wondershare Video Converter

  • Popular
  • Latest
  • Today Week Month All
  • Access to the resource [servershare] has been disallowed Access to the resource [servershare] has been disallowed
  • Read the Event Logs on Windows Server Core Read the Event Logs on Windows Server Core
  • Increase IIS Private Memory Limit to improve WSUS availability Increase IIS Private Memory Limit to improve WSUS availability
  • How to ‘Unblock’ multiple files at a time with PowerShell How to 'Unblock' multiple files at a time with PowerShell
  • Setup your DFS namespace with DNS for compatibility in a mixed environment Setup your DFS namespace with DNS for compatibility in a mixed environment
  • How Virtual Reality Supports Mental Health Therapy How Virtual Reality Supports Mental Health Therapy
  • Key Strategies of Successful Coin Listing on Exchange Key Strategies of Successful Coin Listing on Exchange
  • Keeping Your Mac Healthy: A Comprehensive Guide to Maintenance and Troubleshooting Keeping Your Mac Healthy: A Comprehensive Guide to Maintenance and Troubleshooting
  • Making Distributed Software Development Work: Strategies and Best Practices for Managing Remote Teams Making Distributed Software Development Work: Strategies and Best Practices for Managing Remote Teams
  • customer contactless payment for drink with mobile phon at cafe counter bar,seller coffee shop accept payment by mobile.new normal lifestyle concept The Latest Innovations In Payment Technology
Ajax spinner

Elevator Pitch

404 Tech Support documents solutions to IT problems, shares worthwhile software and websites, and reviews hardware, consumer electronics, and technology-related books.

Subscribe to 404TS articles by email.

Recent Posts

  • How Virtual Reality Supports Mental Health Therapy
  • Key Strategies of Successful Coin Listing on Exchange
  • Keeping Your Mac Healthy: A Comprehensive Guide to Maintenance and Troubleshooting

Search

FTC Disclaimer

404TechSupport is an Amazon.com affiliate; when you click on an Amazon link from 404TS, the site gets a cut of the proceeds from whatever you buy. This site also uses Skimlinks for smart monetization of other affiliate links.
Use of this site requires displaying and viewing ads as they are presented.

Copyright © 2025 · Magazine Pro Theme on Genesis Framework · WordPress · Log in