The Facebook Outage – Whalebone’s Experience
Monday’s widely reported outage at Facebook affected not only the eponymous platform but also its WhatsApp, Instagram, and Oculus products, cutting them off from the internet completely for several hours at a cost of hundreds of millions of dollars in advertising revenue for the firm. It also created a certain amount of frustration, panic, and not a few wild theories among users of the four networks.
The outage was originally reported as a DNS issue, but the root cause was eventually determined to be a faulty internal configuration update that affected Facebook’s Border Gateway Protocol (BGP) routing. Put briefly, BGP is the Internet’s navigation system: each network (like Facebook) advertises its location so that traffic can be routed to it. It works hand in hand with DNS, sometimes called “the Internet’s phone book”, to move traffic around and make sure that we can all seamlessly access information, cat memes, and, of course, Facebook when we want.
This BGP update had a knock-on effect on DNS resolvers worldwide. Because Facebook was essentially no longer advertising its location on the internet, requests to reach Facebook’s DNS servers failed. Resolvers often repeat a request when it fails, so in some cases a request was made up to 20 times before the resolver gave up. Combined with “standard user behavior” (not accepting an error and closing and relaunching the Facebook or Instagram apps) and the prevalence of Facebook integrations on other, unrelated sites, this created a huge increase in requests for Facebook’s DNS, which in some cases crashed third-party networks due to the volume of requests and responses. The effect was similar to that of a malicious DDoS attack.
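To get a feel for how quickly retries and relaunches multiply, here is a rough back-of-the-envelope sketch. Only the figure of up to 20 attempts per request comes from the description above; the user counts, lookups per user, and relaunch counts are made-up illustrative numbers, and the function name `estimate_queries` is hypothetical. The model also ignores caching of failed answers, which would reduce the load in practice.

```python
def estimate_queries(users, lookups_per_user, attempts_per_lookup=1,
                     relaunches_per_user=0):
    """Rough estimate of DNS queries a resolver sends upstream.

    Each relaunch of an app repeats the user's lookups, and each
    lookup may be attempted several times before the resolver gives up.
    """
    user_lookups = users * lookups_per_user * (1 + relaunches_per_user)
    return user_lookups * attempts_per_lookup


# Illustrative numbers only (assumptions, not measurements).
normal = estimate_queries(users=1000, lookups_per_user=2)
outage = estimate_queries(users=1000, lookups_per_user=2,
                          attempts_per_lookup=20,   # "up to 20 times"
                          relaunches_per_user=3)    # assumed user behavior
amplification = outage / normal

print(f"normal: {normal}, outage: {outage}, amplification: {amplification:.0f}x")
```

Even with modest assumptions, the same group of users can generate many times the usual query volume, which is why the effect on resolvers resembled a DDoS attack.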
At Whalebone, we care strongly about protecting both your network and your users’ devices, whether from malicious activity or from an unintended outage like the one we saw on Monday. We were immediately aware of the issue, not only from the news but also from the spike in traffic that we saw through our resolvers. Though we never reached a level of traffic that would delay our customers’ access to the internet, we continued to monitor it, ready to take action should an issue arise.
In the graph below, you can see an example of what we saw in our DNS requests during the outage. Red represents the failures that resulted from the outage, and green represents the successful requests.
To learn more about how Whalebone helps your users browse without interruption, whether from security threats or from unintended outages, schedule a demo call with our sales representatives.