As of 02:27 UTC / 18:27 PST we're investigating the following service incident:
We received several reports from customers hosted in our US West Coast datacenter that they were having trouble accessing their accounts.
03:09 UTC / 19:09 PST:
Shortly after the issue was reported, our team identified the root cause and resolved it immediately. A Post Mortem will be available shortly.
A database host in our US West Coast datacenter suffered a kernel panic and became unresponsive at approximately 02:10 UTC on Sunday, 2015-11-22. Customers whose accounts were affected by the failure were impacted for approximately 35 minutes and reported loss of application functionality; in particular, all writes to the affected database cluster were unavailable.
The incident lasted longer than expected because our Database Admin team had to be alerted manually by a member of our Operations team. Once alerted, the Database Admin team promptly restored the availability of the affected database cluster. Correcting the flaws in the alerting system that prevented the team from being alerted automatically when the kernel panic occurred could reduce the downtime further, though not significantly.
However, additional long-term steps to address kernel panics on the server are being planned. These include collecting and analyzing the crash data leading up to the kernel panic and, where appropriate, filing bug reports with the respective vendors.
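For readers curious what automatic alerting for an unresponsive host can look like, here is a minimal sketch in Python. It is illustrative only and is not the monitoring system used in our production environment. The names `probe_tcp`, `watch`, and `page`, as well as the threshold and interval values, are assumptions made for this example. The idea is to probe the database port on a fixed interval and page the on-call team once several consecutive probes fail, so that a host stuck in a kernel panic is reported without waiting for a human to notice.

```python
import socket
import time
from typing import Callable


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def watch(probe: Callable[[], bool],
          page: Callable[[str], None],
          checks: int,
          threshold: int = 3,
          interval: float = 10.0) -> int:
    """Run `checks` probes and page when `threshold` consecutive probes fail.

    Pages at most once per outage; the alert re-arms after a successful
    probe. Returns the number of pages sent.
    """
    failures = 0
    paged = False
    pages_sent = 0
    for _ in range(checks):
        if probe():
            failures = 0
            paged = False  # host recovered, so re-arm the alert
        else:
            failures += 1
            if failures >= threshold and not paged:
                page(f"host unresponsive for {failures} consecutive checks")
                paged = True
                pages_sent += 1
        if interval > 0:
            time.sleep(interval)
    return pages_sent
```

A hypothetical call might look like `watch(lambda: probe_tcp("db-host.example", 5432), notify_dba_team, checks=360)`, where the host name, port, and `notify_dba_team` are placeholders. Requiring several consecutive failures before paging trades a short detection delay for fewer false alarms caused by a single dropped probe.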
Some helpful info regarding Kernel Panic (source: Ubuntu forums):
"The main point in the whole kernel panic is to protect your computer. The kernel freezes not only because it failed to do something, but also in order to prevent your computer from f.e. overheating, hard drives corruption, and any other hardware problems, that may occur, if some incorrect orders are executed, of a module (for example a module responsible for controlling the fan) failed to load, etc. This is why the kernel prefers to freeze, than to overcome the problem."
FOR MORE INFORMATION
Please subscribe to this article for regular updates until the issue is resolved. If you aren't subscribed to our Twitter feed, we encourage you to do so in order to get the most current information about any service issues. We also record all site outages on our system status page where you can see the past 12 months of service uptime. If you have questions about this issue, please open a ticket with us by sending a note to firstname.lastname@example.org.