As of 10:40 AM GMT / 02:40 AM PST we're investigating the following service incident:
We are investigating Voice & Latency issues affecting some customers. More information shortly.
11:02 AM GMT / 03:02 AM PST
We are continuing to investigate performance issues, mainly affecting Voice and email for some customers. Apologies for the inconvenience.
11:19 AM GMT / 03:19 AM PST
Our Operations team are actively investigating the ongoing performance issues with Zendesk services. We will update with new info ASAP.
11:57 AM GMT / 03:47 AM PST
We continue to investigate the ongoing performance issues with Zendesk services. We will provide more information as soon as possible.
12:26 PM GMT / 04:26 AM PST
Our operations team continue to investigate the service issues affecting some customers.
12:59 PM GMT / 04:59 AM PST
We are making progress with the investigation regarding service issues affecting some customers. More information ASAP.
01:34 PM GMT / 05:34 AM PST
We are beginning to see improvement with the service issues affecting some customers. More info ASAP.
02:05 PM GMT / 06:05 AM PST
We continue to see improvement with Zendesk services. Additional updates to follow as we progress to full resolution.
02:13 PM GMT / 06:13 AM PST
Some customers may experience delays with inbound/outbound email as our mail queues clear.
02:46 PM GMT / 06:46 AM PST
We are happy to report the service incident affecting some customers is now resolved. Post-mortem to follow as soon as information is available.
This incident affected our data centers located on the East Coast of the US and resulted in a widespread outage affecting multiple services and customers. The issue was caused by a customer's integration between their Zendesk instance and their mobile application, which, due to a misconfiguration, generated thousands of requests per second for that one account. This volume of requests overwhelmed our firewall and, in turn, prevented customers whose accounts were based in the affected data centers from accessing them.
Our Operations team performed multiple investigative steps to identify the issue once it was reported. The uniqueness of the situation contributed to the length of the incident, as it took time to determine that the cause was unintentional overuse by a single customer account and not malicious activity. Once the issue was found, we blocked and segregated the customer's traffic until the cause of their issue could be solved, and services then started to return to a normal state.
Moving forward, we are working to implement better monitoring at the firewall level to gain better visibility if a similar issue occurs again. We are also working to improve rate limiting so that a misconfigured mobile application cannot inadvertently generate large volumes of traffic.
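For readers building their own integrations, the idea behind per-account rate limiting can be illustrated with a minimal token bucket sketch. This is an illustration only and not the production implementation; the class names, the rate and the burst capacity are all hypothetical values chosen for the example.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate            # tokens added per second (hypothetical value)
        self.capacity = capacity    # maximum burst size (hypothetical value)
        self.tokens = capacity      # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if one more request may pass, False if it should be rejected."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerAccountLimiter:
    """Keep one bucket per account so a single noisy account cannot starve others."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, account_id, now=None):
        bucket = self.buckets.get(account_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[account_id] = bucket
        return bucket.allow(now)
```

With a limiter like this placed in front of shared infrastructure, requests from an account that exceeds its allowance are rejected early (for example with an HTTP 429 response), while traffic from other accounts continues to pass.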
FOR MORE INFORMATION
Please subscribe to this article for regular updates until the issue is resolved. If you aren't subscribed to our Twitter feed, we encourage you to do so in order to get the most current information about any service issues. We also record all site outages on our system status page where you can see the past 12 months of service uptime. If you have questions about this issue, please open a ticket with us by sending a note to email@example.com.