01:24 UTC | 18:24 PT
The service incident on Pods 5, 6, and 9 has been resolved. Impact for most customers ended at 11:32 AM PDT.
20:35 UTC | 13:35 PT
We are stable for all but a few customers and continue to work towards full resolution. We will update when new information is available.
19:55 UTC | 12:55 PT
The majority of our customers with stability issues on Pods 5, 6, and 9 are seeing improvement. We are monitoring the situation.
19:19 UTC | 12:19 PT
We continue to see steady improvement in stability for our Pod 5, 6, and 9 customers. Please let us know if you are still having issues.
18:49 UTC | 11:49 PT
We are continuing to see improvement on Pods 5, 6, and 9 and are working towards a full resolution of the stability issues.
18:15 UTC | 11:15 PT
We are seeing improvements in stability in Pods 5 and 9, and now 6. We are continuing to monitor the situation and will update as we know more.
17:40 UTC | 10:40 PT
We are seeing service improvements in pods 5 and 9. We are still investigating the issue and will update shortly.
17:16 UTC | 10:16 PT
We continue to investigate service issues in pods 5, 6, & 9. Affected accounts may experience delays of up to 10 minutes on outbound emails.
16:57 UTC | 09:57 PT
We're investigating performance and availability issues in pods 5, 6, and 9. This is causing latency in all apps and dropped calls in talk.
The trigger for this service incident was a spike in network traffic that began at 9:33 AM PDT and exceeded the capacity of our datacenter firewall for Pods 5, 6, and 9. To resolve the incident, we restricted connections on Pods 5, 6, and 9, allowed the firewall to recover, and gradually re-introduced each Pod to ensure the service remained stable. In our post-mortem investigation, we identified an immediate need for network infrastructure and capacity improvements, which will be deployed this week (including through emergency maintenance this coming weekend). We also deployed a change to Help Center caching that reduces the risk of similar traffic spikes in the future. Finally, we identified several diagnostic tools we need in order to detect and respond to unexpected traffic spikes before they have a significant customer impact.
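As an illustration only, the recovery approach described above (restrict connections, let the firewall recover, then re-introduce each Pod gradually while checking stability) could be sketched roughly as follows. This is not the tooling used during the incident. The function `reintroduce_pods`, the load threshold, the Pod ordering, and all load figures are hypothetical and chosen purely to make the example runnable.

```python
from typing import Callable, Iterable, List


def reintroduce_pods(
    pods: Iterable[int],
    firewall_load: Callable[[], int],
    enable_pod: Callable[[int], None],
    disable_pod: Callable[[int], None],
    max_load: int = 80,  # hypothetical threshold, in percent of firewall capacity
) -> List[int]:
    """Re-enable pods one at a time, stopping if the firewall load gets too high.

    Returns the pods that were restored and left enabled.
    """
    restored: List[int] = []
    for pod in pods:
        enable_pod(pod)
        if firewall_load() > max_load:
            # Restoring this pod pushed the firewall past the threshold:
            # roll it back and stop so the firewall can recover further.
            disable_pod(pod)
            break
        restored.append(pod)
    return restored


# Simulated environment. The numbers are made up for the example and do
# not reflect real traffic levels from the incident.
LOAD_PER_POD = {5: 30, 9: 30, 6: 30}  # percent of capacity each pod adds
active = set()


def simulated_load() -> int:
    """Total simulated firewall load from the currently enabled pods."""
    return sum(LOAD_PER_POD[p] for p in active)


restored = reintroduce_pods([5, 9, 6], simulated_load, active.add, active.discard)
print(restored)  # [5, 9]
```

In this simulation the third Pod would push the load to 90 percent, so it is rolled back and left for a later attempt. Checking the load after each step is what keeps a single re-introduction from overwhelming the firewall again.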
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.