19:29 UTC | 11:29 PT
Performance issues affecting Pod 5 have now been resolved.
18:41 UTC | 10:41 PT
We are continuing to investigate the issues on Pod 5. Twitter, Facebook, and Channel Framework (incl. Google Play) may still be impacted.
18:05 UTC | 10:05 PT
We're continuing to investigate issues affecting Pod 5. Channels may still be impacted. More information to follow as we uncover more details.
17:36 UTC | 09:36 PT
Talk performance has continued to stabilize on Pod 5 while we investigate the root cause.
17:36 UTC | 09:36 PT
Twitter, Facebook, and Channel Framework (including Google Play) may be impacted on Pod 5. We are actively working to resolve the issue.
17:02 UTC | 09:02 PT
We are beginning to see improvement in the Talk performance issues on Pod 5. We continue to investigate and monitor this issue.
16:34 UTC | 08:34 PT
We are still monitoring the performance issues on Pod 5. There have been some reports of Talk issues; we are continuing to investigate.
15:25 UTC | 07:25 PT
We are continuing to monitor Pod 5 closely until full resolution. Service is stable at this time.
13:56 UTC | 05:56 PT
Service on Pod 5 is stable. We are monitoring closely until full resolution.
12:33 UTC | 04:33 PT
Performance on Pod 5 is currently stable, but we are continuing to monitor the situation.
11:38 UTC | 03:38 PT
Performance on Pod 5 is stable; however, we are continuing to monitor the situation.
11:11 UTC | 03:11 PT
We are seeing further improvement in the performance issues on Pod 5. We are monitoring the situation closely. More info to follow.
10:44 UTC | 02:44 PT
The performance issues on Pod 5 have improved. Thank you for your patience while we continue our remediation work.
10:14 UTC | 02:14 PT
We are continuing to make progress resolving the performance issues on Pod 5. More info to follow.
09:59 UTC | 01:59 PT
We are still working on performance issues with Ticket Updates and Talk on Pod 5. More info to follow.
09:41 UTC | 01:41 PT
We are currently working on performance issues on Pod 5. More info to follow.
POST-MORTEM
During this incident, customers experienced degraded performance across multiple functionalities, which delayed the processing of jobs from all applications. A bug in the trigger logic in a Zendesk IP caused a specific trigger to create outgoing Direct Messages when it should have created public Tweets. These outgoing DMs were subsequently imported as ticket comments; triggers then ran on those comments, generated another outgoing DM, and so created a loop. This loop drove up memory usage in our memory stores until they became unavailable.

To remediate the incident and return the service to normal, we cleaned up the enqueued tasks to free memory. We also manually updated items in the database so that jobs wouldn't pick them up again and re-enqueue themselves. Once the surge of requests was identified, we disabled the offending trigger, which temporarily resolved the issue until we fixed the bug to prevent a recurrence.
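To make the loop mechanism concrete, here is a minimal, hypothetical Python sketch (not Zendesk's actual code): a buggy trigger replies to each imported comment with a DM, the channel framework re-imports every DM as a new comment, and an author-based guard shows one common way such a loop can be broken. All names (`Ticket`, `run_trigger`, `import_dm`, `process`) are illustrative assumptions.

```python
from collections import deque


class Ticket:
    def __init__(self):
        self.comments = []  # imported messages become ticket comments


def run_trigger(ticket, comment):
    """Buggy trigger: replies with an outgoing DM instead of a public Tweet."""
    return {"type": "dm", "body": f"re: {comment['body']}", "author": "integration"}


def import_dm(dm):
    """Channel framework: every outgoing DM is re-imported as a ticket comment."""
    return {"body": dm["body"], "author": dm["author"]}


def process(ticket, first_comment, guard_loops=True, max_steps=10):
    """Drain the work queue; without the guard, each comment enqueues another."""
    queue = deque([first_comment])
    steps = 0
    while queue and steps < max_steps:  # max_steps stands in for memory exhaustion
        comment = queue.popleft()
        ticket.comments.append(comment)
        # Guard: skip triggers on comments the integration itself created,
        # so a trigger's own output can never re-fire the trigger.
        if guard_loops and comment["author"] == "integration":
            continue
        dm = run_trigger(ticket, comment)
        queue.append(import_dm(dm))  # DM re-imported -> trigger runs again
        steps += 1
    return len(ticket.comments)


print(process(Ticket(), {"body": "hello", "author": "end-user"}, guard_loops=True))   # 2: loop broken
print(process(Ticket(), {"body": "hello", "author": "end-user"}, guard_loops=False))  # grows until max_steps
```

Without the guard, the queue grows without bound, mirroring the memory exhaustion described above; the `max_steps` cutoff merely stands in for that limit in this sketch.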
FOR MORE INFORMATION
For current system status information about your Zendesk, check out our system status page. During an incident, you can also receive status updates by following @ZendeskOps on Twitter. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.