11:40 GMT+1 / 03:40 PST
We are investigating issues related to ticket updates that are impacting some of our customers. More information to follow.
12:30 GMT+1 / 04:30 PST
We continue to investigate issues related to ticket updates. More updates to follow.
12:56 GMT+1 / 04:56 PST
We are still investigating issues related to ticket updates. We will post further updates as soon as possible.
14:05 GMT+1 / 06:05 PST
Our team is working on a fix for the issues related to ticket updates.
15:21 GMT+1 / 07:21 PST
Our team is still working on a fix for the issues related to ticket updates that are impacting some of our customers.
16:35 GMT+1 / 08:35 PST
We are happy to report that the issue related to ticket updates is now resolved. Postmortem to follow.
POSTMORTEM
At Zendesk, we use an Agent Collision Client (ACC) to manage agent collision and give visibility when multiple agents are working on one ticket. In September 2015, changes were made to ACC and later deployed to production, overwriting the working version and introducing a regression.
The underlying problem was that ACC was not broadcasting changes to ticket subscribers (agents looking at a ticket). When two agents were looking at a ticket and one of them updated it, the second agent did not receive the changes, so that agent's copy of the ticket became stale. When the second agent then tried to update the ticket, the save action failed because the ticket data stored on one service differed from the data stored on the other.
We should have tested two agents updating the same ticket at the same time. The problem occurred even when the same user was logged in on two browsers.
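The failure mode and the missing test can be illustrated with a small model. This is only a sketch under stated assumptions, not Zendesk's implementation: `TicketStore`, `CollisionChannel`, `AgentView`, the version counter, and the `broadcast_enabled` flag are all hypothetical names invented for illustration, and the version check stands in for whatever comparison caused the real save to fail.

```python
from dataclasses import dataclass, field


class StaleTicketError(Exception):
    """Raised when a save is based on an outdated copy of the ticket."""


@dataclass
class TicketStore:
    """Hypothetical server-side ticket guarded by a version counter."""
    status: str = "open"
    version: int = 1

    def save(self, status: str, based_on_version: int) -> int:
        # Reject the save if the caller's copy is older than the stored ticket.
        if based_on_version != self.version:
            raise StaleTicketError(
                f"save based on v{based_on_version}, store is at v{self.version}"
            )
        self.status = status
        self.version += 1
        return self.version


@dataclass
class CollisionChannel:
    """Hypothetical stand-in for ACC: notifies every viewer of a ticket."""
    broadcast_enabled: bool = True
    viewers: list = field(default_factory=list)

    def publish(self, status: str, version: int) -> None:
        if not self.broadcast_enabled:  # models the regression (assumption)
            return
        for viewer in self.viewers:
            viewer.refresh(status, version)


class AgentView:
    """One agent (or one browser tab) looking at the ticket."""

    def __init__(self, store: TicketStore, channel: CollisionChannel) -> None:
        self.store, self.channel = store, channel
        self.status, self.version = store.status, store.version
        channel.viewers.append(self)

    def refresh(self, status: str, version: int) -> None:
        self.status, self.version = status, version

    def update(self, status: str) -> None:
        new_version = self.store.save(status, self.version)
        # The updating agent always sees its own change.
        self.status, self.version = status, new_version
        # Other viewers only learn about it through the broadcast.
        self.channel.publish(status, new_version)


def two_agents_can_update(broadcast_enabled: bool) -> bool:
    """Two viewers update the same ticket one after the other."""
    store = TicketStore()
    channel = CollisionChannel(broadcast_enabled=broadcast_enabled)
    first, second = AgentView(store, channel), AgentView(store, channel)
    first.update("pending")
    try:
        second.update("solved")
    except StaleTicketError:
        return False
    return store.status == "solved"
```

In this model, `two_agents_can_update(True)` returns `True`, while `two_agents_can_update(False)` returns `False` because the second viewer never receives the first update and saves from a stale copy. A check of this shape, run with two sessions against the same ticket, is the kind of test that would have flagged the regression before the deploy.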
The incident was resolved by rolling back the deploy. To prevent this from happening again, we will work with the QA team to add more thorough tests for Agent Collision and improve the processes that help us detect regressions.
FOR MORE INFORMATION
Please subscribe to this article for regular updates until the issue is resolved. If you don't already follow our Twitter feed, we encourage you to do so in order to get the most current information about any service issues. We also record all site outages on our system status page, where you can see the past 12 months of service uptime. If you have questions about this issue, please open a ticket with us by sending a note to firstname.lastname@example.org.