On Friday 11/2/18 from 5AM to 5:13AM PDT, UserVoice experienced downtime.
During the incident, end users and admins would have seen 500 errors. They wouldn’t have been able to load the admin console, use the API, interact with ideas on the front end, or use the widget or Contributor Sidebar. Email would have been delayed, but no emails were lost.
We saw an issue similar to last Friday’s incident, where one of the servers in UserVoice's database cluster experienced an application stall event. This caused a pause in database writes. Our engineering team manually removed the affected node to allow the cluster to resume operation.
Since last week, our team has been focused on finding the root issue that is causing one of our database clusters to stall. This work is still in progress. Once the root issue is identified, we will implement a fix and update this report with what we have discovered.
In the meantime, we have put increased alerting in place so that, should the issue recur, we will identify it immediately.
We understand that UserVoice being down is an interruption to you and your team, and that it impacts your workflows. We take this downtime seriously, and we are all hands on deck to get this issue fully addressed so we can prevent it from happening again.
If you have any questions or feedback for us about the incident, please don’t hesitate to contact me at firstname.lastname@example.org.