Errors in admin console and web portals
Incident Report for UserVoice
Postmortem

On October 26th, between 10:20 and 10:35 PDT, UserVoice experienced an infrastructure issue that caused intermittent system unavailability.

Business Impact

During the outage, end users and admins may have been unable to load or interact with UserVoice sites or widgets.

Root Cause

One of the servers in UserVoice's database cluster experienced an application stall event. Due to a misconfiguration of our cluster, this stall paused database writes across the cluster. Our engineers had to manually remove the affected node before the cluster could resume writing.

What We Are Doing to Prevent This

We have updated our database cluster configuration to more aggressively monitor and remove cluster members that are non-performant.

This incident interrupted your team's work in UserVoice and your end users' ability to view and submit feedback. We are sorry for the disruption this caused. We have already put improvements in place to prevent this type of issue from happening again.

If you have any questions, don’t hesitate to reach out at claire.talbott@uservoice.com.

Claire Talbott

Support Manager

Posted Oct 31, 2018 - 10:03 EDT

Resolved
This incident has been resolved.
Posted Oct 26, 2018 - 14:13 EDT
Monitoring
The system has recovered and is operating normally. We'll be investigating to determine the root cause and will follow up with a postmortem.
Posted Oct 26, 2018 - 13:43 EDT
Update
We are continuing to investigate this issue.
Posted Oct 26, 2018 - 13:31 EDT
Investigating
We are investigating 520 and 502 errors in UserVoice admin consoles and on web portals.
Posted Oct 26, 2018 - 13:30 EDT
This incident affected: Web Portal (subdomain) and Admin Console.