Call timeouts, failures, and no audio
Incident Report for DialogTech
Postmortem

DialogTech Partial Call Interruption - Postmortem

Issue Description: 

As part of an infrastructure modernization project, we have been migrating the DialogTech platform out of a physical data center and into a cloud provider. We migrated the telephony portion of the platform to the cloud on June 25th without material incident. On July 5th, very high post-holiday call volume exposed a misconfiguration that had been introduced earlier in the migration. The misconfiguration capped the number of simultaneous calls the platform could handle at a level below both the platform's actual capacity and the peak call volume on July 5th. As a result, a large portion of callers received dead air, busy signals, or endless ringing during the incident window.

Corrective Actions:

To prevent similar issues in the future, Invoca is currently auditing the configuration of all critical infrastructure associated with the DialogTech platform to ensure it is correct and stored in our versioned configuration management system. Additionally, we are improving end-to-end observability within the DialogTech platform so that we can better understand the system state and reduce recovery time when incidents occur.
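
As a rough illustration of the kind of check such an audit could include, the sketch below compares a configured simultaneous-call limit against the platform's capacity and an expected peak. Every name and number in it (CapacityCheck, audit_call_limit, the 20% headroom) is hypothetical and chosen for illustration only; it does not describe the actual DialogTech configuration or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityCheck:
    # All three fields are hypothetical values an audit might collect.
    configured_limit: int   # simultaneous-call cap set in configuration
    platform_capacity: int  # simultaneous calls the platform can actually handle
    expected_peak: int      # forecast peak of simultaneous calls


def audit_call_limit(check: CapacityCheck, headroom_pct: int = 20) -> list[str]:
    """Return a list of findings; an empty list means no problem was detected."""
    findings = []
    # Peak plus headroom, rounded up, using integer arithmetic only.
    required = (check.expected_peak * (100 + headroom_pct) + 99) // 100
    if check.configured_limit < check.platform_capacity:
        findings.append("configured limit is below platform capacity")
    if check.configured_limit < required:
        findings.append("configured limit is below expected peak plus headroom")
    return findings


if __name__ == "__main__":
    # Illustrative numbers only: a cap of 500 against a forecast peak of 1000.
    for finding in audit_call_limit(CapacityCheck(500, 2000, 1000)):
        print("FINDING:", finding)
```

A check like this could run whenever configuration changes are committed, so that an artificially low limit is flagged before peak traffic exposes it.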

To request a full root cause analysis for this event, please contact support@dialogtech.com. At Invoca, we do not take incidents like these lightly. We continually strive to do better and provide the best possible customer experience. We sincerely apologize for any impact or inconvenience that may have resulted from this incident. The Invoca team is here to help if you have follow-up questions or concerns.

Posted Jul 11, 2022 - 10:55 CDT

Resolved
Throughout the day, we've had no further call failures as a result of this incident or the associated repair work.
Posted Jul 06, 2022 - 17:56 CDT
Update
We've made a configuration change in our call processing platform. We have not observed any call processing failures since then and continue to monitor.
Posted Jul 06, 2022 - 10:31 CDT
Update
Calls have been connecting normally without timeouts for approximately the past hour. We're still monitoring and working towards resolution.
Posted Jul 05, 2022 - 16:32 CDT
Update
Intermittent call failures continue; we're still working to resolve this.
Posted Jul 05, 2022 - 14:22 CDT
Update
Call failures and timeouts are occurring intermittently.
Posted Jul 05, 2022 - 13:31 CDT
Monitoring
We've mitigated the call processing issue, and calls are processing normally at this time. We're continuing to investigate the causes of this incident and will provide more information after resolution.
Posted Jul 05, 2022 - 12:52 CDT
Identified
We're working to fix call processing issues now. Most calls will complete at this time.
Posted Jul 05, 2022 - 12:40 CDT
Update
We're still investigating.
Our previous update indicated issues with call transport; in actuality, this issue affects our ability to process and complete calls.
Posted Jul 05, 2022 - 12:34 CDT
Investigating
We're looking into problems with call processing. Some calls may fail to complete, time out, or have no audio.
Posted Jul 05, 2022 - 12:14 CDT
This incident affected: DialogTech Platform - Call Processing.