DialogTech Partial Call Interruption - Post Mortem
Issue Description:
As part of an infrastructure modernization project, we have been migrating the DialogTech platform from a physical data center to a cloud provider. We migrated the telephony portion of the platform to the cloud on June 25th without a material incident. On July 5th, very high post-holiday call volume exposed a misconfiguration that had been introduced earlier in the migration. The misconfiguration capped the number of simultaneous calls at a level below both the platform's actual capacity and the peak call volume on July 5th. As a result, a large portion of callers received dead air, busy signals, or unending ringing during the incident window.
Corrective Actions:
To prevent similar issues in the future, Invoca is auditing the configuration of all critical infrastructure associated with the DialogTech platform to ensure that it is correct and stored in our versioned configuration management system. We are also improving end-to-end observability within the DialogTech platform so that we can better understand the system state and reduce recovery time when incidents occur.
To request a full root cause analysis for this event, please contact support@dialogtech.com. At Invoca, we do not take incidents like these lightly. We continually strive to do better and provide the best possible customer experience. We sincerely apologize for any impact or inconvenience that may have resulted from this incident. The Invoca team is here to help if you have follow-up questions or concerns.