Inbound and outbound call failures
Incident Report for Gradwell Communications Ltd
Postmortem

Description of outage and impact:

On Tuesday 23 July, Gradwell engineers were alerted to call setup failures affecting new inbound and outbound calls. Calls already in progress were not affected. The incident was given priority one status, and Gradwell engineers were fully engaged throughout.

Root Cause & Resolution:

The root cause of the issue was traced to a fault in a vendor's software, causing a cascade failure which affected the system's ability to create new calls. This type of failure occurs when an initial error triggers a series of additional failures in the system. Fortunately, Gradwell’s automated monitoring and response systems were already in place, allowing us to quickly detect and resolve the problem without manual intervention.

Prevention of recurrence:

Gradwell has identified further enhancements to speed up the automated processes that resolve these types of failures. We have also identified other improvements to better isolate our systems from the vendor. Please accept Gradwell’s apologies for the service disruption and the impact it has had on your business.

Posted Aug 02, 2024 - 14:58 BST

Resolved
Hello,

The incident has been resolved. We will provide a Reason for Outage (RFO) report within 10 working days.

Kind Regards,
Gradwell Service Team
Posted Jul 23, 2024 - 17:49 BST
Monitoring
Hello,

We have implemented a fix and we can see the call traffic returning to normal. Please retest and call our support teams if your calls are still experiencing any issues.

Gradwell apologises for any inconvenience and disruption this has caused you and wants to thank you for your patience.
Posted Jul 23, 2024 - 17:27 BST
Investigating
Hello,

We are seeing further reports of inbound call failures. We are currently investigating at the highest priority.

Gradwell apologises for any inconvenience and disruption this has caused you and wants to thank you for your patience whilst we investigate and resolve this incident.

We will provide a further update within the next hour.
Posted Jul 23, 2024 - 16:41 BST
Identified
Inbound calls to numbers routed through one upstream supplier are continuing to fail. Inbound calls to numbers routed through other suppliers are routing correctly.

Please be advised this has been escalated to the highest priority & our teams are working to restore the service as soon as possible.

Gradwell apologises for any inconvenience and disruption this has caused you and wants to thank you for your patience whilst we investigate and resolve this incident.

We will provide a further update within the next hour.
Posted Jul 23, 2024 - 16:39 BST
Monitoring
Hello,

We have implemented a fix and we can see the call traffic returning to normal. Please retest and call our support teams if your calls are still experiencing any issues.

Gradwell apologises for any inconvenience and disruption this has caused you and wants to thank you for your patience.
Posted Jul 23, 2024 - 16:01 BST
Investigating
Hello,

We are aware of inbound and outbound call failures. Please be advised this has been escalated to the highest priority & our teams are working to restore the service as soon as possible.

Gradwell apologises for any inconvenience and disruption this has caused you and wants to thank you for your patience whilst we investigate and resolve this incident.

We will provide a further update within the next hour.

Kind Regards,
Gradwell Communications
Posted Jul 23, 2024 - 15:52 BST
This incident affected: Portals and Website (Gradwell customer control panels) and Voice & Calls Services (Multi User VoIP, Outbound SIP Trunking, Inbound SIP trunking, Single User VoIP, Wave).