Aduro is currently experiencing a service outage caused by an ongoing Amazon Web Services (AWS) API Gateway degradation incident. The incident is causing widespread outages for many software vendors that rely on AWS.
Per Amazon: “10:59 AM PDT We continue to see elevated error rates and latencies for invokes on API Gateway endpoints in the US-WEST-2 Region. While engineers continue to work towards root cause, we have deployed traffic filters from sources with significant increases in traffic prior to the event. As a result of these traffic filters, we are seeing a reduction in error rates and latencies, but continue to work towards full recovery. Although error rates are improving, we do not yet have an ETA for full recovery. The issue is also affecting API requests to some AWS services, including those listed below. Amazon Connect is experiencing increased failures in handling new calls, chats, and tasks as well as issues with user login in the US-WEST-2 Region. We will continue to provide updates as we progress.”
More information here: https://health.aws.amazon.com/health/status
We apologize for any inconvenience this has caused and are working with AWS to restore service. We will update you as we learn more.
-----------------------------------------------------------------------------------------------------------------
Amazon has identified the root cause of their AWS outage and advises that service may continue to be intermittent. They do not have mitigations to recommend to vendors such as Aduro, but advise that service is likely to improve over the next 2 hours as they deploy corrective measures. We continue to monitor and respond to all inbound support requests from your members and employees.
Per Amazon: “12:26 PM PDT We continue to see an improvement in error rates and latencies for invokes on API Gateway endpoints in the US-WEST-2 Region, but have not fully resolved the issue. While our mitigations have improved error rates and latencies, we have also identified the root cause of the event. The subsystem responsible for request processing experienced increased load, which ultimately led to contention of a component within the affected subsystem. Engineers have been working to resolve the contention of the affected component, which has led to a reduction of error rates and latencies. The path to full recovery involves addressing the contention across the subsystem, which we are currently doing. As that progresses over the next two hours, we expect recovery to continue to improve. Customers with applications that use API Gateway will be experiencing elevated error rates and latencies as a result of this issue. Lambda is not affected by this event, but customers using API Gateway as an HTTP endpoint for Lambda will experience increased error rates and latencies. Other AWS services listed below are also experiencing elevated error rates as a result of this issue. For customers that have dependencies on API Gateway and are experiencing error rates, we do not have any mitigations to recommend to address the issue on the customer side. We do expect error rates to continue to improve as contention with the affected subsystem resides, and will provide further updates as recovery progresses.”
More information here: https://health.aws.amazon.com/health/status