Downgraded Performance

Incident Report for Mews

Postmortem

Overview

On September 8th, 2023, between 09:18 and 13:28 UTC, users experienced a temporary slowdown while using Mews services such as Mews Operations and Mews Guest Journey. Data transfers to third parties, such as channel manager integrations, were also delayed.

Causes

During this timeframe, an internal component that processes asynchronous tasks, such as payments and emails, caused increased load on our database, leading to performance degradation. We traced the root cause to a particular component that could not complete its task, which resulted in excessive locking of shared resources.

Action

We detected the performance issue at 09:18 UTC, and our dedicated team promptly investigated. We eliminated multiple possible contributing factors and implemented a solution to restore normal service. At 11:00 UTC, we mitigated the issue and the user experience returned to normal. At 13:30 UTC, the asynchronous tasks returned to normal as well.

Solutions

To prevent similar incidents from occurring in the future, we will be updating the problematic component.

Our Commitment

We are committed to continuously enhancing our systems and processes to provide a seamless experience. We sincerely apologize for the inconvenience caused to our users.

Posted Sep 25, 2023 - 13:07 CEST

Resolved

This incident has been resolved.
Posted Sep 08, 2023 - 22:40 CEST

Monitoring

Our latest mitigation efforts have improved the responsiveness of the Mews Application. We are now monitoring performance.
Delayed automated processes, such as channel manager updates and emails, will be processed automatically; no manual action is needed.
Regarding payment updates, we are still reviewing the payment states and will confirm them as soon as they are final.
Posted Sep 08, 2023 - 15:56 CEST

Update

As part of this incident, we have observed delays in automated processes such as channel manager updates, emails, and automatic settlements.
Posted Sep 08, 2023 - 15:27 CEST

Update

Mitigation efforts have led to improvements in customer experience metrics such as app responsiveness. As a result, the customer experience should return closer to normal. We are still investigating the underlying issue.
Posted Sep 08, 2023 - 14:28 CEST

Update

While the investigation of the underlying cause is still ongoing, we are increasing the resources available for our database to mitigate the issue.
Posted Sep 08, 2023 - 13:13 CEST

Update

We have rolled back the recent change to the pre-deployment version. We did not observe sufficient improvement and will continue investigating other possible causes.
Posted Sep 08, 2023 - 12:10 CEST

Identified

We have identified a recent change that is causing high utilization of our database. We are reverting to the previous version.
Posted Sep 08, 2023 - 11:44 CEST

Investigating

We have noticed degraded performance of the system.
We are currently looking into it and will provide an update as soon as possible.
Posted Sep 08, 2023 - 11:40 CEST
This incident affected: Operations, Guest Journey, Business Intelligence, Payments, Open API, and Marketplace.