Increased error rate

Incident Report for Mews

Postmortem

Problem

The number of errors in production increased. Some clients were affected for 1 hour.

Action

During the investigation, the team noticed a recent deployment that consisted of several database migrations. Breaking and non-breaking migrations were executed in the wrong order, which caused columns in the Account and User tables to become out of sync. The team manually copied the correct values from the obsolete columns into the new columns.

Causes

When a data migration includes a column removal, it is split into several migrations: a non-breaking one that alters triggers and is executed immediately, and a breaking one that drops the column and is scheduled for later. The time gap between the two leads to out-of-sync data and errors.
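
The time gap can be reproduced in a small simulation. The sketch below is an illustration only, not the actual Mews schema or migration tooling: it uses an in-memory SQLite database, and the table, the column names (old_email, new_email) and the trigger name (sync_email) are all hypothetical. It also assumes the trigger copied values from the obsolete column into the new one, which the report does not state explicitly.

```python
import sqlite3
from typing import Tuple


def write_during_gap(trigger_dropped: bool) -> Tuple[str, str]:
    """Apply one write to the obsolete column and return (old_email, new_email).

    All names are hypothetical; this only models the gap between the
    immediate trigger change and the scheduled column drop.
    """
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE account (id INTEGER PRIMARY KEY, old_email TEXT, new_email TEXT);
        -- Sync trigger: mirrors every change of old_email into new_email.
        CREATE TRIGGER sync_email AFTER UPDATE OF old_email ON account
        BEGIN
            UPDATE account SET new_email = NEW.old_email WHERE id = NEW.id;
        END;
        INSERT INTO account (id, old_email, new_email)
        VALUES (1, 'a@example.com', 'a@example.com');
    """)
    if trigger_dropped:
        # Non-breaking step, executed immediately: the sync trigger goes away
        # while the obsolete column still exists and still receives writes.
        db.execute("DROP TRIGGER sync_email")
    # A write that arrives before the scheduled column drop has run.
    db.execute("UPDATE account SET old_email = 'b@example.com' WHERE id = 1")
    row = db.execute(
        "SELECT old_email, new_email FROM account WHERE id = 1"
    ).fetchone()
    db.close()
    return row


print(write_during_gap(trigger_dropped=False))  # both columns hold the new value
print(write_during_gap(trigger_dropped=True))   # new_email keeps the stale value
```

With the trigger still in place, both columns end up with 'b@example.com'. Once the trigger has been dropped, only old_email changes, which matches the situation in which correct values had to be copied from the obsolete columns into the new ones.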

Solutions

We will adjust the migration logic so that trigger modifications run together with the column removal in the same migration.
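
One way to picture the adjusted logic is a planner that keeps a trigger change in the same scheduled migration as the column drop it belongs to. The following is a minimal sketch under stated assumptions, not the real migration tooling: the Step type, the step kinds and the plan function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    kind: str    # "alter_trigger" or "drop_column" (hypothetical step kinds)
    table: str
    column: str  # the column this step concerns


def plan(steps: List[Step]) -> Tuple[List[Step], List[Step]]:
    """Split steps into (immediate, scheduled) migrations.

    A trigger change that concerns a column being dropped is placed in the
    scheduled migration next to the drop, so no time gap opens between them.
    """
    dropped = {(s.table, s.column) for s in steps if s.kind == "drop_column"}
    immediate: List[Step] = []
    scheduled: List[Step] = []
    for s in steps:
        # A step is treated as breaking if it drops a column or touches a
        # trigger for a column that is dropped elsewhere in the same plan.
        breaking = s.kind == "drop_column" or (s.table, s.column) in dropped
        (scheduled if breaking else immediate).append(s)
    return immediate, scheduled


steps = [
    Step("alter_trigger", "Account", "LegacyEmail"),
    Step("drop_column", "Account", "LegacyEmail"),
    Step("alter_trigger", "User", "DisplayName"),
]
immediate, scheduled = plan(steps)
```

In this example the trigger change and the drop for Account.LegacyEmail both land in the scheduled migration, in their original order, while the unrelated trigger change on User still runs immediately.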

Posted Jan 25, 2022 - 13:13 CET

Resolved

This incident has been resolved.
Posted Jan 11, 2022 - 14:53 CET

Update

We are continuing to monitor for any further issues.
Posted Jan 11, 2022 - 14:14 CET

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Jan 11, 2022 - 14:09 CET

Identified

The issue has been identified and a fix is being implemented.
Posted Jan 11, 2022 - 13:41 CET

Investigating

We are experiencing a higher rate of errors on some API calls. We are investigating the root cause.
Posted Jan 11, 2022 - 13:32 CET
This incident affected: Mews Operations, Mews Guest Experience, Mews Business Intelligence, Mews Payments, Mews Open API, and Mews Marketplace.