Problem
On 27 Aug 2025 at 06:45 UTC, following a configuration change to the main database, the POS production environment became unresponsive. It returned “502 Bad Gateway” errors, preventing all active users from accessing the product.
Action
The team reverted the database configuration change, rebooted the database without failover, and replaced server workers to restore service.
Causes
The outage was triggered by a database parameter change to enable cross-cloud replication, followed by a reboot with failover.
Solutions
Application resilience will be improved through enhanced monitoring, better handling of dropped database connections, and faster server worker restarts.
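
As an illustration of what better handling of dropped database connections could look like, the sketch below retries an operation on a fresh connection with exponential backoff. It is a minimal example under stated assumptions, not the team's actual implementation: the names run_with_reconnect, connect, and operation are hypothetical, and Python's built-in ConnectionError stands in for whatever exception the real database driver raises when a connection is lost.

```python
import time
from typing import Callable, TypeVar

C = TypeVar("C")  # connection type (hypothetical; depends on the real driver)
T = TypeVar("T")  # result type of the operation


def run_with_reconnect(
    connect: Callable[[], C],
    operation: Callable[[C], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation(conn), opening a fresh connection after each drop.

    Assumption: a dropped connection surfaces as ConnectionError. A real
    driver raises its own exception type, which would replace it here.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            conn = connect()  # never reuse a connection that may be dead
            return operation(conn)
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the caller (or monitoring) see it
            # Back off exponentially: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Opening a new connection on every attempt, rather than reusing the old one, is the point of the sketch: after a database reboot or failover, existing connections are typically no longer usable, so a worker that keeps using them fails until it is restarted. The sleep parameter is injectable only so the backoff can be exercised without real waiting.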