Problem
Production apps were showing only a blank page for 20 minutes.
Action
Manual rollback of frontend applications to version from previous day resolved the down state of all apps.
Causes
Changes in syntax definition of deployment pipeline caused undefined value of mandatory variable. This resulted in corrupted deployment of frontend, where all frontend files were missing. No JS files, no CSS files were available, only blank page was shown in browser.
- Syntax of pipeline definition was not properly tested before production run.
- Deployment pipeline CI run finished and was marked as successful, even though newly deployed version was corrupted.
- Automated monitoring of production apps didn't raised any alert. Site loaded successfully but with no content in it.
Solutions
- New Sandbox environment was created for developers. Now developers can properly debug and test changes in deployment pipelines before applying those changes in production pipelines.
- We will add additional sanity checks of the output from build pipeline to prevent deployment of a corrupted version.
- We will improve monitoring of production apps to immediately raise an alert in case of a blank page.