Incident Period: Friday, 7th March 2025, 9:25 AM (GMT) - Sunday, 9th March 2025, 3:00 PM (GMT)
Issue: Various checks in the UAE, including Onfido checks, were not running. This caused many applications to be stuck in the "Waiting for Checks" stage.
Root Cause: On Sunday, our engineering team discovered that the service responsible for running the checks was not correctly deployed in the latest deployment to the UAE.
Resolution: An emergency deployment was carried out in the UAE at 3:00 PM on Sunday, 9th March 2025. Following this deployment, all checks in the UAE resumed normal operation.
During the investigation, we found a backlog of unprocessed jobs in the UAE, indicating that the integration service was performing no work. Analysis of the most recent UAE deployment, rolled out on Friday, 7th March at 9:22 AM, showed that the new integration workers were not running in that environment.
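The symptom described above (jobs queued, no workers consuming them) is the kind of condition a post-deployment check could flag automatically. The following is a minimal sketch of such a check, not the tooling used during this incident. The `QueueSnapshot` type, the `find_stalled_regions` function, and all values in the usage example are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class QueueSnapshot:
    """Hypothetical point-in-time view of one region's job queue."""
    region: str
    pending_jobs: int
    active_workers: int


def find_stalled_regions(snapshots: Iterable[QueueSnapshot],
                         max_pending: int = 0) -> List[str]:
    """Return regions that have queued jobs but no workers to process them.

    A region is reported when it has zero active workers and more than
    `max_pending` jobs waiting.
    """
    return [
        s.region
        for s in snapshots
        if s.active_workers == 0 and s.pending_jobs > max_pending
    ]


# Illustrative values only; these are not figures from the incident.
example = [
    QueueSnapshot(region="uae", pending_jobs=1200, active_workers=0),
    QueueSnapshot(region="uk", pending_jobs=15, active_workers=4),
]
stalled = find_stalled_regions(example)
```

Running a check like this shortly after each regional deployment, and alerting on a non-empty result, would be one way to surface missing workers sooner.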
We deployed the workers to the UAE environment, and once they were operational the checks began to run. However, because of the large backlog, providers started rate limiting the checks. To minimize the impact, we reduced the number of workers.
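Reducing the worker count is one way to stay under a provider's rate limit while a backlog drains. A related approach is to pace submissions explicitly. The sketch below shows that pacing technique under stated assumptions. It is not the change made during this incident, and `drain_backlog`, `submit`, and `max_per_second` are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Iterable


def drain_backlog(jobs: Iterable[object],
                  submit: Callable[[object], None],
                  max_per_second: float,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> int:
    """Submit queued jobs no faster than `max_per_second`.

    `submit` stands in for whatever call sends one check to a provider.
    `sleep` and `clock` are injectable so the pacing can be tested
    without real waiting. Returns the number of jobs submitted.
    """
    if max_per_second <= 0:
        raise ValueError("max_per_second must be positive")

    min_interval = 1.0 / max_per_second
    processed = 0
    next_allowed = clock()  # earliest time the next job may be sent

    for job in jobs:
        now = clock()
        if now < next_allowed:
            # Wait out the remainder of the interval before sending.
            sleep(next_allowed - now)
        submit(job)
        processed += 1
        next_allowed = max(now, next_allowed) + min_interval

    return processed
```

In practice the permitted rate would come from each provider's documented limits, and a pacing value well below that limit leaves headroom for new checks arriving while the backlog clears.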