Degraded Performance - Core Application

Incident Report for RefAssured

Resolved

After monitoring the service for the last few hours and receiving confirmation from AWS, we consider this incident resolved.

Post-Mortem: Authentication Service Disruption
July 25, 2025

Incident Summary
On July 25, 2025, RefAssured users experienced authentication errors when accessing the platform through Bullhorn. User impact lasted approximately 10 minutes, from 11:30 AM to 11:40 AM PDT, during which affected users were unable to sign in to the service.

Timeline
11:21 AM PDT - AWS Cognito (US-EAST-2) begins experiencing increased API error rates
11:25 AM PDT - AWS engineering team automatically engaged
11:30 AM PDT - RefAssured users begin experiencing authentication issues
11:37 AM PDT - AWS implements mitigation action; recovery begins
11:40 AM PDT - RefAssured authentication errors resolved
12:08 PM PDT - AWS confirms full service recovery across all affected services

What Happened
Our authentication system relies on Amazon Web Services (AWS) to verify user logins. AWS experienced a technical issue after deploying a software update to one of its core systems. The update disrupted AWS authentication services across multiple regions, which prevented RefAssured users from logging in.
AWS quickly identified the problematic update and rolled it back to restore service. While the broader AWS issue took longer to fully resolve, RefAssured users were able to log in normally again within 10 minutes.

Impact
Duration: 10 minutes of user impact (11:30 AM - 11:40 AM PDT)
Scope: Users attempting to log into RefAssured via Bullhorn
Effect: New logins were blocked; users already logged in were not affected

Resolution
AWS rolled back its problematic software update, which restored RefAssured's authentication service by 11:40 AM PDT. No action was required from RefAssured users.

What We're Doing Next
1) Better Monitoring: We're adding extra alerts to detect authentication issues faster
2) Improved Communication: We're reviewing how quickly we communicate during outages
3) Backup Options: We're exploring additional authentication options to reduce reliance on any single provider
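
As an illustration of the kind of alert item 1 describes, here is a minimal sketch of a sliding-window failure-rate check for sign-in attempts. It is not RefAssured's actual monitoring. The class name, the 60-second window, the 20% threshold, and the minimum sample size are all assumptions chosen for the example, and events are assumed to arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AuthErrorMonitor:
    """Sliding-window failure-rate check for sign-in attempts (hypothetical)."""

    window_seconds: float = 60.0  # assumed window length
    threshold: float = 0.2        # assumed failure rate that triggers an alert
    min_attempts: int = 10        # assumed floor, avoids alerting on tiny samples
    # Each entry is (timestamp_in_seconds, succeeded).
    _events: deque = field(default_factory=deque, init=False, repr=False)

    def record(self, timestamp: float, succeeded: bool) -> None:
        """Store one sign-in attempt and drop attempts older than the window."""
        self._events.append((timestamp, succeeded))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Remove attempts that fall outside the window ending at `now`.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            self._events.popleft()

    def failure_rate(self, now: float) -> float:
        """Fraction of failed attempts inside the window (0.0 if no attempts)."""
        self._evict(now)
        if not self._events:
            return 0.0
        failures = sum(1 for _, ok in self._events if not ok)
        return failures / len(self._events)

    def should_alert(self, now: float) -> bool:
        """True when there are enough attempts and too many of them failed."""
        self._evict(now)
        enough = len(self._events) >= self.min_attempts
        return enough and self.failure_rate(now) >= self.threshold


if __name__ == "__main__":
    monitor = AuthErrorMonitor()
    # Simulate 20 attempts, one per second, where every other attempt fails.
    for t in range(20):
        monitor.record(float(t), succeeded=(t % 2 == 0))
    print(monitor.should_alert(now=20.0))
```

A check like this watches RefAssured's own sign-in results, so it would fire on elevated failures regardless of which upstream provider caused them. The threshold and window would need tuning against real traffic.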

Prevention
While this issue was caused by our service provider and outside our direct control, we're taking steps to make our system more resilient and improve how we respond to similar situations in the future.

We sincerely apologize for the inconvenience this caused. If you have questions about this incident, please contact our support team.
Posted Jul 25, 2025 - 22:18 UTC

Monitoring

Issue Summary: Between 11:30 AM and 11:40 AM PDT, some users experienced authentication errors when accessing RefAssured within Bullhorn.

Root Cause: We identified the issue stemmed from our authentication provider, AWS Cognito.

Resolution: The service has been restored to normal operation. We are working with AWS Support to conduct a full root cause analysis and will publish a detailed post-mortem once complete.

Current Status: All systems are operating normally. We are continuing to monitor closely as a precautionary measure.
Posted Jul 25, 2025 - 19:34 UTC

Investigating

We are investigating reports of authentication issues from customers using the Bullhorn integration. The issue appears to have resolved, and we are continuing to investigate the root cause.
Posted Jul 25, 2025 - 18:37 UTC
This incident affected: RefAssured Apps (RefAssured App for Staffing Agencies) and Bullhorn Integrations (RefAssured for Bullhorn (ATS/CRM), RefAssured for Bullhorn Talent Platform (OTE)).