Flaky Tests
Flaky tests are tests that produce inconsistent results when run multiple times without any code changes. These tests can pass in one run and fail in another, indicating potential issues with:
- Race conditions
- Timing dependencies
- Environmental factors
- Resource constraints
- Async operations
How Flaky Tests are Detected
The system identifies flaky tests by analyzing test results within a single test run. A test is considered flaky if:
- It has multiple execution attempts with different statuses
- The final status differs from previous attempts
For example, consider a test with two attempts in the same run:
- First attempt: FAILED
- Second attempt: PASSED
This test is identified as flaky because the statuses differ.
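The detection rule above can be sketched as a small function. This is a minimal illustration, not the system's actual implementation: the `AttemptStatus` values and the `isFlaky` name are assumptions made for this example.

```typescript
// Hypothetical status values; the real reporter may use different names.
type AttemptStatus = "PASSED" | "FAILED" | "SKIPPED" | "TIMED_OUT";

// A test is treated as flaky when it has more than one attempt within a
// single test run and the final status differs from an earlier attempt.
function isFlaky(attempts: AttemptStatus[]): boolean {
  if (attempts.length < 2) return false; // a single attempt cannot be flaky
  const finalStatus = attempts[attempts.length - 1];
  return attempts.slice(0, -1).some((status) => status !== finalStatus);
}

console.log(isFlaky(["FAILED", "PASSED"])); // true
console.log(isFlaky(["FAILED", "FAILED"])); // false
console.log(isFlaky(["PASSED"]));           // false
```

A test that fails on every attempt is not flaky under this rule. It is a consistent failure.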
Flaky Test Dashboard
The flaky tests dashboard provides a comprehensive view of all flaky tests in your test suite. For each flaky test, you can see:
- Test name and suite
- Number of flake occurrences
- First and latest flake timestamps
- Affected projects
- Current ownership status
- Resolution status
Threshold Indicators
Tests are marked with different indicators based on their flakiness level:
🟢 Green Check: Test is within the configured flaky test threshold or has been resolved
🟡 Yellow Warning: Test exceeds the configured flaky test threshold, but by no more than twice
🔴 Red Warning: Test exceeds twice the configured flaky test threshold
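The mapping from flakiness level to indicator could look roughly like the sketch below. The unit of the threshold is not specified here, so `flakeLevel` and `threshold` are assumed to share one unit (for example, flake occurrences in the selected timeframe). The function name and the handling of values exactly at a boundary are also assumptions.

```typescript
type Indicator = "green" | "yellow" | "red";

// `flakeLevel` and `threshold` must use the same unit (an assumption here:
// flake occurrences within the selected timeframe).
function indicatorFor(
  flakeLevel: number,
  threshold: number,
  resolved: boolean,
): Indicator {
  // Resolved tests show green regardless of their past flakiness.
  if (resolved || flakeLevel <= threshold) return "green";
  // Above the threshold, but not more than twice the threshold.
  if (flakeLevel <= 2 * threshold) return "yellow";
  // More than twice the threshold.
  return "red";
}

console.log(indicatorFor(3, 5, false));  // green
console.log(indicatorFor(8, 5, false));  // yellow
console.log(indicatorFor(11, 5, false)); // red
console.log(indicatorFor(11, 5, true));  // green
```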
Managing Flaky Tests
You can manage flaky tests through several actions:
Ownership
- Claim ownership of a flaky test
- Assign ownership to team members
- Release ownership when needed
Resolution
- Mark tests as resolved when fixed
- Resolved tests are automatically marked as unresolved if they flake again
- Track resolution history
Filtering
View flaky tests within different timeframes:
- All Time
- Today
- This Week
- Last 2 Weeks
- This Month
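One way to think about these timeframes is as a start timestamp applied to each flake occurrence. The sketch below is illustrative only. It assumes UTC, a week starting on Monday, and "Last 2 Weeks" meaning a rolling 14 days; the dashboard may define these boundaries differently. The `Timeframe` and `timeframeStart` names are hypothetical.

```typescript
type Timeframe = "all_time" | "today" | "this_week" | "last_2_weeks" | "this_month";

// Returns the earliest timestamp included in the timeframe,
// or null when no lower bound applies.
function timeframeStart(timeframe: Timeframe, now: Date): Date | null {
  const y = now.getUTCFullYear();
  const m = now.getUTCMonth();
  const d = now.getUTCDate();
  switch (timeframe) {
    case "all_time":
      return null;
    case "today":
      return new Date(Date.UTC(y, m, d));
    case "this_week": {
      // getUTCDay() is 0 for Sunday; shift so that Monday counts as 0.
      const daysSinceMonday = (now.getUTCDay() + 6) % 7;
      return new Date(Date.UTC(y, m, d - daysSinceMonday));
    }
    case "last_2_weeks":
      return new Date(now.getTime() - 14 * 24 * 60 * 60 * 1000);
    case "this_month":
      return new Date(Date.UTC(y, m, 1));
  }
}

// Keeps only the flake timestamps that fall inside the timeframe.
function filterFlakes(flakes: Date[], timeframe: Timeframe, now: Date): Date[] {
  const start = timeframeStart(timeframe, now);
  return flakes.filter((at) => start === null || at.getTime() >= start.getTime());
}
```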
Test Case Activities
The system tracks flaky test occurrences as test case activities. Each time a test exhibits flaky behavior:
- A test run activity is created with the flaky flag
- If the test was previously marked as resolved, it will automatically be marked as unresolved
- A resolution change activity is created to document the change
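The sequence above can be modeled as a small state transition. This sketch is an assumption about the shape of the data, not the system's actual schema: the `TestCase` and `Activity` types and the `recordFlake` function are hypothetical names used for illustration.

```typescript
interface TestCase {
  id: string;
  resolved: boolean;
}

type Activity =
  | { kind: "test_run"; testCaseId: string; flaky: true; at: Date }
  | { kind: "resolution_change"; testCaseId: string; from: "resolved"; to: "unresolved"; at: Date };

// Records one flaky occurrence and returns the updated test case together
// with the activities that the occurrence produces.
function recordFlake(
  testCase: TestCase,
  at: Date,
): { testCase: TestCase; activities: Activity[] } {
  // Every flaky occurrence creates a test run activity with the flaky flag.
  const activities: Activity[] = [
    { kind: "test_run", testCaseId: testCase.id, flaky: true, at },
  ];
  if (!testCase.resolved) return { testCase, activities };

  // A previously resolved test is reopened, and the change is documented.
  activities.push({
    kind: "resolution_change",
    testCaseId: testCase.id,
    from: "resolved",
    to: "unresolved",
    at,
  });
  return { testCase: { ...testCase, resolved: false }, activities };
}
```

An unresolved test produces only the test run activity, while a resolved test produces both activities and comes back unresolved.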
Integration with CI/CD
The TestResult reporter automatically detects and reports flaky tests during your CI/CD pipeline. To enable flaky test detection:
- Install the TestResult reporter
- Configure your Playwright tests to use retries
Retries are required because flakiness is detected by comparing attempts within a single test run. Once both steps are done, the reporter tracks test results and identifies flaky behavior automatically.
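As a rough illustration of the retry step, a Playwright configuration might look like the fragment below. `retries` and `reporter` are standard Playwright options. The retry count of 2 is an arbitrary choice for this example, and the TestResult reporter entry is left as a comment because its package name and options are not given in this section.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retry failed tests on CI so that a fail-then-pass sequence can be
  // observed within a single run. The value 2 is only an example.
  retries: process.env.CI ? 2 : 0,
  reporter: [
    ['list'],
    // Add the TestResult reporter entry here, using the package name and
    // options from its installation instructions.
  ],
});
```

With retries set to 0, each test has only one attempt per run, so there are no differing statuses to compare.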
Best Practices
Set Appropriate Thresholds
- Configure flaky test thresholds based on your team's quality standards
- Monitor trends to adjust thresholds as needed
Quick Response
- Assign ownership promptly when new flaky tests are detected
- Investigate and fix flaky tests before they impact team productivity
Documentation
- Document patterns that led to flaky tests
- Share fixes and preventive measures with the team
Regular Review
- Schedule regular reviews of flaky tests
- Track progress on resolution
- Identify common patterns or areas needing architectural improvements