Flaky Tests

Flaky tests are tests that produce inconsistent results when run multiple times without any code changes. These tests can pass in one run and fail in another, indicating potential issues with:

  • Race conditions
  • Timing dependencies
  • Environmental factors
  • Resource constraints
  • Async operations

How Flaky Tests are Detected

The system identifies flaky tests by analyzing test results within a single test run. A test is considered flaky if:

  1. It has multiple execution attempts with different statuses
  2. The final status differs from previous attempts

For example, consider a test that runs twice within the same test run:

  • First attempt: FAILED
  • Second attempt: PASSED

This test is identified as flaky because the final status differs from the status of the earlier attempt.
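
The two detection criteria can be sketched in TypeScript as follows. This is an illustration only: the `TestAttempt` type, its `retry` field, and the `isFlaky` function are hypothetical names chosen for this example, not the reporter's actual data model or API.

```typescript
// Hypothetical types: the status names mirror the example above, but the
// data model the system really uses is not documented here.
type AttemptStatus = "PASSED" | "FAILED";

interface TestAttempt {
  retry: number; // 0 for the first attempt, 1 for the first retry, and so on
  status: AttemptStatus;
}

// Returns true when the attempts of one test within a single run include
// more than one attempt and the final status differs from at least one
// earlier attempt (which also means the statuses are not all the same).
function isFlaky(attempts: TestAttempt[]): boolean {
  if (attempts.length < 2) return false;
  const ordered = [...attempts].sort((a, b) => a.retry - b.retry);
  const last = ordered[ordered.length - 1];
  if (!last) return false;
  return ordered.slice(0, -1).some((a) => a.status !== last.status);
}
```

Under these assumptions, a FAILED attempt followed by a PASSED attempt is flagged as flaky, while a test that fails on every attempt is treated as a consistent failure rather than a flake.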

Flaky Test Dashboard

The flaky tests dashboard provides a comprehensive view of all flaky tests in your test suite. For each flaky test, you can see:

  • Test name and suite
  • Number of flake occurrences
  • First and latest flake timestamps
  • Affected projects
  • Current ownership status
  • Resolution status

Threshold Indicators

Tests are marked with different indicators based on their flakiness level:

🟢 Green Check: Test is within acceptable flakiness thresholds or has been resolved

🟡 Yellow Warning: Test's flakiness exceeds the configured flaky test threshold, but is no more than twice that threshold

🔴 Red Warning: Test's flakiness is more than twice the acceptable threshold
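
The indicator rules above can be expressed as a small function. This is a sketch under assumptions: `indicatorFor`, `flakeRate`, and `threshold` are hypothetical names, and the unit in which the product measures flakiness is not stated here, so the example only assumes that both values share the same unit.

```typescript
type Indicator = "green" | "yellow" | "red";

// flakeRate and threshold are assumed to use the same unit (for example,
// the fraction of runs in which the test flaked).
function indicatorFor(flakeRate: number, threshold: number, resolved: boolean): Indicator {
  // Resolved tests and tests within the threshold get the green check.
  if (resolved || flakeRate <= threshold) return "green";
  // More than twice the threshold gets the red warning.
  if (flakeRate > 2 * threshold) return "red";
  // Anything in between gets the yellow warning.
  return "yellow";
}
```

With a threshold of 0.05, a rate of 0.08 maps to yellow and a rate of 0.12 maps to red, unless the test has been marked as resolved.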

Managing Flaky Tests

You can manage flaky tests through several actions:

  1. Ownership

    • Claim ownership of a flaky test
    • Assign ownership to team members
    • Release ownership when needed
  2. Resolution

    • Mark tests as resolved when fixed
    • Resolved tests are automatically marked as unresolved if they flake again
    • Track resolution history
  3. Filtering

    View flaky tests within different timeframes:

    • All Time
    • Today
    • This Week
    • Last 2 Weeks
    • This Month

Test Case Activities

The system tracks flaky test occurrences as test case activities. Each time a test exhibits flaky behavior:

  1. A test run activity is created with the flaky flag
  2. If the test was previously marked as resolved, it will automatically be marked as unresolved
  3. A resolution change activity is created to document the change
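
The sequence above can be modeled roughly as follows. The `Activity` and `TestCaseState` shapes and the `recordFlake` function are hypothetical and exist only to illustrate the order of events; the real activity schema is not documented here.

```typescript
// Hypothetical activity shapes used only for this illustration.
type Activity =
  | { kind: "test_run"; testId: string; flaky: boolean }
  | { kind: "resolution_change"; testId: string; from: "resolved"; to: "unresolved" };

interface TestCaseState {
  testId: string;
  resolved: boolean;
}

// Records one flaky occurrence and returns the updated state together with
// the activities that the occurrence produces.
function recordFlake(state: TestCaseState): { state: TestCaseState; activities: Activity[] } {
  // Step 1: a test run activity carrying the flaky flag.
  const activities: Activity[] = [{ kind: "test_run", testId: state.testId, flaky: true }];
  if (!state.resolved) return { state, activities };
  // Steps 2 and 3: reopen a previously resolved test and document the change.
  activities.push({ kind: "resolution_change", testId: state.testId, from: "resolved", to: "unresolved" });
  return { state: { ...state, resolved: false }, activities };
}
```

In this model, a flake on an unresolved test yields a single test run activity, while a flake on a resolved test yields a test run activity followed by a resolution change activity.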

Integration with CI/CD

The TestResult reporter automatically detects and reports flaky tests during your CI/CD pipeline. To enable flaky test detection:

  1. Install the TestResult reporter
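  2. Configure your Playwright tests to use retries, so that a failing test gets more than one attempt within the same run
  3. Run your tests in the pipeline; the reporter automatically tracks test results and identifies flaky behavior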
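
A minimal Playwright configuration for these steps might look like the sketch below. The `retries` and `reporter` options are standard Playwright settings, but the reporter entry is a placeholder: the actual package name or path, and any options it accepts, should be taken from the TestResult reporter's installation instructions. The retry count of 2 is likewise only an example value.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retries give a failing test further attempts within the same run,
  // which is what makes a FAILED-then-PASSED sequence observable.
  retries: process.env.CI ? 2 : 0,
  reporter: [
    ['list'],
    // Placeholder, not a real package name: replace it with the value
    // given in the TestResult reporter's installation instructions.
    ['<testresult-reporter>'],
  ],
});
```

Enabling retries only when the `CI` environment variable is set keeps local runs fast while still giving the pipeline enough attempts to surface flaky behavior.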

Best Practices

  1. Set Appropriate Thresholds

    • Configure flaky test thresholds based on your team's quality standards
    • Monitor trends to adjust thresholds as needed
  2. Quick Response

    • Assign ownership promptly when new flaky tests are detected
    • Investigate and fix flaky tests before they impact team productivity
  3. Documentation

    • Document patterns that led to flaky tests
    • Share fixes and preventive measures with the team
  4. Regular Review

    • Schedule regular reviews of flaky tests
    • Track progress on resolution
    • Identify common patterns or areas needing architectural improvements
