Skip to content

Conversation

Copy link

Copilot AI commented Nov 23, 2025

Two concurrency bugs could prevent circuits from opening correctly or allow thundering herds during recovery.

Issues

EWMABreaker race condition: Concurrent observations read raw values (0.0/1.0) instead of the actual EWMA due to non-atomic read-modify-write:

// Before: Thread B reads Thread A's raw swap value
failureRate := fromStore(e.failureRate.Swap(toStore(value)))
// ...compute EWMA...
e.failureRate.Store(toStore(failureRate))  // Too late, corruption occurred

Half-open state: Multiple goroutines could simultaneously transition from half-open and all pass through, violating the "~1 call" limit.

Changes

  • EWMABreaker.observe(): Replace swap-compute-store with CAS loop to atomically update EWMA based on actual previous value
  • Circuit.stateForCall(): Use CAS to ensure only one goroutine wins half-open transition; losers re-check state in loop
  • race_test.go: Add concurrency tests validating EWMA correctness and half-open limiting under load
// After: Atomic read-modify-write via CAS loop
for {
    oldBits := e.failureRate.Load()
    oldRate := fromStore(oldBits)
    failureRate = (value * e.decay) + (oldRate * (1 - e.decay))
    if e.failureRate.CompareAndSwap(oldBits, toStore(failureRate)) {
        break
    }
}

Verified with race detector and stress tests showing consistent max 1 concurrent call in half-open state.

Original prompt

This repository imolements a golang circuit breaker. Can you identify any issues in the code base? Limit yourself to issues that would cause a significant breach of expectations, e.g.: circuits not opening on errors, concurrency causing unrelated calls to fail, etc.


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Copilot AI self-assigned this Nov 23, 2025
Copilot AI and others added 3 commits November 23, 2025 10:47
- Fix EWMA breaker race condition using CAS loop for atomic EWMA updates
- Fix half-open state to properly limit concurrent calls to ~1
- Add comprehensive concurrency tests

Co-authored-by: costela <[email protected]>
- Replace recursion with loop in stateForCall to avoid potential stack overflow
- Clarify comments about window swap timing

Co-authored-by: costela <[email protected]>
- Fix percentage symbol in test assertion message
- Improve comment clarity in window swap logic

Co-authored-by: costela <[email protected]>
Copilot AI changed the title [WIP] Identify significant issues in circuit breaker implementation Fix race conditions in EWMA calculation and half-open state transitions Nov 23, 2025
Copilot finished work on behalf of costela November 23, 2025 11:04
Copilot AI requested a review from costela November 23, 2025 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants