Skip to content

feat(firmware): retry LAN chip-info probe during startup transients (closes #144)#199

Merged
cptkoolbeenz merged 3 commits into
feat/wifi-update-check-statusfrom
feat/wifi-chip-info-retry
May 12, 2026
Merged

feat(firmware): retry LAN chip-info probe during startup transients (closes #144)#199
cptkoolbeenz merged 3 commits into
feat/wifi-update-check-statusfrom
feat/wifi-chip-info-retry

Conversation

@cptkoolbeenz
Copy link
Copy Markdown
Member

Summary

Adds bounded retry policy for the LAN chip-info probe so post-PIC32-reboot WiFi-subsystem startup transients don't force an unnecessary multi-minute reflash of already-current WiFi firmware.

Right after a PIC32 update the application is up while the WiFi subsystem is still finishing startup — the first GetLanChipInfo query can transiently fail. Without retry, the WiFi version decision short-circuits to ChipInfoUnavailable and Core flows on to flash. Desktop's existing workaround was a bounded retry loop in the app; this PR moves that policy into Core where the WiFi-update decision lives.

API additions

Two new FirmwareUpdateServiceOptions properties:

  • LanChipInfoMaxAttempts (default 3) — total tries before giving up
  • LanChipInfoRetryDelay (default 2s) — wait between attempts

Worst-case wait = (MaxAttempts - 1) × RetryDelay = 4s by default, which fits the observed startup window without slowing steady state. The retry loop observes cancellation between attempts.

Wired into CheckWifiFirmwareStatusCoreAsync so both the legacy IsWifiFirmwareUpToDateAsync hit-path AND the new (PR #198) CheckWifiFirmwareStatusAsync planning method get the retry behavior automatically.

Stacked on PR #198

This branch is built on top of feat/wifi-update-check-status — it depends on the WifiFirmwareStatus shape introduced there. Needs PR #198 to merge first; this PR's diff against main shows only the incremental changes once #198 lands.

Test plan

  • 3 new tests cover transient-failure recovery within budget, exhausted-budget falls through to ChipInfoUnavailable, steady-state success doesn't trigger retry overhead
  • Extended FakeLanChipInfoStreamingDevice with transientFailuresBeforeSuccess + GetLanChipInfoCallCount instrumentation
  • All 15 LanChipInfo-related tests pass
  • Full suite: 898/900 (2 skipped require live hardware)
  • Build clean on net9.0 + net10.0

Closes #144

…loses #144)

Right after a PIC32 firmware update the application is up while the
WiFi subsystem is still finishing startup, so the first GetLanChipInfo
query can transiently fail. Without retry, the WiFi version decision
short-circuits to ChipInfoUnavailable and the desktop layer ends up
running an unnecessary multi-minute reflash of already-current
firmware. Desktop's workaround was a bounded retry loop in the app —
this PR moves that policy into Core where the WiFi-update decision
itself lives.

Adds two FirmwareUpdateServiceOptions properties:
- LanChipInfoMaxAttempts (default 3) — total tries before giving up
- LanChipInfoRetryDelay (default 2s) — wait between attempts

Worst-case wait = (MaxAttempts - 1) * RetryDelay = 4s by default,
which fits the observed startup window without slowing steady state.
The retry loop observes cancellation between attempts.

Wired into CheckWifiFirmwareStatusCoreAsync so both the legacy
IsWifiFirmwareUpToDateAsync hit-path AND the new (PR #198)
CheckWifiFirmwareStatusAsync planning method get the retry behavior
for free.

Test plan: 3 new tests cover (a) transient failure recovery within
budget, (b) exhausted budget falls through to ChipInfoUnavailable,
(c) steady-state success doesn't trigger retry overhead. Extended
FakeLanChipInfoStreamingDevice with transientFailuresBeforeSuccess +
GetLanChipInfoCallCount instrumentation. 898/900 pass (2 skipped
require live hardware).

Stacks on PR #198 — needs to land first.
@cptkoolbeenz cptkoolbeenz requested a review from a team as a code owner May 12, 2026 04:52
@cptkoolbeenz
Copy link
Copy Markdown
Member Author

/improve

@cptkoolbeenz
Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Add retry policy for LAN chip-info probe during startup transients

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add bounded retry policy for LAN chip-info probe during WiFi startup
• Prevent unnecessary reflash of up-to-date WiFi firmware post-PIC32 reboot
• Introduce LanChipInfoMaxAttempts and LanChipInfoRetryDelay options
• Add comprehensive test coverage for retry behavior and edge cases
Diagram
flowchart LR
  A["CheckWifiFirmwareStatusCoreAsync"] --> B["TryGetLanChipInfoWithRetryAsync"]
  B --> C["Retry Loop"]
  C --> D["GetLanChipInfoAsync"]
  D --> E{Success?}
  E -->|Yes| F["Return ChipInfo"]
  E -->|No| G{Attempts Left?}
  G -->|Yes| H["Delay & Retry"]
  H --> D
  G -->|No| I["Return null"]
  I --> J["ChipInfoUnavailable"]
  F --> K["Continue Status Check"]
Loading

Grey Divider

File Changes

1. src/Daqifi.Core/Firmware/FirmwareUpdateServiceOptions.cs ⚙️ Configuration changes +28/-0

Add LAN chip-info retry configuration options

• Add LanChipInfoMaxAttempts property (default 3) for total retry attempts
• Add LanChipInfoRetryDelay property (default 2 seconds) for delay between attempts
• Add validation for both new properties in Validate() method
• Document retry behavior and worst-case wait time (4 seconds default)

src/Daqifi.Core/Firmware/FirmwareUpdateServiceOptions.cs


2. src/Daqifi.Core/Firmware/FirmwareUpdateService.cs ✨ Enhancement +60/-15

Implement retry logic for LAN chip-info queries

• Extract retry logic into new TryGetLanChipInfoWithRetryAsync() method
• Implement bounded retry loop with configurable attempts and delays
• Observe cancellation between retry attempts
• Add detailed debug logging for each attempt and final exhaustion
• Replace inline try-catch with retry-aware method call

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs


3. src/Daqifi.Core.Tests/Firmware/FirmwareUpdateServiceTests.cs 🧪 Tests +129/-1

Add comprehensive retry behavior test coverage

• Add test for transient failure recovery within retry budget
• Add test for exhausted retry budget falling through to ChipInfoUnavailable
• Add test for steady-state success without retry overhead
• Extend FakeLanChipInfoStreamingDevice with transientFailuresBeforeSuccess parameter
• Add GetLanChipInfoCallCount instrumentation to track call attempts

src/Daqifi.Core.Tests/Firmware/FirmwareUpdateServiceTests.cs


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented May 12, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Context used

Grey Divider


Remediation recommended

1. Retry exceeds expected duration ✓ Resolved 🐞 Bug ➹ Performance
Description
The new LAN chip-info retry loop can block far longer than the documented
“(MaxAttempts-1)×RetryDelay” because each attempt can spend up to its own chip-info query timeout
while holding _operationLock. With current defaults on DaqifiStreamingDevice (2s query timeout),
a timeout-heavy path can take ~10s (3×2s attempts + 2×2s delays), not ~4s, delaying update flows and
UI status probes.
Code

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[R614-652]

+        var maxAttempts = Math.Max(1, _options.LanChipInfoMaxAttempts);
+        var retryDelay = _options.LanChipInfoRetryDelay;
+
+        for (var attempt = 1; attempt <= maxAttempts; attempt++)
+        {
+            cancellationToken.ThrowIfCancellationRequested();
+
+            try
+            {
+                var chipInfo = await lanChipInfoProvider.GetLanChipInfoAsync(cancellationToken).ConfigureAwait(false);
+                if (chipInfo != null)
+                {
+                    if (attempt > 1)
+                    {
+                        _logger.LogDebug(
+                            "LAN chip-info query succeeded on attempt {Attempt}/{Max}.",
+                            attempt,
+                            maxAttempts);
+                    }
+                    return chipInfo;
+                }
+                _logger.LogDebug(
+                    "LAN chip-info query returned null on attempt {Attempt}/{Max}.",
+                    attempt,
+                    maxAttempts);
+            }
+            catch (Exception ex) when (ex is not OperationCanceledException)
+            {
+                _logger.LogDebug(
+                    ex,
+                    "LAN chip-info query failed on attempt {Attempt}/{Max}.",
+                    attempt,
+                    maxAttempts);
+            }
+
+            if (attempt < maxAttempts)
+            {
+                await Task.Delay(retryDelay, cancellationToken).ConfigureAwait(false);
+            }
Evidence
The retry loop delays between attempts while holding _operationLock, so total blocking time
includes both the configured delay and each attempt’s own query time/timeout; the default device
implementation uses a 2s response timeout per attempt, making the real worst-case substantially
larger than the options’ “worst-case wait” description.

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[195-232]
src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[539-547]
src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[610-652]
src/Daqifi.Core/Firmware/FirmwareUpdateServiceOptions.cs[122-138]
src/Daqifi.Core/Device/DaqifiStreamingDevice.cs[842-856]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`TryGetLanChipInfoWithRetryAsync` bounds only the *inter-attempt delay*; the overall wall-clock time is also dominated by per-attempt `GetLanChipInfoAsync` latency/timeouts. With the default `DaqifiStreamingDevice` implementation using a 2000ms response timeout, the default retry settings can block for ~10s, contradicting the current options documentation and potentially stalling firmware flows while `_operationLock` is held.

### Issue Context
- `CheckWifiFirmwareStatusAsync` acquires `_operationLock` for the entire status probe, including retry delays.
- `DaqifiStreamingDevice.GetLanChipInfoAsync` uses a fixed `responseTimeoutMs: 2000` per attempt.
- Options XML docs currently describe worst-case wait as `(LanChipInfoMaxAttempts - 1) * LanChipInfoRetryDelay`, which excludes per-attempt timeout/latency.

### Fix Focus Areas
- src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[195-232]
- src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[610-659]
- src/Daqifi.Core/Firmware/FirmwareUpdateServiceOptions.cs[122-138]
- src/Daqifi.Core/Device/DaqifiStreamingDevice.cs[842-856]

### Implementation guidance
Choose one (or combine):
1) **Enforce a true overall time budget** for the chip-info probe (e.g., new option like `LanChipInfoTotalTimeout`), implemented via a linked `CancellationTokenSource.CancelAfter(...)` around the whole retry loop (or `Stopwatch` to avoid sleeping past the remaining budget).
2) **Make per-attempt chip-info timeout configurable** (or lower it specifically for chip-info probing) so defaults align with the intended startup window.
3) **Clarify documentation**: explicitly state the configured wait is *delay-only* and total latency is `sum(attempt durations) + sum(delays)`, and update the default “worst-case” example accordingly (e.g., include the 2s per-attempt timeout for the default device implementation).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Previous review results

Review updated until commit 01fd011

Results up to commit 215cd84


🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)


Remediation recommended
1. Retry exceeds expected duration ✓ Resolved 🐞 Bug ➹ Performance
Description
The new LAN chip-info retry loop can block far longer than the documented
“(MaxAttempts-1)×RetryDelay” because each attempt can spend up to its own chip-info query timeout
while holding _operationLock. With current defaults on DaqifiStreamingDevice (2s query timeout),
a timeout-heavy path can take ~10s (3×2s attempts + 2×2s delays), not ~4s, delaying update flows and
UI status probes.
Code

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[R614-652]

+        var maxAttempts = Math.Max(1, _options.LanChipInfoMaxAttempts);
+        var retryDelay = _options.LanChipInfoRetryDelay;
+
+        for (var attempt = 1; attempt <= maxAttempts; attempt++)
+        {
+            cancellationToken.ThrowIfCancellationRequested();
+
+            try
+            {
+                var chipInfo = await lanChipInfoProvider.GetLanChipInfoAsync(cancellationToken).ConfigureAwait(false);
+                if (chipInfo != null)
+                {
+                    if (attempt > 1)
+                    {
+                        _logger.LogDebug(
+                            "LAN chip-info query succeeded on attempt {Attempt}/{Max}.",
+                            attempt,
+                            maxAttempts);
+                    }
+                    return chipInfo;
+                }
+                _logger.LogDebug(
+                    "LAN chip-info query returned null on attempt {Attempt}/{Max}.",
+                    attempt,
+                    maxAttempts);
+            }
+            catch (Exception ex) when (ex is not OperationCanceledException)
+            {
+                _logger.LogDebug(
+                    ex,
+                    "LAN chip-info query failed on attempt {Attempt}/{Max}.",
+                    attempt,
+                    maxAttempts);
+            }
+
+            if (attempt < maxAttempts)
+            {
+                await Task.Delay(retryDelay, cancellationToken).ConfigureAwait(false);
+            }
Evidence
The retry loop delays between attempts while holding _operationLock, so total blocking time
includes both the configured delay and each attempt’s own query time/timeout; the default device
implementation uses a 2s response timeout per attempt, making the real worst-case substantially
larger than the options’ “worst-case wait” description.

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[195-232]
src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[539-547]
src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[610-652]
src/Daqifi.Core/Firmware/FirmwareUpdateServiceOptions.cs[122-138]
src/Daqifi.Core/Device/DaqifiStreamingDevice.cs[842-856]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`TryGetLanChipInfoWithRetryAsync` bounds only the *inter-attempt delay*; the overall wall-clock time is also dominated by per-attempt `GetLanChipInfoAsync` latency/timeouts. With the default `DaqifiStreamingDevice` implementation using a 2000ms response timeout, the default retry settings can block for ~10s, contradicting the current options documentation and potentially stalling firmware flows while `_operationLock` is held.

### Issue Context
- `CheckWifiFirmwareStatusAsync` acquires `_operationLock` for the entire status probe, including retry delays.
- `DaqifiStreamingDevice.GetLanChipInfoAsync` uses a fixed `responseTimeoutMs: 2000` per attempt.
- Options XML docs currently describe worst-case wait as `(LanChipInfoMaxAttempts - 1) * LanChipInfoRetryDelay`, which excludes per-attempt timeout/latency.

### Fix Focus Areas
- src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[195-232]
- src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[610-659]
- src/Daqifi.Core/Firmware/FirmwareUpdateServiceOptions.cs[122-138]
- src/Daqifi.Core/Device/DaqifiStreamingDevice.cs[842-856]

### Implementation guidance
Choose one (or combine):
1) **Enforce a true overall time budget** for the chip-info probe (e.g., new option like `LanChipInfoTotalTimeout`), implemented via a linked `CancellationTokenSource.CancelAfter(...)` around the whole retry loop (or `Stopwatch` to avoid sleeping past the remaining budget).
2) **Make per-attempt chip-info timeout configurable** (or lower it specifically for chip-info probing) so defaults align with the intended startup window.
3) **Clarify documentation**: explicitly state the configured wait is *delay-only* and total latency is `sum(attempt durations) + sum(delays)`, and update the default “worst-case” example accordingly (e.g., include the 2s per-attempt timeout for the default device implementation).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented May 12, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Remediation recommended

1. Retry exceeds expected duration ✓ Resolved 🐞 Bug ➹ Performance
Description
The new LAN chip-info retry loop can block far longer than the documented
“(MaxAttempts-1)×RetryDelay” because each attempt can spend up to its own chip-info query timeout
while holding _operationLock. With current defaults on DaqifiStreamingDevice (2s query timeout),
a timeout-heavy path can take ~10s (3×2s attempts + 2×2s delays), not ~4s, delaying update flows and
UI status probes.
Code

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[R614-652]

+        var maxAttempts = Math.Max(1, _options.LanChipInfoMaxAttempts);
+        var retryDelay = _options.LanChipInfoRetryDelay;
+
+        for (var attempt = 1; attempt <= maxAttempts; attempt++)
+        {
+            cancellationToken.ThrowIfCancellationRequested();
+
+            try
+            {
+                var chipInfo = await lanChipInfoProvider.GetLanChipInfoAsync(cancellationToken).ConfigureAwait(false);
+                if (chipInfo != null)
+                {
+                    if (attempt > 1)
+                    {
+                        _logger.LogDebug(
+                            "LAN chip-info query succeeded on attempt {Attempt}/{Max}.",
+                            attempt,
+                            maxAttempts);
+                    }
+                    return chipInfo;
+                }
+                _logger.LogDebug(
+                    "LAN chip-info query returned null on attempt {Attempt}/{Max}.",
+                    attempt,
+                    maxAttempts);
+            }
+            catch (Exception ex) when (ex is not OperationCanceledException)
+            {
+                _logger.LogDebug(
+                    ex,
+                    "LAN chip-info query failed on attempt {Attempt}/{Max}.",
+                    attempt,
+                    maxAttempts);
+            }
+
+            if (attempt < maxAttempts)
+            {
+                await Task.Delay(retryDelay, cancellationToken).ConfigureAwait(false);
+            }
Evidence
The retry loop delays between attempts while holding _operationLock, so total blocking time
includes both the configured delay and each attempt’s own query time/timeout; the default device
implementation uses a 2s response timeout per attempt, making the real worst-case substantially
larger than the options’ “worst-case wait” description.

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[195-232]
src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[539-547]
src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[610-652]
src/Daqifi.Core/Firmware/FirmwareUpdateServiceOptions.cs[122-138]
src/Daqifi.Core/Device/DaqifiStreamingDevice.cs[842-856]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`TryGetLanChipInfoWithRetryAsync` bounds only the *inter-attempt delay*; the overall wall-clock time is also dominated by per-attempt `GetLanChipInfoAsync` latency/timeouts. With the default `DaqifiStreamingDevice` implementation using a 2000ms response timeout, the default retry settings can block for ~10s, contradicting the current options documentation and potentially stalling firmware flows while `_operationLock` is held.

### Issue Context
- `CheckWifiFirmwareStatusAsync` acquires `_operationLock` for the entire status probe, including retry delays.
- `DaqifiStreamingDevice.GetLanChipInfoAsync` uses a fixed `responseTimeoutMs: 2000` per attempt.
- Options XML docs currently describe worst-case wait as `(LanChipInfoMaxAttempts - 1) * LanChipInfoRetryDelay`, which excludes per-attempt timeout/latency.

### Fix Focus Areas
- src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[195-232]
- src/Daqifi.Core/Firmware/FirmwareUpdateService.cs[610-659]
- src/Daqifi.Core/Firmware/FirmwareUpdateServiceOptions.cs[122-138]
- src/Daqifi.Core/Device/DaqifiStreamingDevice.cs[842-856]

### Implementation guidance
Choose one (or combine):
1) **Enforce a true overall time budget** for the chip-info probe (e.g., new option like `LanChipInfoTotalTimeout`), implemented via a linked `CancellationTokenSource.CancelAfter(...)` around the whole retry loop (or `Stopwatch` to avoid sleeping past the remaining budget).
2) **Make per-attempt chip-info timeout configurable** (or lower it specifically for chip-info probing) so defaults align with the intended startup window.
3) **Clarify documentation**: explicitly state the configured wait is *delay-only* and total latency is `sum(attempt durations) + sum(delays)`, and update the default “worst-case” example accordingly (e.g., include the 2s per-attempt timeout for the default device implementation).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented May 12, 2026

PR Code Suggestions ✨

Latest suggestions up to 01fd011

Warning

/improve is deprecated. Use /agentic_review instead (removed after 2026-05-31).

CategorySuggestion                                                                                                                                    Impact
General
Replace cancellation exception flow

Use direct boolean checks for cancellation tokens instead of relying on a
locally thrown and caught OperationCanceledException.

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs [38-56]

 for (var attempt = 1; attempt <= maxAttempts; attempt++)
 {
-    try
+    if (cancellationToken.IsCancellationRequested)
     {
-        linkedToken.ThrowIfCancellationRequested();
+        cancellationToken.ThrowIfCancellationRequested();
     }
-    catch (OperationCanceledException) when (timeoutCts.IsCancellationRequested && !cancellationToken.IsCancellationRequested)
+
+    if (timeoutCts?.IsCancellationRequested == true)
     {
         _logger.LogDebug(
             "LAN chip-info probe hit total timeout ({Timeout}) before attempt {Attempt}/{Max}.",
             totalTimeout,
             attempt,
             maxAttempts);
         return null;
     }
 
     try
     {
         var chipInfo = await lanChipInfoProvider.GetLanChipInfoAsync(linkedToken).ConfigureAwait(false);
         ...
     }
     ...
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

__

Why: Replacing the exception-driven flow with direct cancellation token checks improves code readability and avoids unnecessary exception throwing and catching logic on the happy path.

Low
Possible issue
Prevent timeout CTS construction failures

Handle infinite or extremely large timeout values defensively when creating the
CancellationTokenSource.

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs [25-36]

 var totalTimeout = _options.LanChipInfoTotalTimeout;
 
 // Wall-clock budget guards against the pathological case where
 // attempt-count × per-attempt-timeout + retry-delay sum vastly
 // exceeds the configured retry budget (e.g., 3 × 2s device timeout
 // + 2 × 2s delay = ~10s while _operationLock is held). Linking
 // the caller's CT preserves cancellation semantics; the timeout
 // CTS just adds a deadline.
-using var timeoutCts = new CancellationTokenSource(totalTimeout);
-using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(
-    cancellationToken, timeoutCts.Token);
-var linkedToken = linkedCts.Token;
+CancellationTokenSource? timeoutCts = null;
+CancellationTokenSource linkedCts;
 
+if (totalTimeout == Timeout.InfiniteTimeSpan)
+{
+    linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
+}
+else
+{
+    timeoutCts = new CancellationTokenSource(totalTimeout);
+    linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, timeoutCts.Token);
+}
+
+using (timeoutCts)
+using (linkedCts)
+{
+    var linkedToken = linkedCts.Token;
+    ...
+}
+

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 3

__

Why: The ValidatePositive check on the options likely prevents infinite or negative timeouts, making this an unlikely edge case, though it remains valid defensive programming.

Low
  • More

Previous suggestions

Suggestions up to commit 34c23d6
CategorySuggestion                                                                                                                                    Impact
General
Simulate async failures accurately

Return a faulted task instead of throwing synchronously to better simulate async
failures.

src/Daqifi.Core.Tests/Firmware/FirmwareUpdateServiceTests.cs [217-226]

 public Task<LanChipInfo?> GetLanChipInfoAsync(CancellationToken cancellationToken = default)
 {
     GetLanChipInfoCallCount++;
     if (_remainingTransientFailures > 0)
     {
         _remainingTransientFailures--;
-        throw new InvalidOperationException("Simulated transient post-reboot failure.");
+        return Task.FromException<LanChipInfo?>(
+            new InvalidOperationException("Simulated transient post-reboot failure."));
     }
     return Task.FromResult(_chipInfo);
 }
Suggestion importance[1-10]: 5

__

Why: Returning a faulted task using Task.FromException accurately simulates the behavior of genuine async methods, making the tests more robust.

Low
Suggestions up to commit 215cd84
CategorySuggestion                                                                                                                                    Impact
General
Cap retries to prevent hangs

Add a hard upper bound of 20 to the LanChipInfoMaxAttempts retry loop to prevent
excessive delays if the value is misconfigured.

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs [26-62]

+var maxAttempts = Math.Max(1, _options.LanChipInfoMaxAttempts);
+maxAttempts = Math.Min(maxAttempts, 20);
+
+var retryDelay = _options.LanChipInfoRetryDelay;
+if (retryDelay < TimeSpan.Zero)
+{
+    retryDelay = TimeSpan.Zero;
+}
+
 for (var attempt = 1; attempt <= maxAttempts; attempt++)
 {
     cancellationToken.ThrowIfCancellationRequested();
 
     try
     {
         var chipInfo = await lanChipInfoProvider.GetLanChipInfoAsync(cancellationToken).ConfigureAwait(false);
         if (chipInfo != null)
         {
             if (attempt > 1)
             {
                 _logger.LogDebug(
                     "LAN chip-info query succeeded on attempt {Attempt}/{Max}.",
                     attempt,
                     maxAttempts);
             }
             return chipInfo;
         }
         _logger.LogDebug(
             "LAN chip-info query returned null on attempt {Attempt}/{Max}.",
             attempt,
             maxAttempts);
     }
     catch (Exception ex) when (ex is not OperationCanceledException)
     {
         _logger.LogDebug(
             ex,
             "LAN chip-info query failed on attempt {Attempt}/{Max}.",
             attempt,
             maxAttempts);
     }
 
     if (attempt < maxAttempts)
     {
         await Task.Delay(retryDelay, cancellationToken).ConfigureAwait(false);
     }
 }
Suggestion importance[1-10]: 2

__

Why: Hardcoding a maximum attempt cap overrides the user's explicit configuration and introduces an unnecessary magic number, which is generally an anti-pattern.

Low
Possible issue
Guard against invalid retry delay

Clamp the retry delay to TimeSpan.Zero to prevent potential exceptions if
options validation is bypassed.

src/Daqifi.Core/Firmware/FirmwareUpdateService.cs [23-62]

 var maxAttempts = Math.Max(1, _options.LanChipInfoMaxAttempts);
 var retryDelay = _options.LanChipInfoRetryDelay;
+if (retryDelay < TimeSpan.Zero)
+{
+    retryDelay = TimeSpan.Zero;
+}
 
 for (var attempt = 1; attempt <= maxAttempts; attempt++)
 {
     cancellationToken.ThrowIfCancellationRequested();
     ...
     if (attempt < maxAttempts)
     {
         await Task.Delay(retryDelay, cancellationToken).ConfigureAwait(false);
     }
 }
Suggestion importance[1-10]: 1

__

Why: The LanChipInfoRetryDelay is already validated via ValidatePositive in FirmwareUpdateServiceOptions.cs, making this runtime clamping redundant and potentially masking configuration errors.

Low

…ll-clock time

Qodo correctly flagged that per-attempt query timeouts compound with
retry delays, so the actual worst-case blocking can be ~10s with the
default DaqifiStreamingDevice 2s response timeout (3 attempts × 2s
+ 2 × 2s delays), not the 4s the docs claimed. Bad enough by itself,
worse because the retry loop holds _operationLock the whole time.

Added LanChipInfoTotalTimeout option (default 8s) — a hard wall-clock
cap enforced by a linked CancellationTokenSource around the entire
retry loop. Caller's cancellation token is honored as before; the
new timeout just adds a deadline. When hit, the loop short-circuits
to ChipInfoUnavailable instead of letting the per-attempt timeouts
keep accumulating.

Updated LanChipInfoMaxAttempts docstring to reflect the actual
worst-case math (sum of attempt durations + delays) and point at
LanChipInfoTotalTimeout for hard bounds.

Test: SlowFakeLanChipInfoStreamingDevice with 200ms attempt latency
+ 100ms TotalTimeout asserts the probe bails in <1500ms instead of
the 3s a naive impl would take with 10 max attempts.
@cptkoolbeenz
Copy link
Copy Markdown
Member Author

/improve

@cptkoolbeenz
Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented May 12, 2026

Code Review by Qodo

Grey Divider

New Review Started

This review has been superseded by a new analysis

Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

Persistent suggestions updated to latest commit 34c23d6

…ync throw

Test fake's GetLanChipInfoAsync now returns Task.FromException for
the simulated transient failure case, matching how a real async
method surfaces errors. Behavior of the production retry loop is
identical (both forms get caught by the same try/await), but the
test now exercises the more honest async exception path.
@cptkoolbeenz
Copy link
Copy Markdown
Member Author

/improve

@cptkoolbeenz
Copy link
Copy Markdown
Member Author

/agentic_review

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented May 12, 2026

Persistent review updated to latest commit 01fd011

@qodo-code-review
Copy link
Copy Markdown

Persistent suggestions updated to latest commit 01fd011

@cptkoolbeenz cptkoolbeenz merged commit df91d5c into feat/wifi-update-check-status May 12, 2026
@cptkoolbeenz cptkoolbeenz deleted the feat/wifi-chip-info-retry branch May 12, 2026 05:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant