Skip to content

Commit a180243

Browse files
authored
deflake some Crucible replacement tests (#9217)
This should fix flake opportunities in both: * `crucible_replacements::test_delete_volume_region_replacement_state_replacement_done` * `crucible_replacements::test_Delete_volume_region_replacement_state_requested` This is my proposed change in #9216. From the logs of a flake and careful reading it looks like we were almost-but-not-quite handling the condition described in `wait_for_request_state` if the start and end states for a replacement request are the same. Consulting `get_region_replacement_by_id` and the returned replacement request state tells us when the step has completed, in addition to checking that the step had started.
1 parent 773799e commit a180243

File tree

1 file changed

+32
-7
lines changed

1 file changed

+32
-7
lines changed

nexus/tests/integration_tests/crucible_replacements.rs

Lines changed: 32 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -546,9 +546,12 @@ mod region_replacement {
546546
//
547547
// If `wait_for_request_state` has the same expected start and end
548548
// state (as it does above), it's possible to exit that function
549-
// having not yet started the saga yet, and this requires an
550-
// additional `wait_for_condition` to wait for the expected recorded
551-
// step.
549+
// having not yet started the saga yet. To check the saga has
550+
// started, we'll wait for the replacement request step to be
551+
// created by it. If the saga is still in progress, the step may be
552+
// recorded while the saga is still `Driving`, so we wait for the
553+
// saga to return to `Running` as evidence the step has been fully
554+
// driven.
552555

553556
let most_recent_step = wait_for_condition(
554557
|| {
@@ -565,12 +568,34 @@ mod region_replacement {
565568
.await
566569
.unwrap()
567570
{
568-
Some(step) => Ok(step),
571+
Some(step) => {
572+
// The saga has either started or completed. To
573+
// tell if it's completed and we can move on,
574+
// check on the replacement state again. If
575+
// we're not `Running`, the saga is still in
576+
// progress.
577+
let state = datastore
578+
.get_region_replacement_request_by_id(
579+
&opctx,
580+
replacement_request_id,
581+
)
582+
.await
583+
.unwrap()
584+
.replacement_state;
585+
586+
if state == RegionReplacementState::Running {
587+
Ok(step)
588+
} else {
589+
// The replacement step is in progress, but
590+
// not done yet. We're still waiting, but
591+
// probably not for long.
592+
Err(CondCheckError::<()>::NotYet)
593+
}
594+
}
569595

570596
None => {
571-
// The saga either has not started yet or is
572-
// still running - see the comment before this
573-
// check for more info.
597+
// The saga has not started, so we're not done
598+
// waiting.
574599
Err(CondCheckError::<()>::NotYet)
575600
}
576601
}

0 commit comments

Comments
 (0)