[ddmd] add --no-state-machine flag for test fixtures and Linux build #729
Open
zeeshanlakhani wants to merge 2 commits into main
6a10770 to 6dcf010
Omicron's oxidecomputer/omicron#10381 introduces a stubbed `ddmd` admin endpoint because spawning a real `ddmd` in a generic test toolchain is not viable: the routing state machine (discovery, exchange, route synchronization) depends on illumos networking facilities the toolchain does not provide. Consumers of the stub, e.g., Nexus RPW (multicast members), sled-agent's DDM reconciler, and anything that resolves the DDM internal-DNS service name, cannot exercise the real admin surface from Omicron's test harness.

This work adds an opt-in `--no-state-machine` flag to `ddmd` that runs only the admin API server and skips the state machine entirely, allowing the fixture to spawn the real binary. This is analogous to `mgd --no-bgp-dispatcher`, which Omicron's `MgdInstance` already uses for the same purpose.

To make the fixture path usable on Linux, `ddmd` itself must build on Linux. The previous code pulled the illumos-only crates `libnet`, `dpd-client`, `opte-ioctl`, and `oxide-vpc` unconditionally through `ddm`, which failed to link on Linux (`-lzfs`, `-ldlpi`). This change introduces an `illumos` feature in both `ddm` and `ddmd` (default-on, mirroring `mgd`'s `mg-lower` pattern) that marks those four crates optional. The buildomat `linux.sh` job now builds `ddmd` and `ddmadm`, with `ddmd` invoked as `cargo build --bin ddmd --no-default-features`.

The illumos-only halves of `ddm` are isolated by the feature gate:

- The routing state machine implementation moves from `sm.rs` into `sm/state.rs`.
- The exchange runtime (HTTP push/pull and route programming) moves from `exchange.rs` into `exchange/runtime.rs`.
- The discovery runtime (UDPv6 solicitation/advertisement loops) moves from `discovery.rs` into `discovery/runtime.rs`.

Each parent `mod.rs` keeps the platform-agnostic types and re-exports the runtime surface so existing call sites resolve unchanged on illumos. The runtime submodules are gated as a unit by `#[cfg(all(feature = "illumos", target_os = "illumos"))]`. We also remove the single-function `ddm/src/util.rs`, inlining the function into `discovery/runtime.rs`, where its sole caller lives.

The SIGTERM cleanup handler is installed regardless of the flag, so Ctrl-C still exits cleanly in `--no-state-machine` mode. The imported route sets are empty in that mode, so the cleanup itself is a noop. Passing `--addr` alongside `--no-state-machine` is harmless but ignored, with a warning logged.
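For readers unfamiliar with the module split described above, here is a minimal sketch of how a parent module can keep platform-agnostic types while gating and re-exporting a runtime submodule. It is not the code in this PR: the `Config` type, its `solicit_interval_ms` field, and the `start`, `describe`, and `state_machine_compiled_in` functions are hypothetical names chosen for illustration. Only the `cfg` predicate and the `discovery`/`runtime` module names come from the description.

```rust
// Hypothetical, simplified stand-in for the `ddm` crate layout.
pub mod discovery {
    /// Platform-agnostic type; compiled on every target.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Config {
        pub solicit_interval_ms: u64,
    }

    /// illumos-only half; in the PR this would be `discovery/runtime.rs`.
    #[cfg(all(feature = "illumos", target_os = "illumos"))]
    mod runtime {
        use super::Config;

        /// Placeholder for starting the solicitation/advertisement loops.
        pub fn start(config: &Config) -> String {
            format!("discovery every {}ms", config.solicit_interval_ms)
        }
    }

    // Re-export so callers keep writing `discovery::start` on illumos.
    #[cfg(all(feature = "illumos", target_os = "illumos"))]
    pub use self::runtime::start;
}

/// Reports whether the gated runtime was compiled into this build.
pub fn state_machine_compiled_in() -> bool {
    cfg!(all(feature = "illumos", target_os = "illumos"))
}

/// Build with the runtime: delegate to the re-exported entry point.
#[cfg(all(feature = "illumos", target_os = "illumos"))]
pub fn describe(config: &discovery::Config) -> String {
    discovery::start(config)
}

/// Build without the runtime: only the shared types are available.
#[cfg(not(all(feature = "illumos", target_os = "illumos")))]
pub fn describe(config: &discovery::Config) -> String {
    format!(
        "admin API only; interval {}ms unused",
        config.solicit_interval_ms
    )
}
```

Because the gate combines the feature with `target_os`, a build on another platform takes the second `describe` whether or not the feature is enabled.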
6dcf010 to 3b54e16
zeeshanlakhani added a commit to oxidecomputer/omicron that referenced this pull request May 7, 2026
…fixture

We address @jgallagher's review by:

- Replacing the four positional `u16` arguments in `DnsConfigBuilder::host_zone_switch` with a `HostSwitchZonePorts` named-fields structure.
- Replacing the dropshot-based stubbed `DdmInstance` in test-utils with a fixture that spawns and supervises a real `ddmd` subprocess running with `--no-state-machine`, analogous to `MgdInstance` and `mgd --no-bgp-dispatcher`.

Only the switch-zone `ddmd` is registered in internal DNS, while sled-global-zone instances are accessed locally by their own host and don't need DNS registration.

This **does** require maghemite changes, already PR'ed to oxidecomputer/maghemite#729.

To make this all work, we wire `ddmd` into the developer xtask toolchain. `cargo xtask download maghemite-ddmd` reuses the existing `mg-ddm.tar.gz` illumos zone artifact (extracting `ddmd`/`ddmadm`). On Linux it overlays a raw `ddmd` binary, and on macOS it builds from source.

Also, we had to bump `oxnet` from 0.1.4 to 0.1.5 to satisfy the new maghemite pin.
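To illustrate why a named-fields structure is safer than four positional `u16` arguments, here is a rough sketch. The field names (`dendrite`, `mgs`, `mgd`, `ddmd`), the port values in the usage note, and the free function `host_zone_switch` are all assumptions for illustration. The commit message names only `HostSwitchZonePorts` and `DnsConfigBuilder::host_zone_switch`, not their contents.

```rust
/// Hypothetical field names; the commit message only says the structure
/// replaces four positional `u16` arguments.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HostSwitchZonePorts {
    pub dendrite: u16,
    pub mgs: u16,
    pub mgd: u16,
    pub ddmd: u16,
}

/// Simplified stand-in for `DnsConfigBuilder::host_zone_switch`: returns
/// (service name, port) pairs instead of building real DNS records.
pub fn host_zone_switch(ports: HostSwitchZonePorts) -> Vec<(&'static str, u16)> {
    vec![
        ("dendrite", ports.dendrite),
        ("mgs", ports.mgs),
        ("mgd", ports.mgd),
        ("ddmd", ports.ddmd),
    ]
}
```

With named fields, a call site such as `host_zone_switch(HostSwitchZonePorts { ddmd: 8000, mgd: 4676, mgs: 12225, dendrite: 12224 })` remains correct regardless of the order in which the fields are written, whereas four bare `u16` values can be silently transposed.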
jgallagher (Contributor) reviewed May 7, 2026 and left a comment:
Just a few notes on the mechanics of the split; I'll defer to folks who know maghemite better for the code organization.
Includes:

- Reject `--no-state-machine` together with `--addr` at clap level via `conflicts_with`.
- Collapse the two cfg-gated `termination_handler` variants into one cfg-gated body.
- Rename the `illumos` Cargo feature to `state-machine` so that it describes the gated functionality (and matches the CLI flag) rather than colliding semantically with `target_os = "illumos"`.
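The second item can be pictured as follows: rather than two whole functions selected by `cfg`, a single function carries one `cfg`-gated block in its body. This is a sketch under assumptions, not the handler in this PR. `RouteDb`, its `imported` field, `remove_from_platform`, and the return value are hypothetical. Only the `termination_handler` name and the `state-machine` feature name come from the commit message.

```rust
/// Hypothetical stand-in for the daemon's imported-route bookkeeping.
#[derive(Debug, Default)]
pub struct RouteDb {
    pub imported: Vec<String>,
}

/// Hypothetical platform hook; only exists when the state machine is built.
#[cfg(all(feature = "state-machine", target_os = "illumos"))]
fn remove_from_platform(route: &str) {
    // A real implementation would withdraw the route from the OS here.
    let _ = route;
}

/// One handler for every build; only part of the body is cfg-gated.
/// Returns the number of routes dropped from the in-memory set.
pub fn termination_handler(db: &mut RouteDb) -> usize {
    // Drain the in-memory set on every platform. In `--no-state-machine`
    // mode the set is empty, so this is effectively a no-op.
    let routes: Vec<String> = db.imported.drain(..).collect();

    // Only builds that include the state machine touch the platform.
    #[cfg(all(feature = "state-machine", target_os = "illumos"))]
    {
        for route in &routes {
            remove_from_platform(route);
        }
    }

    routes.len()
}
```

Keeping one signature means the code that installs the handler does not need its own `cfg` attributes.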