[internal-dns] register and publish ddmd in the switch zone #10381

Open
zeeshanlakhani wants to merge 1 commit into main from zl/ddm-internal-dns

@zeeshanlakhani
Collaborator

DDMD has always run in the switch zone alongside Dendrite, MGS,
and MGD, but it was never registered in internal DNS, leaving no path for a
cross-host consumer to discover it. This adds `ServiceName::Ddm`,
plumbs `ddm_port` through the host-zone switch (RSS plan + reconfigurator
DNS execution), threads an `Overridables::ddm_ports` map for the
test suite, and lands a `DdmInstance` dropshot sim in test utils so
that the test harness registers a real DDM port in DNS the same way it does
for the other switch-zone services.

We also drop the duplicate `DDMD_PORT` const in `ddm-admin-client` in favor of
the canonical `omicron_common::address::DDMD_PORT`. Same-host
callers continue to use `Client::localhost()`.

This was extracted from the multicast PR (zl/multicast-mgd-ddm), in which
Nexus becomes the first cross-host consumer to resolve ddmd through internal DNS.
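
To illustrate the per-test port override described above, here is a minimal sketch of a map that falls back to a default port when no override is set. It is not the PR's actual `Overridables` code: `PortOverrides`, `SledId`, `override_ddm_port`, and `DEFAULT_DDMD_PORT` are hypothetical names, and the constant's value is a placeholder standing in for `omicron_common::address::DDMD_PORT`.

```rust
use std::collections::HashMap;

/// Placeholder default port (assumption). The canonical constant is
/// `omicron_common::address::DDMD_PORT`; this value is illustrative only.
pub const DEFAULT_DDMD_PORT: u16 = 8000;

/// Hypothetical sled identifier used as the map key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SledId(pub u32);

/// Hypothetical stand-in for a test-only override table.
#[derive(Debug, Default)]
pub struct PortOverrides {
    ddm_ports: HashMap<SledId, u16>,
}

impl PortOverrides {
    /// Record the port a simulated ddmd is actually listening on.
    pub fn override_ddm_port(&mut self, sled: SledId, port: u16) {
        self.ddm_ports.insert(sled, port);
    }

    /// Return the overridden port if one exists, else the default.
    pub fn ddm_port(&self, sled: SledId) -> u16 {
        self.ddm_ports
            .get(&sled)
            .copied()
            .unwrap_or(DEFAULT_DDMD_PORT)
    }
}
```

With this shape, production code paths never populate the map and always see the default, while a test harness can record whatever ephemeral port its simulated server bound to before DNS records are generated.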
Contributor

@jgallagher jgallagher left a comment

Thank you very much for splitting this out!

dendrite_port: u16,
mgs_port: u16,
mgd_port: u16,
ddm_port: u16,
Contributor

I think this function taking 3 u16s in a row was already sketchy - what do you think about changing this to take something like

struct HostSwitchZonePorts {
    dendrite: u16,
    mgs: u16,
    mgd: u16,
    ddm: u16,
}

instead of four separate arguments? That way we get named parameters for the call sites, and don't have to rely on them getting the order correct, and inside this function, we can destructure them:

let HostSwitchZonePorts { dendrite, mgs, mgd, ddm } = ports;

so we also get a compile-time check that we update this function if we change the struct?
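
A minimal sketch of that suggestion, assuming a hypothetical function `switch_zone_records` in place of the real one under review (its name, return type, and the service-name strings are illustrative, not taken from the PR):

```rust
/// Ports for the services running in a host switch zone.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HostSwitchZonePorts {
    pub dendrite: u16,
    pub mgs: u16,
    pub mgd: u16,
    pub ddm: u16,
}

/// Hypothetical stand-in for the reviewed function: returns the
/// (service label, port) pairs that would be registered in DNS.
pub fn switch_zone_records(
    ports: HostSwitchZonePorts,
) -> Vec<(&'static str, u16)> {
    // Exhaustive destructuring: there is no `..` rest pattern, so adding
    // a field to the struct makes this line fail to compile until the
    // function is updated to handle the new field.
    let HostSwitchZonePorts { dendrite, mgs, mgd, ddm } = ports;

    vec![
        ("dendrite", dendrite),
        ("mgs", mgs),
        ("mgd", mgd),
        ("ddm", ddm),
    ]
}
```

Call sites then name every port, for example `HostSwitchZonePorts { dendrite: 1, mgs: 2, mgd: 3, ddm: 4 }`, so swapping two `u16` values by position is no longer possible.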

/// In-process stand-in for the `ddmd` (Delay Driven Multipath daemon)
/// admin API.
///
/// `ddmd` runs in sled global zones and switch zones in real deployments,
Contributor

Will the ddmd that runs in the global zones ever need to be in DNS too, or is that one only communicated with locally?

/// internal DNS as `ServiceName::Ddm`.
///
/// This currently has no registered routes. Any integration needing
/// concrete endpoints (e.g., peer lists) must extend the `ApiDescription`.
Contributor

This seems a little fishy - I think the other test-utils variants of these services are running something approximating the "real" service (albeit a stub binary for non-illumos systems - it still handles all the normal API endpoints). If this doesn't serve the same API, how can tests use it?
