HLD for SmartSwitch DPU graceful shutdown #1991
Merged
130 changes: 130 additions & 0 deletions
doc/smart-switch/graceful-shutdown/graceful-shutdown.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,130 @@ | ||
| # SmartSwitch DPU Graceful Shutdown | ||
|
|
||
| | Rev | Date | Author | Change Description | | ||
| | --- | ---- | ------ | ------------------ | | ||
| | 0.1 | 12/05/2025 | Ramesh Raghupathy | Initial version| | ||
|
|
||
|
|
||
| ## Definitions / Abbreviations | ||
|
|
||
| | Term | Meaning | | ||
| | --- | ---- | | ||
| | PMON | Platform Monitor | | ||
| | DLM | Device Lifecycle Manager | | ||
| | NPU | Network Processing Unit | | ||
| | DPU | Data Processing Unit | | ||
| | PDK | Platform Development Kit | | ||
| | SAI | Switch Abstraction Interface | | ||
| | GPIO | General Purpose Input Output | | ||
| | PSU | Power Supply Unit | | ||
| | I2C | Inter-Integrated Circuit communication protocol | | ||
| | SysFS | Virtual File System provided by the Linux Kernel | | ||
| | CP | Control Plane | | ||
| | DP | Data Plane | | ||
|
|
||
| ## Introduction | ||
| SmartSwitch supports graceful reboot of the DPUs, so it is natural to support graceful shutdown of the DPUs as well. Although graceful shutdown may look like the first half of a graceful reboot, it is not: it is invoked differently and follows a different code path, which makes the implementation somewhat more complex. In addition, the absence of `docker` in the PMON container, the separation between containers, and the need for a platform-agnostic implementation make it challenging to invoke the gNOI call from this code path. | ||
|
|
||
| ## DPU Graceful Shutdown Sequence | ||
|
|
||
| The following sequence diagram illustrates the detailed steps involved in the graceful shutdown of a DPU: | ||
|
|
||
| <p align="center"><img src="./images/dpu-graceful-shutdown.svg"></p> | ||
|
|
||
| ## Explanation of the Flow | ||
| * chassisd: Initiates the shutdown process by invoking set_admin_state(down) in module.py. | ||
|
|
||
| * module.py: Requests dpu_base.py to issue a gNOI reboot request for DPUx. | ||
|
|
||
| * dpu_base.py: Writes a JSON message to the host's named pipe at /host/gnoi_reboot.pipe. | ||
|
|
||
| * Host OS: Forwards the JSON message to gnoi_reboot_daemon.py via /var/run/gnoi_reboot.pipe. | ||
|
|
||
| * gnoi_reboot_daemon.py: Executes the gnoi_client with the provided parameters. | ||
|
|
||
| * gnmi container: Sends the gNOI Reboot RPC to DPUx. | ||
|
||
|
|
||
| * DPUx: Acknowledges the reboot request. | ||
|
|
||
| * Alternative Paths: | ||
|
|
||
| * Success: gnmi returns success to gnoi_reboot_daemon.py, which logs the success. | ||
|
|
||
| * Failure: gnmi returns failure to gnoi_reboot_daemon.py, which logs the failure. | ||
|
|
||
| * gnoi_reboot_daemon.py: Writes the reboot result to /var/run/gnoi_reboot_response.pipe. | ||
|
|
||
| * Host OS: Provides the reboot result to dpu_base.py via /host/gnoi_reboot_response.pipe. | ||
|
|
||
| * dpu_base.py: Returns the reboot result to module.py. | ||
|
|
||
| * module.py: Proceeds to shut down DPUx via the platform API, regardless of the reboot outcome. | ||
|
|
||
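The last step of the flow, where shutdown proceeds regardless of the reboot outcome, can be sketched as follows. This is a minimal illustration and not the actual `module.py` code: the function name `set_admin_state_down` and the two injected callables are hypothetical stand-ins for the gNOI request path and the platform API.

```python
import logging
from typing import Callable

logger = logging.getLogger("dpu_shutdown_sketch")


def set_admin_state_down(
    dpu_name: str,
    request_gnoi_reboot: Callable[[str], bool],   # hypothetical: True on gNOI success
    platform_power_down: Callable[[str], None],   # hypothetical: platform API call
) -> bool:
    """Request a gNOI reboot first, then power the DPU down unconditionally."""
    try:
        graceful = request_gnoi_reboot(dpu_name)
    except Exception as exc:
        # Any failure in the request path is treated as non-fatal.
        logger.warning("gNOI reboot request for %s failed: %s", dpu_name, exc)
        graceful = False
    if not graceful:
        logger.warning("%s: continuing shutdown without a successful gNOI reboot", dpu_name)
    # The platform shutdown runs whether or not the gNOI step succeeded.
    platform_power_down(dpu_name)
    return graceful
```

The return value only reports whether the graceful step succeeded; it does not gate the power-down.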
| ## Objective | ||
|
|
||
| This design enables the `chassisd` process running in the PMON container to invoke a **gNOI-based reboot** when it triggers the `set_admin_state(down)` API of a DPU module, without relying on `docker`, `bash`, or `hostexec` within the container. | ||
|
|
||
| ## Constraints | ||
|
|
||
| - The PMON container is highly restricted: no `docker`, `hostexec`, or `bash`. | ||
| - gNOI reboot requires executing a command using `docker exec` on the host. | ||
| - Communication must be initiated from PMON and executed by the host. | ||
|
|
||
| --- | ||
|
|
||
| ## Design Overview | ||
|
|
||
| The solution uses a **named pipe (FIFO)** created on the host and bind-mounted into the PMON container. PMON writes structured reboot requests (as JSON) into the pipe. A lightweight daemon running on the host listens for messages on this pipe, and executes the appropriate `docker exec` command using `gnoi_client`. | ||
|
|
||
| --- | ||
|
|
||
| ## Components | ||
|
|
||
| ### 1. Host-side Named Pipe | ||
|
|
||
| A named pipe (e.g., `/var/run/gnoi_reboot.pipe`) is created on the host. It acts as a one-way communication channel from PMON to the host. | ||
|
||
|
|
||
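Creating the pipe on the host could look roughly like the sketch below. The helper name `ensure_fifo` and the `0o600` permission are assumptions made for illustration; the design does not specify who creates the pipe or with which mode.

```python
import os
import stat


def ensure_fifo(path: str, mode: int = 0o600) -> None:
    """Create a FIFO at `path` unless one already exists there."""
    try:
        os.mkfifo(path, mode)
    except FileExistsError:
        # Reuse an existing FIFO, but refuse to write into a regular file.
        if not stat.S_ISFIFO(os.stat(path).st_mode):
            raise RuntimeError(f"{path} exists but is not a named pipe")
```

Making the call idempotent lets the host service run it on every start without first removing a pipe left over from an earlier boot.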
| ### 2. Host-side Reboot Daemon | ||
|
|
||
| A long-running Python script monitors the pipe for new lines of input. Each line is expected to be a JSON string with fields like DPU name, IP, and port. | ||
|
|
||
| Upon receiving a valid request, the daemon runs a `docker exec` command that invokes `gnoi_client` with parameters to perform a gNOI reboot of the DPU module. | ||
|
|
||
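How the daemon might validate one request line and turn it into a command can be sketched as below. The JSON key names (`dpu_name`, `dpu_ip`, `port`), the container name `gnmi`, and the `gnoi_client` flags shown are assumptions for illustration only; the document names the fields and the tool but not their exact spelling.

```python
import json
from typing import List, Optional

REQUIRED_FIELDS = ("dpu_name", "dpu_ip", "port")  # hypothetical key names


def build_reboot_command(line: str) -> Optional[List[str]]:
    """Turn one JSON line from the pipe into a docker exec argv, or None if invalid."""
    try:
        req = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(req, dict) or any(k not in req for k in REQUIRED_FIELDS):
        return None
    target = f"{req['dpu_ip']}:{req['port']}"
    # Container name and gnoi_client flags are assumptions, not confirmed by this document.
    return [
        "docker", "exec", "gnmi", "gnoi_client",
        "-target", target, "-module", "System", "-rpc", "Reboot",
    ]
```

Returning an argument list rather than a shell string means the result can be handed to `subprocess.run()` without `shell=True`, so values read from the pipe are never interpreted by a shell.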
| ### 3. PMON-side Hook (in `module_base.py`) | ||
|
|
||
| Within the PMON container, the `pre_shutdown_hook()` function opens the mounted pipe and writes a JSON-formatted reboot request, including the DPU name and midplane IP. | ||
|
|
||
| If writing fails (e.g., pipe is unavailable), the error is logged, and shutdown continues without halting DPU services. | ||
|
|
||
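The write-and-tolerate-failure behaviour of the hook can be sketched as follows. This is not the actual `pre_shutdown_hook()` code: the function name `send_reboot_request` and the JSON key names are hypothetical, and opening the pipe in non-blocking mode is one possible way to avoid hanging when the host daemon is not running.

```python
import errno
import json
import logging
import os

logger = logging.getLogger("pmon_hook_sketch")


def send_reboot_request(pipe_path: str, dpu_name: str, dpu_ip: str) -> bool:
    """Write one JSON line to the FIFO; return False instead of raising on failure."""
    payload = (json.dumps({"dpu_name": dpu_name, "dpu_ip": dpu_ip}) + "\n").encode()
    try:
        # O_NONBLOCK makes open() fail with ENXIO when no reader is attached,
        # rather than blocking the shutdown path indefinitely.
        fd = os.open(pipe_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        reason = "no reader on pipe" if exc.errno == errno.ENXIO else str(exc)
        logger.error("cannot open %s: %s", pipe_path, reason)
        return False
    try:
        os.write(fd, payload)
        return True
    except OSError as exc:
        logger.error("write to %s failed: %s", pipe_path, exc)
        return False
    finally:
        os.close(fd)
```

The caller can log a `False` result and carry on with the shutdown, which matches the behaviour described above.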
| ### 4. File Mounting | ||
|
|
||
| The named pipe on the host is bind-mounted into the PMON container. | ||
|
|
||
| --- | ||
|
|
||
| ## Workflow Summary | ||
|
|
||
| 1. **Initialization** | ||
| - Host creates the pipe and starts the reboot listener daemon. | ||
| - The pipe is mounted into the PMON container. | ||
|
|
||
| 2. **Trigger** | ||
| - PMON’s `pre_shutdown_hook()` fetches the DPU’s midplane IP and writes a JSON message to the pipe. | ||
|
|
||
| 3. **Execution** | ||
| - The host daemon reads the message and performs `docker exec` to invoke `gnoi_client` for rebooting the DPU. | ||
|
|
||
| 4. **Result** | ||
| - Logs indicate success or failure on both PMON and host sides. | ||
| - The host can remove processed requests or keep logs as needed. | ||
|
|
||
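The listening side of this workflow can be sketched as a loop that reads lines from the pipe and reopens it whenever a writer closes. The generator name `iter_pipe_lines` and the `limit` parameter are hypothetical; `limit` exists only so the sketch terminates, whereas a real daemon would loop indefinitely.

```python
from typing import Iterator


def iter_pipe_lines(pipe_path: str, limit: int) -> Iterator[str]:
    """Yield up to `limit` non-empty lines, reopening the FIFO after each writer closes."""
    seen = 0
    while seen < limit:
        # open() blocks here until some writer opens the FIFO.
        with open(pipe_path, "r") as pipe:
            for line in pipe:  # the loop ends at EOF, i.e. when the writer closes
                line = line.strip()
                if not line:
                    continue
                yield line
                seen += 1
                if seen >= limit:
                    return
```

One caveat of this simple form: a request written in the short window between the reader seeing EOF and closing its descriptor can be lost. A common mitigation is for the daemon to keep its own write end of the FIFO open so the reader never sees EOF; whether the actual daemon does this is not stated here.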
| --- | ||
|
|
||
| ## Benefits | ||
|
|
||
| - **No PMON dependencies:** No reliance on `docker`, `bash`, or host tools inside PMON. | ||
| - **Minimal mount:** Only a single pipe file is mounted from host to container. | ||
| - **Clear separation of responsibilities:** PMON requests; host executes. | ||
|
|
||
| --- | ||
|
|
||