@mssonicbld
Collaborator

What I did

Fixed the AttributeError raised when managing DPU state transitions, caused by a missing `delete_field` method in the `StateDBHelper` class. The code called `state_db.delete_field()` to remove the 'transition_start_time' field from the database, but the method did not exist, so the command crashed when a DPU was in a bad state.

How I did it

- Added the missing `delete_field` method to the `StateDBHelper` class; it removes a field from a Redis hash using the `hdel` command.
- Kept the existing logic in `set_state_transition_in_progress` that removes 'transition_start_time' from both the local state and the database when the transition flag is set to 'False'.
- Kept the local dictionary state and the database state consistent by implementing field deletion properly.

The fix addresses the root cause of the AttributeError while preserving the intended behavior of cleaning up transition timestamps when state transitions complete.
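The cleanup described above can be sketched roughly as follows. This is an illustration only, not the code from the PR: the `DictStateDB` stand-in, its `set` method, the `TABLE` name, the function signature, and the idea that a timestamp is written when the flag becomes 'True' are all assumptions made so the example is self-contained.

```python
from datetime import datetime, timezone

TABLE = "CHASSIS_MODULE_TABLE"  # assumed table name, for illustration only


class DictStateDB:
    """Hypothetical dict-backed stand-in for the real state DB helper."""

    def __init__(self):
        self.rows = {}

    def set(self, table, key, field, value):
        self.rows.setdefault((table, key), {})[field] = value

    def delete_field(self, table, key, field):
        # Removing a field that is already absent is a no-op.
        self.rows.get((table, key), {}).pop(field, None)


def set_state_transition_in_progress(state_db, local_state, module_name, value):
    """Update the flag locally and in the DB, keeping both views in sync."""
    local_state["state_transition_in_progress"] = value
    state_db.set(TABLE, module_name, "state_transition_in_progress", value)

    if value == "True":
        # Assumption: a start timestamp is recorded when a transition begins.
        started = datetime.now(timezone.utc).isoformat()
        local_state["transition_start_time"] = started
        state_db.set(TABLE, module_name, "transition_start_time", started)
    else:
        # Transition finished: drop the timestamp from both places.
        local_state.pop("transition_start_time", None)
        state_db.delete_field(TABLE, module_name, "transition_start_time")
```

The point of the sketch is the `else` branch: the local dictionary and the database are cleaned up together, so neither side keeps a stale timestamp.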

Code Changes:

- Added a `delete_field(self, table, key, field)` method to the `StateDBHelper` class.
- The method uses `client.hdel(redis_key, field)` to delete the field from the Redis database.
- The existing call to `state_db.delete_field()` on line 85 now works correctly.
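A minimal sketch of what such a method might look like is shown below. Only the `delete_field(self, table, key, field)` signature and the `client.hdel(redis_key, field)` call come from the description above; the constructor, the `|` key separator, the `_redis_key` helper, and the in-memory `FakeRedisClient` (used so the example runs without a Redis server) are assumptions.

```python
_MISSING = object()


class FakeRedisClient:
    """In-memory stand-in that models only the hash commands used here."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, field, value):
        self._hashes.setdefault(name, {})[field] = value

    def hdel(self, name, *fields):
        # Like Redis HDEL: return how many of the given fields were removed.
        bucket = self._hashes.get(name, {})
        removed = sum(1 for f in fields if bucket.pop(f, _MISSING) is not _MISSING)
        if not bucket:
            # Redis drops a hash key once its last field is gone.
            self._hashes.pop(name, None)
        return removed

    def hgetall(self, name):
        return dict(self._hashes.get(name, {}))


class StateDBHelper:
    """Hypothetical reduced helper; the real class has more methods."""

    def __init__(self, client, separator="|"):
        self.client = client
        self.separator = separator  # assumed table/key separator

    def _redis_key(self, table, key):
        return f"{table}{self.separator}{key}"

    def delete_field(self, table, key, field):
        """Remove a single field from the hash stored at table<sep>key."""
        redis_key = self._redis_key(table, key)
        return self.client.hdel(redis_key, field)
```

Because `hdel` ignores fields that do not exist, calling `delete_field` on an entry that was already cleaned up should be harmless, which suits a cleanup path that may run more than once.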

How to verify it

1. Make the DPU midplane unreachable for one of the DPUs.
2. Toggle the DPU ON/OFF state a couple of times using `config chassis modules startup DPUx` and `config chassis modules shutdown DPUx`.
3. Verify that the commands complete without AttributeError crashes.
4. Confirm that the 'transition_start_time' field is removed from STATE_DB when transitions complete.
5. Check that the local state and the database state remain synchronized.

Why this approach vs. removing the line:

Implementing the missing method keeps the original intended behavior of cleaning up transition timestamps, which keeps the database consistent and prevents stale data from accumulating. Simply removing the call would stop the crash but would also drop that cleanup.

Which release branch to backport (provide reason below if selected)
- [x] 202505
- [x] 202506

…n DPUs in bad state

@mssonicbld
Collaborator Author

Original PR: sonic-net/sonic-utilities#4064

@mssonicbld
Collaborator Author

/azp run

@azure-pipelines

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@mssonicbld mssonicbld merged commit c3eecb8 into Azure:202506 Oct 13, 2025
4 checks passed