Version
v0.10.3-0-g4d11815e6
Describe the bug.
Nico automatically applied corrective actions on a subset of nodes in one site after detecting a 5‑hour time skew between the host and the host BMC, and tried to realign the BMC timezone to UTC to match the host.
The actions Nico performed were:
- Power off the host
- Correct the time zone on the host BMC
- Restart the host BMC
After these steps, the affected nodes remained powered off, with their host BMC timezone set to UTC.
These nodes were assigned to a tenant at the time, and manual intervention was required to power them back on.
Further analysis showed that only a subset of nodes was impacted, specifically those with BMC lockdown disabled. On other nodes where Nico also detected a time skew, it could not make any changes because BMC lockdown was enabled. Nico continues to attempt timezone changes on those locked‑down BMCs, causing repeated log spamming from failed timezone change attempts.
This behaviour only started yesterday, so it appears to be associated with the most recent release.
Given that the nodes with mismatched BMC timezones were tenant‑assigned, it would be preferable for Nico not to enforce timezone changes when a node is assigned to a tenant. If Nico does apply changes, it should also restore the node’s power state to whatever it was before the changes. Finally, it would be desirable to stop the logs from being flooded with failed timezone change requests on BMCs in lockdown mode.
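To make the requested behaviour concrete, here is a minimal sketch of the three guards described above: skip tenant-assigned nodes, throttle the warning for locked-down BMCs, and restore the previous power state after a change. It is not Nico's actual code. `Node`, `TimezoneRemediator`, `retry_interval_s` and the returned status strings are all hypothetical names chosen for illustration, and the real BMC calls are replaced by plain field assignments.

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class PowerState(Enum):
    ON = "on"
    OFF = "off"


@dataclass
class Node:
    # Hypothetical node model; field names are illustrative only.
    name: str
    tenant_assigned: bool
    bmc_lockdown: bool
    power: PowerState
    bmc_timezone: str


@dataclass
class TimezoneRemediator:
    # Assumed throttle window for repeated lockdown warnings (seconds).
    retry_interval_s: float = 3600.0
    _last_warned: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def remediate(self, node, now=None):
        now = time.monotonic() if now is None else now

        # Guard 1: never touch a node that is assigned to a tenant.
        if node.tenant_assigned:
            return "skipped-tenant"

        # Guard 2: a locked-down BMC cannot be changed; warn at most
        # once per retry_interval_s instead of on every attempt.
        if node.bmc_lockdown:
            last = self._last_warned.get(node.name)
            if last is None or now - last >= self.retry_interval_s:
                self._last_warned[node.name] = now
                self.log.append(f"{node.name}: BMC in lockdown, timezone unchanged")
            return "skipped-lockdown"

        # Guard 3: remember the power state and always put it back.
        previous = node.power
        try:
            node.power = PowerState.OFF   # stand-in for powering off the host
            node.bmc_timezone = "UTC"     # stand-in for the BMC timezone change
            # A real implementation would restart the BMC here.
        finally:
            node.power = previous
        return "remediated"
```

With this shape, a tenant-assigned node is left untouched, a locked-down BMC produces one log line per throttle window, and an unassigned, unlocked node ends up in the same power state it started in.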
Minimum reproducible example
Relevant log output
Other/Misc.
No response