How to debug VM deployment issues (capacity issues?) #12192
-
|
My cluster is currently unable to deploy any new VMs due to a capacity planner issue. I can't tell what resource it is thinking it is out of. I can live migrate any vms that are running ok. Infact, I shutdown almost all the VMs running and I still can't start one back up that I shut down. I'm running 4.22. I just rebooted each node to slot in new disks, I had to shut down some VMs for this because I didn't have enough RAM to live migrate all. And it was when I tried to restart the VMs that this all started to go sideways. All nodes are online. All hosts show healthy. Not sure where to go from here. |
Beta Was this translation helpful? Give feedback.
Replies: 4 comments 3 replies
-
|
@bradh352 From the logs I see it is trying to start the VR associated with the VM's as well and is failing.. Can you attach full management server log ( from all the management servers) |
Beta Was this translation helpful? Give feedback.
-
|
I am also seeing something weird, one of my hosts is listed 3 times. |
Beta Was this translation helpful? Give feedback.
-
|
I think I got it back up. Looks like one of the virtual routers was in a bad state. It was showing as up, but then there was another VR for the same VPC in the stopped state. Deleting the stopped one and the running one for the vpc, then telling it to restart the vpc ultimately resolved the issue. |
Beta Was this translation helpful? Give feedback.
-
|
@bradh352 I see below in the logs 2025-12-04 14:33:19,426 DEBUG [o.a.c.s.a.ClusterScopeStoragePoolAllocator] (Work-Job-Executor-6:[ctx-8a912d50, job-21766/job-21769, ctx-77b2d201]) (logid:de988890) Using volume allocation algorithm random to reorder pools. What is the state of the volume associated with the VR? r-661-VM. The deployment/Start of the VR is failing causing the VM starts to fail.. Can you try to do network restart with cleanup=true |
Beta Was this translation helpful? Give feedback.
@bradh352 I see below in the logs
2025-12-04 14:33:19,426 DEBUG [o.a.c.s.a.ClusterScopeStoragePoolAllocator] (Work-Job-Executor-6:[ctx-8a912d50, job-21766/job-21769, ctx-77b2d201]) (logid:de988890) Using volume allocation algorithm random to reorder pools.
2025-12-04 14:33:19,426 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-6:[ctx-8a912d50, job-21766/job-21769, ctx-77b2d201]) (logid:de988890) Trying to find suitable pools to allocate volume [Volume {"id":1701,"instanceId":661,"name":"ROOT-661","uuid":"09fa0cd6-5db0-4029-aa23-45e5d285574a","volumeType":"ROOT"}] necessary to deploy VM [VM instance {"id":661,"instanceName":"r-661-VM","state":"Starting","type":"DomainRouter"…