
Conversation

@kvaps (Member) commented Jun 30, 2025

What this PR does / why we need it:

This PR adds logic that prevents the controller from trying to attach or detach a volume when the VMI status already shows the volume is in the desired state.
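For illustration only, here is a minimal Go sketch of the kind of "already in the desired state" check described here, written against the kubevirt.io/api/core/v1 types; the helper names are hypothetical and not the driver's actual functions:

```go
// Illustrative sketch only: helper names are assumptions, not the
// driver's code. It shows a desired-state check read from the VMI status.
package sketch

import (
	kubevirtv1 "kubevirt.io/api/core/v1"
)

// volumeReadyInVMIStatus reports whether the VMI status already lists the
// volume as Ready, in which case an attach request could be skipped.
func volumeReadyInVMIStatus(vmi *kubevirtv1.VirtualMachineInstance, volName string) bool {
	for _, vs := range vmi.Status.VolumeStatus {
		if vs.Name == volName && vs.Phase == kubevirtv1.VolumeReady {
			return true
		}
	}
	return false
}

// volumeGoneFromVMIStatus reports whether the VMI status no longer lists the
// volume at all, in which case a detach request could be skipped.
func volumeGoneFromVMIStatus(vmi *kubevirtv1.VirtualMachineInstance, volName string) bool {
	for _, vs := range vmi.Status.VolumeStatus {
		if vs.Name == volName {
			return false
		}
	}
	return true
}
```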

Which issue(s) this PR fixes

We ran into a problem while a volume was being attached. The attacher logged:

I0630 08:45:50.046983       1 csi_handler.go:243] "Error processing" VolumeAttachment="csi-6da36ea7aa56330cd69c4061f0a7e0266a7e08d3dd819943acab55922f5e764d" err="failed to detach: rpc error: code = Internal desc = ControllerUnpublishVolume failed for pvc-0645b5aa-75c0-4f00-ac2d-a9998256fa65: Message: 'Resource 'pvc-0645b5aa-75c0-4f00-ac2d-a9998256fa65' is still in use.'; Cause: 'Resource is mounted/in use.'; Details: 'Node: plo-csxhk-004, Resource: pvc-0645b5aa-75c0-4f00-ac2d-a9998256fa65'; Correction: 'Un-mount resource 'pvc-0645b5aa-75c0-4f00-ac2d-a9998256fa65' on the node 'plo-csxhk-004'.'; Reports: '[685C0CB4-00000-002093]'"

On the node the disk was already attached to the VM, yet the corresponding VolumeAttachment had a deletion timestamp.

So we got stuck: the hp-volume pod couldn’t start because it was waiting for the VolumeAttachment to disappear, and the VolumeAttachment couldn’t be removed because the VM was still using the disk.

This may be a race: kubevirt-csi-controller thought the disk had been detached (the hp-volume pod was gone), while the VM was still holding it. Kubernetes then created a new VolumeAttachment in the tenant cluster, triggering another attach operation.

Special notes for your reviewer:

Release note:

Fix deadlock while reattaching volume

@kubevirt-bot added the dco-signoff: yes and do-not-merge/release-note-label-needed labels Jun 30, 2025
@kvaps force-pushed the main branch 3 times, most recently from 08a6174 to 43c63ad on June 30, 2025 at 20:13
@kvaps changed the title from "main" to "Fix deadlock while reattaching volume" Jun 30, 2025
@kubevirt-bot added the release-note label and removed the do-not-merge/release-note-label-needed label Jun 30, 2025
@kvaps (Member, Author) commented Jun 30, 2025

/assign @awels

@awels (Member) left a comment:

/approve

@kubevirt-bot added the lgtm label Jul 2, 2025
@kubevirt-bot commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: awels

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot added the approved label Jul 2, 2025
@awels (Member) commented Jul 2, 2025

Been crazy busy, finally got some time to look. Looks great, thanks for the PR.

@kubevirt-bot merged commit faea0fe into kubevirt:main Jul 2, 2025 (8 checks passed)
kvaps added a commit to cozystack/cozystack that referenced this pull request Jul 3, 2025
…1135)

## What this PR does


This PR imports the upstream fix for the volume reattaching procedure:
- kubevirt/csi-driver#143

### Release note

```release-note
[kubernetes] Fix dead-lock while reattaching a KubeVirt-CSI volume
```

## Summary by CodeRabbit

* **New Features**
  * Improved volume management for virtual machines by adding checks to skip unnecessary attach or detach operations when the volume is already in the desired state.

* **Tests**
  * Added new unit tests to verify optimized volume attach/detach workflows and ensure fast-path logic is functioning correctly.
Comment on lines +358 to +366
// Fast-path: nothing to do if the volume is already attached
attached, err := c.virtClient.EnsureVolumeAvailableVM(ctx, c.infraClusterNamespace, vmName, dvName)
if err != nil {
return nil, err
}
if attached {
klog.V(3).Infof("Volume %s already attached to VM %s - skipping hot-plug", dvName, vmName)
return &csi.ControllerPublishVolumeResponse{}, nil
}
A collaborator left a comment:

Apologies for only noticing this today. Isn't this risky? The volume is only available to the KubeVirt VM once the VMI status reflects that, so ControllerPublishVolume may return success prior to that.

@akalenyu (Collaborator) commented Jul 21, 2025

Any chance you could provide a reproducer? I don't fully understand, tbh, and I don't think the PR does what the description states:

> This PR adds logic that prevents the controller from trying to attach or detach a volume when the VMI status already shows the volume is in the desired state.

EDIT:
Don't isVolumeAttached/removePossible already guarantee the same things as the additions in this PR? Both gate the calls to the actual hotplug request.

@kvaps (Member, Author) commented Aug 4, 2025

Hey, I just added an additional check:

- if the volume is still attached (present in the VMI spec), do not remove the finalizer from it
- if the volume is detached (missing from the VMI spec), safely remove the finalizer
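
That rule could look roughly like the following sketch (illustrative name, written against the kubevirt.io/api/core/v1 types; not the author's actual commit):

```go
// Sketch only: illustrates the rule above with an assumed name, using the
// kubevirt.io/api/core/v1 types; this is not the author's actual change.
package sketch

import (
	kubevirtv1 "kubevirt.io/api/core/v1"
)

// canRemoveFinalizer returns true only once the volume is gone from the
// VMI spec; while it is still listed there, the finalizer must stay.
func canRemoveFinalizer(vmi *kubevirtv1.VirtualMachineInstance, volName string) bool {
	for _, v := range vmi.Spec.Volumes {
		if v.Name == volName {
			return false // still attached: keep the finalizer
		}
	}
	return true // detached: safe to remove the finalizer
}
```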

itsgottabered pushed a commit to itsgottabered/csi-driver that referenced this pull request Aug 5, 2025
@alifelan commented:

Hey! I was looking into recent commits, and stumbled upon this. I may lack some context, but I think this may be causing a regression, or at least, using different sources of truth (vm.Spec.Template.Spec.Volumes and vmi.Status.VolumeStatus).

For a ControllerPublishVolume, we can run into a scenario like this:

  1. We kick off the attachment, and run virtctl addvolume.
  2. We wait for the volume to be available in vmi.Status.VolumeStatus.
  3. For some reason (slow attach, faulty volume, etc), it's not ready in 2 minutes, and we timeout the operation.
  4. The hotplug pod is still not healthy, and won't come up.
  5. We run ControllerPublishVolume again, and run through our initial check.
  6. EnsureVolumeAvailableVM will return true, the volume shows up in vm.Spec.Template.Spec.Volumes since we added it in AddVolumeToVM.
  7. We succeed ControllerPublishVolume.
  8. The volume is not available in the VM, but the VolumeAttachment is now ready, and we are going through NodeStageVolume, where the problem may expose itself in a different issue (Failed to fetch device by serialID probably).

For a ControllerUnpublishVolume, it would look like this:

  1. We kick off the detachment, and run virtctl removevolume
  2. We wait for the volume to be removed from vmi.Status.VolumeStatus.
  3. For some reason (new hotplug pod cannot come up, so we cannot delete the old one), the volume is not released in 2 minutes, and we timeout the operation.
  4. The new hotplug pod doesn't come up, so our volume still shows up in the VolumeStatus in a detaching state.
  5. We run ControllerUnpublishVolume again, and run through our initial check.
  6. EnsureVolumeRemovedVM goes over vm.Spec.Template.Spec.Volumes, and it is not there since we removed it in removeVolumeFromVm, so it returns true.
  7. We succeed ControllerUnpublishVolume.
  8. The volume is still hanging in the VMI, but the VolumeAttachment is gone.
  9. A new pod gets scheduled using this PVC, and a new VolumeAttachment is created
  10. We go over the same ControllerPublishVolume scenario mentioned above, with a multi-attach error, impacting all incoming volumes (worth mentioning Add multi-attach error to RWO volumes #161 here, which I believe will also help address the problem from this PR).

This stems from using a different source of truth between the initial check (EnsureVolumeAvailable / EnsureVolumeRemoved), and the fast return check (EnsureVolumeAvailableVM / EnsureVolumeRemovedVM). Is there a reason for this decision? What should we treat as the source of truth? Is this what we would expect to happen (VolumeAttachment ready / deleted when the VMI status is not in the desired state)?
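
To make the divergence concrete, here is a sketch (illustrative name, not the driver's real EnsureVolumeAvailableVM/EnsureVolumeRemovedVM) of a spec-based check like the one described above; it can report "attached" or "removed" before a status-based check, such as the ones sketched earlier in this thread, would agree:

```go
// Sketch only: a spec-based view of "attached", as described above for
// EnsureVolumeAvailableVM / EnsureVolumeRemovedVM; the name is illustrative.
package sketch

import (
	kubevirtv1 "kubevirt.io/api/core/v1"
)

// inVMSpec reports whether the volume is listed in the VM template spec.
// AddVolumeToVM puts the volume here before the VMI has actually plugged it,
// and removeVolumeFromVm deletes it here before the VMI has released it, so
// this view can disagree with vmi.Status.VolumeStatus in both directions.
func inVMSpec(vm *kubevirtv1.VirtualMachine, volName string) bool {
	if vm.Spec.Template == nil {
		return false
	}
	for _, v := range vm.Spec.Template.Spec.Volumes {
		if v.Name == volName {
			return true
		}
	}
	return false
}
```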

@akalenyu (Collaborator) commented:

> (quotes @alifelan's comment above in full)

Yeah I was alluding to the same with my review in #143 (review) and the following comment. I did not fully understand the problem/handling of it - #143 (comment).
Also, I think the author might've added more commits on top in their own fork since - #143 (comment)

@alifelan commented:

Was there anything that made you change your mind? Have we seen any issues related to this behaviour? While I don't have experience deploying a KubeVirt CSI Driver with these changes, at a first glance I believe we will be impacted by them. We run into timeouts frequently, and after the second attempt, our VolumeAttachments won't be reflecting the actual state

@akalenyu (Collaborator) commented:

> Was there anything that made you change your mind? Have we seen any issues related to this behaviour? While I don't have experience deploying a KubeVirt CSI Driver with these changes, at a first glance I believe we will be impacted by them. We run into timeouts frequently, and after the second attempt, our VolumeAttachments won't be reflecting the actual state

oh, I didn't change my mind - I never pulled this commit into the productized fork.
I do believe we want to take a step back with this change and discuss the problem/solutions in one of the sig-storage meetings. I agree with your assessment.

LoneExile pushed a commit to LoneExile/cozystack that referenced this pull request Oct 29, 2025
…ozystack#1135)

(commit message identical to the cozystack/cozystack commit quoted above)