
Fix Windows unittest CI: force CPU-only build (CUDA 13.2 toolkit on runner breaks _portable_lib load) #20527

Merged: digantdesai merged 2 commits into main from fix-windows-cuda-autodetect on Jun 26, 2026

@Gasoonjia


Summary

Fixes the Windows unittest CI breakage introduced by #20440 (Add CUDA 13.2 support and drop unsupported 12.8/12.9).

unittest / windows, unittest-editable / windows, and unittest-release / windows have been red on main since c0643f5 (parent was green).

Root cause

The Windows CI image ships CUDA toolkits on PATH (it has both v13.2 and v13.0; nvcc resolves to 13.2.78).

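install_executorch auto-enables the CUDA backend when install_utils.is_cuda_available() returns True (setup.py ~L882-889), and that check is driven purely by the nvcc version being in SUPPORTED_CUDA_VERSIONS. Once #20440 added (13, 2) to that list, the check started returning True on the Windows runner and flipped EXECUTORCH_BUILD_CUDA=ON. The unittest jobs install CPU torch, so the resulting CUDA build of _portable_lib fails to load its CUDA DLLs at import time: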

ImportError: DLL load failed while importing _portable_lib: The specified module could not be found.

That aborts pytest collection (24 errors during collection) and fails the job.
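
The detection behavior described above can be illustrated with a minimal sketch. This is not the code in setup.py or install_utils: the function names, the sample nvcc text, and both version sets are hypothetical placeholders chosen only to show how adding (13, 2) to an allow-list flips an nvcc-only check.

```python
import re
from typing import Optional, Set, Tuple

Version = Tuple[int, int]


def parse_nvcc_release(nvcc_output: str) -> Optional[Version]:
    """Extract (major, minor) from `nvcc --version` style text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def nvcc_only_detection(nvcc_output: str, supported: Set[Version]) -> bool:
    """Hypothetical stand-in for a check that looks only at the nvcc version.

    It never asks whether the installed torch is a CUDA build.
    """
    version = parse_nvcc_release(nvcc_output)
    return version is not None and version in supported


# Sample text shaped like nvcc output on the runner (assumed, not captured).
SAMPLE_NVCC_OUTPUT = "Cuda compilation tools, release 13.2, V13.2.78"

# Placeholder allow-lists; the real SUPPORTED_CUDA_VERSIONS lives in the repo.
supported_before = {(12, 6)}
supported_after = supported_before | {(13, 2)}

print(nvcc_only_detection(SAMPLE_NVCC_OUTPUT, supported_before))
print(nvcc_only_detection(SAMPLE_NVCC_OUTPUT, supported_after))
```

With the placeholder sets above, the same runner goes from "no CUDA" to "CUDA" without any change to the installed torch, which matches the failure mode in this PR.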

Fix

Add a -cpuOnly switch to the shared .ci/scripts/setup-windows.ps1 that forces -DEXECUTORCH_BUILD_CUDA=OFF via CMAKE_ARGS, and pass it from the CPU unittest workflow (_unittest.yml). This restores the pre-#20440 CPU-only behavior for these jobs.

The CUDA Windows jobs (cuda-windows.yml) call the same script without -cpuOnly, so they are unaffected and keep building CUDA.
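
For readers who want the shape of the override, here is a rough Python sketch of appending the flag to CMAKE_ARGS. The actual change is in PowerShell (setup-windows.ps1) and may be structured differently; the helper name force_cpu_only and the dict-based environment are assumptions made for illustration.

```python
from typing import Dict

CPU_ONLY_FLAG = "-DEXECUTORCH_BUILD_CUDA=OFF"


def force_cpu_only(env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env whose CMAKE_ARGS contains the CPU-only flag.

    Hypothetical helper: existing CMAKE_ARGS are preserved, and the flag is
    appended once (a second call does not duplicate it).
    """
    updated = dict(env)
    existing = updated.get("CMAKE_ARGS", "").split()
    if CPU_ONLY_FLAG not in existing:
        existing.append(CPU_ONLY_FLAG)
    updated["CMAKE_ARGS"] = " ".join(existing)
    return updated


# CPU unittest job: opt in to the override.
cpu_env = force_cpu_only({"CMAKE_ARGS": "-DFOO=ON"})
print(cpu_env["CMAKE_ARGS"])

# CUDA job: leave the environment alone so auto-detection still applies.
cuda_env = {"CMAKE_ARGS": "-DFOO=ON"}
print(cuda_env["CMAKE_ARGS"])
```

Keeping the override opt-in is what lets the CUDA jobs share the script unchanged.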

Note / follow-up

The deeper issue is that the auto-detection keys off the nvcc version found on PATH rather than whether the installed torch is actually a CUDA build. A more general fix would be to enable EXECUTORCH_BUILD_CUDA only when torch.version.cuda is set. That is left out here to keep the unblock low-risk; happy to follow up.
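
One possible shape for that follow-up, as a sketch only: gate the nvcc check on the installed torch reporting a CUDA version. torch.version.cuda is a real attribute that is None on CPU-only builds; the function names and the supported set below are hypothetical, and this is not code from the repo.

```python
from typing import Optional, Set, Tuple

Version = Tuple[int, int]


def installed_torch_cuda_version() -> Optional[str]:
    """Return torch.version.cuda, or None if torch is missing or CPU-only."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


def should_build_cuda(
    nvcc_version: Optional[Version],
    torch_cuda_version: Optional[str],
    supported: Set[Version],
) -> bool:
    """Hypothetical gate: require a CUDA torch AND a supported nvcc."""
    if torch_cuda_version is None:
        # CPU torch: never build the CUDA backend, whatever nvcc says.
        return False
    return nvcc_version is not None and nvcc_version in supported


# Placeholder allow-list for illustration.
supported = {(13, 2)}

# Windows unittest runner in this PR: nvcc 13.2 on PATH, CPU torch installed.
print(should_build_cuda((13, 2), None, supported))
```

Passing the torch version in as an argument keeps the decision testable on machines without torch; a caller would supply installed_torch_cuda_version() in practice.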

@pytorch-bot

pytorch-bot Bot commented Jun 26, 2026


🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20527

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 Unclassified Failures

As of commit d2b3fac with merge base 6021a58 (image):

UNCLASSIFIED FAILURES - DrCI could not classify the following jobs because the workflow did not run on the merge base. The failures may be pre-existing on trunk or introduced by this PR:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label Jun 26, 2026
@linux-foundation-easycla

linux-foundation-easycla Bot commented Jun 26, 2026


CLA Signed
The committers listed above are authorized under a signed CLA.

  • ✅ login: Gasoonjia / name: gasoonjia (952e121)

@github-actions


This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

…failure

The Windows CI image ships CUDA toolkits on PATH. After adding (13, 2) to
SUPPORTED_CUDA_VERSIONS (#20440), install_executorch's auto-detection
(setup.py: is_cuda_available() via nvcc) started returning True on the Windows
runner (which has the CUDA 13.2 toolkit), so it flipped EXECUTORCH_BUILD_CUDA=ON.
But the unittest jobs install CPU torch, so the resulting CUDA build of
_portable_lib fails to load its CUDA DLLs at import time:

  ImportError: DLL load failed while importing _portable_lib

causing all pytest collection to error out (unittest / unittest-editable /
unittest-release on windows).

Add a -cpuOnly switch to setup-windows.ps1 that forces
-DEXECUTORCH_BUILD_CUDA=OFF via CMAKE_ARGS, and pass it from the CPU unittest
workflow. The CUDA Windows jobs (cuda-windows.yml) keep the default and are
unaffected.
…LL load failure

Same root cause as the unittest fix in this PR, second site. The Windows wheel
build (build-wheels-windows.yml -> .ci/scripts/wheel/) does not go through
setup-windows.ps1. The Windows CI image has the CUDA 13.2 toolkit on PATH, so
after #20440 added (13, 2) to SUPPORTED_CUDA_VERSIONS, install_executorch's
auto-detection enables EXECUTORCH_BUILD_CUDA and bakes a CUDA _portable_lib +
aoti_cuda_shims.lib into the CPU wheel. The smoke test then fails with:

  ImportError: DLL load failed while importing _portable_lib

Windows wheels are CPU-only (with-cuda: disabled), so force
-DEXECUTORCH_BUILD_CUDA=OFF via CMAKE_ARGS in pre_build_script.sh on Windows.
@digantdesai

  • Arm Linux Wheels - preexisting
  • unittest sam export - transient - rerunning it

@digantdesai digantdesai merged commit 16ecb3f into main Jun 26, 2026
603 of 607 checks passed
@digantdesai digantdesai deleted the fix-windows-cuda-autodetect branch June 26, 2026 15:00
Reubend added a commit to Reubend/executorch that referenced this pull request Jun 26, 2026
