pytest failure when running out-of-tree plugin against 2.25.0a0 - macOS wheels include single architecture native_engine.so files #21938

Closed
tdyas opened this issue Feb 9, 2025 · 19 comments · Fixed by #21942

@tdyas
Contributor

tdyas commented Feb 9, 2025

Problem: Running pytest under Pants v2.25.0a0 fails on my macOS/x86-64 system. The project under test is an out-of-tree Pants plugin, so the test imports from the pantsbuild.pants distribution, which is why native_engine.so is being loaded in a test. (See the specific error below.)

The native_engine.so being imported appears to be for arm64 and not for x86-64:

HOST:~/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/pants/engine/internals% file native_engine.so
native_engine.so: Mach-O 64-bit dynamically linked shared library arm64

We seem to be publishing universal2 macOS wheels instead of the previous separate architecture-specific wheels.
2.24.1:

  • pantsbuild.pants-2.24.1-cp39-cp39-macosx_10_15_x86_64.whl
  • pantsbuild.pants-2.24.1-cp39-cp39-macosx_11_0_arm64.whl

2.25.0a0:

  • pantsbuild.pants-2.25.0a0-cp311-cp311-macosx_10_12_universal2.whl
  • pantsbuild.pants-2.25.0a0-cp311-cp311-macosx_11_0_universal2.whl

Unpacking the wheels, we see that the .so files in the root of each wheel are multi-architecture. Yet the native_engine.so in the actual package source directory is single-architecture (x86-64 for the 10.12 wheel and arm64 for the 11.0 wheel).

./unpacked-wheel-11.0-universal2/pants/engine/internals/native_engine.so:  Mach-O 64-bit dynamically linked shared library arm64

./unpacked-wheel-11.0-universal2/native_engine.cpython-311-darwin.so:      Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit bundle x86_64] [arm64:Mach-O 64-bit bundle arm64]
./unpacked-wheel-11.0-universal2/native_engine.cpython-311-darwin.so (for architecture x86_64):	Mach-O 64-bit bundle x86_64
./unpacked-wheel-11.0-universal2/native_engine.cpython-311-darwin.so (for architecture arm64):	Mach-O 64-bit bundle arm64

./unpacked-wheel-10.12-universal2/pants/engine/internals/native_engine.so: Mach-O 64-bit dynamically linked shared library x86_64

./unpacked-wheel-10.12-universal2/native_engine.cpython-311-darwin.so:     Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit bundle x86_64] [arm64:Mach-O 64-bit bundle arm64]
./unpacked-wheel-10.12-universal2/native_engine.cpython-311-darwin.so (for architecture x86_64):	Mach-O 64-bit bundle x86_64
./unpacked-wheel-10.12-universal2/native_engine.cpython-311-darwin.so (for architecture arm64):	Mach-O 64-bit bundle arm64
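
The same check that `file` performs above can be approximated in a few lines of Python, which may be handy for verifying wheels in CI. This is a minimal sketch, not part of Pants: the function name `macho_architectures` is hypothetical, and it only recognises the two CPU types relevant here (x86_64 and arm64) in 64-bit thin or universal headers.

```python
import struct

# CPU type constants from <mach/machine.h>; only the two relevant here.
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}

FAT_MAGIC = 0xCAFEBABE    # universal ("fat") header, stored big-endian
MH_MAGIC_64 = 0xFEEDFACF  # thin 64-bit header, little-endian on x86_64/arm64


def macho_architectures(data: bytes) -> list[str]:
    """Return the architectures named in the Mach-O header bytes `data`."""
    if len(data) < 8:
        raise ValueError("too short to be a Mach-O file")
    (magic_be,) = struct.unpack(">I", data[:4])
    if magic_be == FAT_MAGIC:
        (count,) = struct.unpack(">I", data[4:8])
        archs = []
        for i in range(count):
            # Each fat_arch entry is five big-endian uint32 fields; cputype is first.
            (cputype,) = struct.unpack_from(">I", data, 8 + 20 * i)
            archs.append(CPU_TYPES.get(cputype, hex(cputype)))
        return archs
    magic_le, cputype = struct.unpack("<II", data[:8])
    if magic_le == MH_MAGIC_64:
        return [CPU_TYPES.get(cputype, hex(cputype))]
    raise ValueError("not a 64-bit or universal Mach-O file")
```

Reading the first few kilobytes of a file is enough, e.g. `macho_architectures(open(path, "rb").read(4096))`. A wheel tagged universal2 whose native_engine.so yields a single architecture is the mismatch described in this issue.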

The full error that I saw when trying to run my plugin's test was:

_ ERROR collecting src/python/shoalsoft/FOO/BAR_test.py _
ImportError while importing test module 'src/python/shoalsoft/FOO/BAR_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/Users/ME/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/_pytest/python.py:493: in importtestmodule
    mod = import_path(
/Users/ME/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/_pytest/pathlib.py:587: in import_path
    importlib.import_module(module_name)
/Users/ME/.pyenv/versions/3.11.11/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1204: in _gcd_import
    ???
<frozen importlib._bootstrap>:1176: in _find_and_load
    ???
<frozen importlib._bootstrap>:1147: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:690: in _load_unlocked
    ???
/Users/ME/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/_pytest/assertion/rewrite.py:184: in exec_module
    exec(co, module.__dict__)
src/python/shoalsoft/FOO/BAR_test.py:17: in <module>
    from pants.engine.rules import QueryRule
/Users/ME/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/pants/engine/rules.py:32: in <module>
    from pants.engine.engine_aware import SideEffecting
/Users/ME/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/pants/engine/engine_aware.py:9: in <module>
    from pants.engine.internals import native_engine
E   ImportError: dlopen(/Users/ME/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/pants/engine/internals/native_engine.so, 0x0002): tried: '/Users/ME/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/pants/engine/internals/native_engine.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64h' or 'x86_64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/tdyas/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/pants/engine/internals/native_engine.so' (no such file), '/Users/tdyas/.cache/pants/named_caches/pex_root/venvs/1/s/79b3222f/venv/lib/python3.11/site-packages/pants/engine/internals/native_engine.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64h' or 'x86_64')), '/Users/tdyas/.cache/pants/named_caches/pex_root/venvs/1/9d4bc213af4e92b06fa120abcea29d2f03f0eeef/2982e4bb26bb389cf345ae6b2c01416ccd8d4126/lib/python3.11/site-packages/pants/engine/internals/native_engine.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64h' or 'x86_64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/tdyas/.cache/pants/named_caches/pex_root/venvs/1/9d4bc213af4e92b06fa120abcea29d2f03f0eeef/2982e4bb26bb389cf345ae6b2c01416ccd8d4126/lib/python3.11/site-packages/pants/engine/internals/native_engine.so' (no such file), '/Users/tdyas/.cache/pants/named_caches/pex_root/venvs/1/9d4bc213af4e92b06fa120abcea29d2f03f0eeef/2982e4bb26bb389cf345ae6b2c01416ccd8d4126/lib/python3.11/site-packages/pants/engine/internals/native_engine.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64h' or 'x86_64'))
- generated xml file: src.python.shoalsoft.FOO.BAR_test.py.tests@@parametrize=pants-2.25.x.xml -
=========================== short test summary info ============================
ERROR src/python/shoalsoft/FOO/BAR_test.py
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
=============================== 1 error in 0.25s ===============================
@tdyas tdyas added this to the 2.25.x milestone Feb 9, 2025
@huonw
Contributor

huonw commented Feb 9, 2025

Weird symptoms! Can you try "bisecting" on the various 2.25 dev releases to see if we can narrow down when it changed?

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

@huonw: The change seems to have been in play since at least 2.25.0.dev0, with your changes to the release process in #21655. The published artifacts only switched over to the current format starting with 2.25.0.dev1, though.

2.25.0.dev0 (universal2 and arm64):

  • pantsbuild.pants-2.25.0.dev0-cp311-cp311-macosx_13_0_universal2.whl
  • pantsbuild.pants-2.25.0.dev0-cp311-cp311-macosx_14_0_arm64.whl

2.25.0.dev1 (universal2 with 10.12):

  • pantsbuild.pants-2.25.0.dev1-cp311-cp311-macosx_10_12_universal2.whl
  • pantsbuild.pants-2.25.0.dev1-cp311-cp311-macosx_11_0_universal2.whl

2.25.0.dev2 (same):

  • pantsbuild.pants-2.25.0.dev2-cp311-cp311-macosx_10_12_universal2.whl
  • pantsbuild.pants-2.25.0.dev2-cp311-cp311-macosx_11_0_universal2.whl

2.25.0.dev3 (same):

  • pantsbuild.pants-2.25.0.dev3-cp311-cp311-macosx_10_12_universal2.whl
  • pantsbuild.pants-2.25.0.dev3-cp311-cp311-macosx_11_0_universal2.whl

2.25.0.dev4 (same):

  • pantsbuild.pants-2.25.0.dev4-cp311-cp311-macosx_10_12_universal2.whl
  • pantsbuild.pants-2.25.0.dev4-cp311-cp311-macosx_11_0_universal2.whl

2.25.0.dev5 (same):

  • pantsbuild.pants-2.25.0.dev5-cp311-cp311-macosx_10_12_universal2.whl
  • pantsbuild.pants-2.25.0.dev5-cp311-cp311-macosx_11_0_universal2.whl

@huonw
Contributor

huonw commented Feb 9, 2025

Ah, thanks. I guess the problem here is the x86-64 build is installing the apparently-compatible ...-macosx_11_0_universal2.whl wheel, which isn't actually compatible.

I agree that #21655 seems like a good theory for the PR that caused it to change, although the exact mechanism isn't obvious to me yet.

It's also suspicious to me that the arm64 wheel name didn't change in dev0, but did change in dev1. That seems like a relevant clue 🤔
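
To illustrate why an x86-64 machine would pick up the arm64-only wheel, here is a rough sketch of platform tag matching. It is a deliberate simplification of what installers do, not pip's actual code: the function names are hypothetical, only the `x86_64`, `arm64` and `universal2` formats are modelled, and the "prefer the newest minimum macOS version" ordering is an assumption.

```python
def parse_macos_tag(platform_tag: str) -> tuple[tuple[int, int], str]:
    """Split e.g. 'macosx_11_0_universal2' into ((11, 0), 'universal2')."""
    prefix, major, minor, arch = platform_tag.split("_", 3)
    if prefix != "macosx":
        raise ValueError(f"not a macOS platform tag: {platform_tag}")
    return (int(major), int(minor)), arch


def installer_accepts(
    platform_tag: str, host_arch: str, host_version: tuple[int, int]
) -> bool:
    """Simplified check: would a host of this arch/version accept the tag?"""
    version, arch = parse_macos_tag(platform_tag)
    # A universal2 tag claims to work on both x86_64 and arm64.
    return version <= host_version and arch in (host_arch, "universal2")


# The two platform tags published for 2.25.0a0, seen from an x86-64 host on macOS 15.3.
published = ["macosx_10_12_universal2", "macosx_11_0_universal2"]
accepted = [t for t in published if installer_accepts(t, "x86_64", (15, 3))]
# Assumption: among accepted tags, the newest minimum macOS version wins.
chosen = max(accepted, key=lambda t: parse_macos_tag(t)[0])
print(chosen)
```

Under these assumptions both wheels look installable on x86-64, and the 11.0 one (whose native_engine.so is arm64-only) is preferred. With the 2.24.1 tags, the arm64 wheel would be rejected outright on that host.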

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

% git log release_2.25.0.dev0..release_2.25.0.dev1 -- src/python/pants_release
commit e2e0b0cc2523ae3996a7e892514a27ac1579e11e
Author: Benjy Weinberger <[email protected]>
Date:   Wed Dec 11 18:20:37 2024 -0800

    Install Python headers in the ARM64 wheel-building container. (#21749)

    This fixes a CI error due to our switch to Python 3.11.

The only commit modifying the release process does not seem relevant?

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

External reason: Did the set of runners change in that time frame? Or maybe a runner stalled and another (older macOS) runner ran the release job?

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

Also, maybe there were tooling changes on the newer macOS runners, which now default to generating universal2 wheels?

@huonw
Contributor

huonw commented Feb 9, 2025

dev0 build runner image info: https://github.com/pantsbuild/pants/actions/runs/11941283628/job/33285786191#step:1:9

  Version: 20241022.361
  Included Software: https://github.com/actions/runner-images/blob/macos-14-arm64/20241022.361/images/macos/macos-14-arm64-Readme.md

dev1 build runner image info: https://github.com/pantsbuild/pants/actions/runs/12289144169/job/34294166127#step:1:9

  Version: 20241202.580
  Included Software: https://github.com/actions/runner-images/blob/macos-14-arm64/20241202.580/images/macos/macos-14-arm64-Readme.md

Diffing the list of software shows maybe these are relevant: actions/runner-images@macos-14-arm64/20241022.361...macos-14-arm64/20241202.580#diff-5c04a529d3c8adf7a5f23afe544071dad1853e281c9c7b44cd8d626b6c57444d

| package | dev0 runner | dev1 runner |
| --- | --- | --- |
| clang/llvm | 14.0.3 | 15.0.0 |
| Apple macOS SDKs | 13.3 -> 15.1 | 14.0 -> 15.1 |

> Install Python headers in the ARM64 wheel-building container. (#21749)

There's a chance this might be relevant? But yeah, doesn't seem obviously related.

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

My naive thought is that the Xcode version probably matters here, although the diff you pasted is a bit ambiguous. What does 13.3 -> 15.1 mean? Was the dev0 runner using Xcode 13.3 or Xcode 15.1?

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

Or is that an SDK version number and not Xcode version?

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

Also, maybe we need to pin the x86_64 / arm64 architecture explicitly when building on macOS? (To avoid universal2 wheels appearing.)

@huonw
Contributor

huonw commented Feb 9, 2025

> My naive thought is that the Xcode version probably matters here, although the diff you pasted is a bit ambiguous. What does 13.3 -> 15.1 mean? Was the dev0 runner using Xcode 13.3 or Xcode 15.1?
>
> Or is that an SDK version number and not Xcode version?

Sorry for the ambiguity (it looks like that "perma" link for the diff isn't very useful), it's the macOS version number from this table: https://github.com/actions/runner-images/blob/macos-14-arm64/20241202.580/images/macos/macos-14-arm64-Readme.md#installed-sdks (that's the version for the dev1 runner).

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

https://github.com/pypa/cibuildwheel/blob/3805787fe7a0476391541d834fa548a721f0ab2e/cibuildwheel/macos.py#L318-L331 is an interesting read. I wonder if the options set in there are applicable to this situation.

@huonw
Contributor

huonw commented Feb 9, 2025

Collecting more clues: when I run pants package src/python/pants: locally, I get pantsbuild.pants-2.26.0.dev0-cp311-cp311-macosx_14_0_arm64.whl, as desired.

This is on an arm64 machine running macOS 15.2, with clang 15.0.0 by default, and an install of Xcode 15.4. (That install appears to be broken because it's too old; I need to upgrade to Xcode 16.2. I guess I've got a more up-to-date version of the command-line tools installed. I dunno, I've forgotten how these details fit together.)

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

On my x86-64 MacBook Pro (macOS 15.3 / Xcode 16.2), running ./pants package src/python/pants: produces dist/pantsbuild.pants-2.26.0.dev0-cp311-cp311-macosx_15_0_x86_64.whl.

@tdyas
Contributor Author

tdyas commented Feb 9, 2025

Interestingly, cibuildwheel makes use of https://pypi.org/project/delocate/ to fix up paths and stuff in Python macOS wheels.

@huonw
Contributor

huonw commented Feb 10, 2025

I've started some experiments in #21941 for running things within the real cloud environment.

@huonw
Contributor

huonw commented Feb 10, 2025

Summary:

Investigation:

  1. the new CI runners have a 'single' universal Python build:
    1. file ... on the Python binary says, on both x86-64 and arm64:
      /Users/runner/hostedtoolcache/Python/3.11.9/x64/bin/python: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64:Mach-O 64-bit executable arm64]
      /Users/runner/hostedtoolcache/Python/3.11.9/x64/bin/python (for architecture x86_64):	Mach-O 64-bit executable x86_64
      /Users/runner/hostedtoolcache/Python/3.11.9/x64/bin/python (for architecture arm64):	Mach-O 64-bit executable arm64
      
    2. python -c 'import sysconfig; print(sysconfig.get_platform())' says macosx-10.9-universal2 on both
  2. the old CI runners have non-universal Python builds, specialised to each architecture:
    1. file ... on the Python binary says, respectively:
      /Users/gha/.pyenv/versions/3.9.13/bin/python3.9: Mach-O 64-bit executable x86_64
      
      /Users/gha/.pyenv/versions/3.9.13/bin/python3.9: Mach-O 64-bit executable arm64
      
    2. python -c 'import sysconfig; print(sysconfig.get_platform())' says macosx-10.15-x86_64 and macosx-11.6-arm64 respectively
  3. setuptools/setup.py bdist_wheel tags the wheel based on the sysconfig.get_platform output:
    1. bdist_wheel run calls get_tag: https://github.com/pypa/setuptools/blob/a3cc40f4b8fc90e8f57e3a59a54980fb1f408207/setuptools/command/bdist_wheel.py#L406-L407
    2. ... calls get_platform: https://github.com/pypa/setuptools/blob/a3cc40f4b8fc90e8f57e3a59a54980fb1f408207/setuptools/command/bdist_wheel.py#L320
    3. ... calls sysconfig.get_platform and calculate_macos_platform_tag: https://github.com/pypa/setuptools/blob/a3cc40f4b8fc90e8f57e3a59a54980fb1f408207/setuptools/command/bdist_wheel.py#L60-L64
    4. ... calculate_macos_platform_tag just adjusts the version part of the sysconfig.get_platform result, leaving the macosx prefix and arm64/universal2 (etc.) suffix: https://github.com/pypa/setuptools/blob/a3cc40f4b8fc90e8f57e3a59a54980fb1f408207/setuptools/_vendor/wheel/macosx_libfile.py#L468
  4. sysconfig.get_platform reads _PYTHON_HOST_PLATFORM "... for cross builds explicitly": https://github.com/python/cpython/blob/5505b91a684b0fc7ffcb3a5b325302671d74fb15/Lib/sysconfig.py#L652-L654

Thus I think the key change that broke this is 1 vs. 2, changing the type of build of the Python interpreter, and that flowed through into wheel tagging. We can control this via _PYTHON_HOST_PLATFORM like cibuildwheel, just need to set it in the right places.
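
The override in step 4 can be observed directly. The sketch below is illustrative rather than part of the actual fix: the helper name `platform_with_override` is made up, and it relies on CPython's behaviour on POSIX systems (including macOS), where `sysconfig.get_platform()` returns `_PYTHON_HOST_PLATFORM` verbatim if it is set.

```python
import os
import subprocess
import sys


def platform_with_override(value: str) -> str:
    """Run a child interpreter with _PYTHON_HOST_PLATFORM set and report its platform."""
    env = dict(os.environ, _PYTHON_HOST_PLATFORM=value)
    result = subprocess.run(
        [sys.executable, "-c", "import sysconfig; print(sysconfig.get_platform())"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # On a universal2 interpreter this would otherwise report macosx-10.9-universal2.
    print(platform_with_override("macosx-11.0-arm64"))
```

Because the variable is read from the environment of the process that runs bdist_wheel, it has to be set for that build subprocess, not just for the outer invocation.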

@tdyas
Contributor Author

tdyas commented Feb 10, 2025

> Thus I think the key change that broke this is 1 vs. 2, changing the type of build of the Python interpreter, and that flowed through into wheel tagging. We can control this via _PYTHON_HOST_PLATFORM like cibuildwheel, just need to set it in the right places.

Awesome investigation!

@huonw huonw closed this as completed in 2147e17 Feb 11, 2025
huonw added a commit that referenced this issue Feb 11, 2025
…21942) (#21944)

This follows the lead of `cibuildwheel` to explicitly correctly tag our
wheels on macOS, via the `_PYTHON_HOST_PLATFORM` environment variable.
This change ensures that `pip` understands which platform each wheel
is built for and compatible with, and thus external plugins using
`pants_requirements(...)` work again.

It appears the file name (including platform) tags of a wheel are
derived from `sysconfig.get_platform()`, i.e. based on the build
configuration of the current Python interpreter. Pants builds single
platform wheels, so the wheels need to be tagged with the right CPU
architecture.

In #21655, we changed from tagging correctly to tagging incorrectly:

- Before that change, our self-hosted runners had 'simple' Python
interpreters, built for the current platform via Pyenv. `python -c
'import sysconfig; print(sysconfig.get_platform())'` reports
`macosx-10.15-x86_64` and `macosx-11.6-arm64`, and indeed the wheels
were tagged like that.
- After that change, we started using Python interpreters provided by
GitHub-hosted runners. The interpreters report `macosx-10.9-universal2`
on both the ARM64 and x86-64 runners. That is, they're built as
universal binaries that can run on either architecture. This led to the
wheels being tagged as `universal2` as well.

The `_PYTHON_HOST_PLATFORM` environment variable overrides the default
`sysconfig.get_platform()` value, and appears to be designed for this
sort of purpose (and is used by `cibuildwheel` for such):

-
https://github.com/python/cpython/blob/5505b91a684b0fc7ffcb3a5b325302671d74fb15/Lib/sysconfig.py#L652-L654
-
https://github.com/pypa/cibuildwheel/blob/3805787fe7a0476391541d834fa548a721f0ab2e/cibuildwheel/macos.py#L318-L331

Fixes #21938

Co-authored-by: Huon Wilson <[email protected]>
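
As a rough illustration of the approach in the commit message above, the following sketch builds an architecture-specific value for `_PYTHON_HOST_PLATFORM` and shows the platform tag it would map to in a wheel file name. Both function names are hypothetical and this is not the code from #21942. It also ignores the version adjustment that `calculate_macos_platform_tag` applies, so the version in a real wheel name may differ.

```python
def host_platform_value(arch: str, deployment_target: str) -> str:
    """Build a sysconfig-style platform string such as 'macosx-11.0-arm64'."""
    if arch not in ("x86_64", "arm64"):
        raise ValueError(f"unexpected macOS architecture: {arch}")
    return f"macosx-{deployment_target}-{arch}"


def wheel_platform_tag(platform_value: str) -> str:
    """Spell a platform string the way wheel file names do (no '-' or '.')."""
    return platform_value.replace("-", "_").replace(".", "_")


if __name__ == "__main__":
    value = host_platform_value("arm64", "11.0")
    print(value)                      # candidate for _PYTHON_HOST_PLATFORM
    print(wheel_platform_tag(value))  # platform part of the wheel file name
```

On a real runner the architecture argument could come from `platform.machine()`, which reports `x86_64` or `arm64` on macOS; the deployment target is a choice left to the release process.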
@huonw
Contributor

huonw commented Feb 11, 2025

https://github.com/pantsbuild/pants/releases/tag/release_2.26.0.dev1 looks to have correct wheels:

  • pantsbuild.pants-2.26.0.dev1-cp311-cp311-macosx_13_0_x86_64.whl
  • pantsbuild.pants-2.26.0.dev1-cp311-cp311-macosx_14_0_arm64.whl

Yay. Thanks for finding the cibuildwheel/_PYTHON_HOST_PLATFORM tidbit!
