Add virtualenv seeder plugin to install build #435

Closed
wants to merge 1 commit into from

Collaborator

@tiran tiran commented Sep 19, 2024

Implements a seeder plugin that extends the seed function of virtualenv. The plugin allows us to seed the build package into a new virtual env and work around missing setuptools and wheel commands in Python 3.12+ virtual envs.

The seeder plugin uses bundled wheel files just like virtualenv and ensurepip.

Related: #126

sys.executable,
"-m",
"virtualenv",
"--download",
Member

Do we always want to use --download? Downstream we're relying on the system package for virtualenv and trying to only install tools from our build server for secure builds.

Collaborator Author

The virtualenv package bundles wheels in its distribution. To avoid downloads entirely, we would have to include a build wheel in the Fromager wheel. Internally, virtualenv fetches wheels with pip download, so we can redirect the downloads to our mirror by setting an environment variable.

$ find virtualenv -name '*.whl'
virtualenv/seed/wheels/embed/pip-24.2-py3-none-any.whl
virtualenv/seed/wheels/embed/setuptools-68.0.0-py3-none-any.whl
virtualenv/seed/wheels/embed/pip-24.0-py3-none-any.whl
virtualenv/seed/wheels/embed/wheel-0.42.0-py3-none-any.whl
virtualenv/seed/wheels/embed/wheel-0.44.0-py3-none-any.whl
virtualenv/seed/wheels/embed/setuptools-74.1.2-py3-none-any.whl
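The mirror redirect described above could be sketched as follows. This is a minimal sketch, not the code from this PR: the comment does not name the environment variable, so `PIP_INDEX_URL` (which pip reads as the default for `--index-url`) is an assumption, and `virtualenv_command` and the mirror URL are hypothetical names for illustration.

```python
import os
import sys


def virtualenv_command(env_dir, mirror_url, base_env=None):
    """Return (argv, env) for creating env_dir with seed wheels from mirror_url.

    Hypothetical helper: nothing is executed here, the caller runs the command.
    """
    # Same argument list as the reviewed snippet, plus the target directory.
    argv = [sys.executable, "-m", "virtualenv", "--download", str(env_dir)]
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: PIP_INDEX_URL is the variable meant in the comment above.
    env["PIP_INDEX_URL"] = mirror_url
    return argv, env
```

The pair can then be handed to `subprocess.run(argv, env=env, check=True)`. Returning the command instead of running it keeps the helper easy to inspect in a secure-build pipeline.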

Member

OK, if we can redirect the downloads we should do that downstream and document how upstream.

I'm not sure how this implementation is technically better than just creating the environment and installing build into it ourselves like we do with the other tools. Why do it this way with the plugin?

Collaborator Author

Virtualenv uses some tricks to make installation of seed packages faster. I have created #436.

Collaborator Author

After more experimentation, I came to the conclusion that it is easier to use the same bundling approach as CPython's ensurepip and virtualenv. They store wheel files in git and ship them to their users.

The wheel files for build, packaging, and pyproject_hooks are small and rarely change. Each project has 1-3 releases per year.

Collaborator

For setuptools and wheel, are we going to end up relying on the wheels shipped by virtualenv instead of the ones we built?

Collaborator Author

We have always relied on the setuptools and wheel packages from virtualenv for the initial seeding of a virtualenv.

  • virtualenv with Python <3.12 seeds a venv with pip, setuptools, and wheel
  • virtualenv with Python >=3.12 seeds a venv with pip only
  • venv (from CPython) with Python <3.12 seeds a venv with pip and setuptools
  • venv with Python >=3.12 seeds a venv with pip only
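The seeding matrix in the list above can be written down as a small lookup. This is only an illustration of the four cases as stated in this comment; `seeded_packages` is a hypothetical helper, not part of virtualenv, venv, or fromager.

```python
def seeded_packages(tool, python_version):
    """Return the packages seeded into a fresh environment.

    tool is "virtualenv" or "venv"; python_version is a (major, minor) tuple.
    Encodes only the four cases listed in the comment above.
    """
    if tool not in ("virtualenv", "venv"):
        raise ValueError(f"unknown tool: {tool!r}")
    packages = {"pip"}  # pip is seeded in every case
    if python_version < (3, 12):
        packages.add("setuptools")
        if tool == "virtualenv":
            # Only virtualenv also seeds wheel on older Pythons.
            packages.add("wheel")
    return packages
```

For Python 3.12 and later both tools leave only pip behind, which is the gap the seeder plugin was meant to cover.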

Implements a seeder plugin that extends the seed function of virtualenv.
The plugin allows us to seed the `build` package into a new virtual env
and work around missing `setuptools` and `wheel` commands in Python
3.12+ virtual envs.

The seeder plugin uses bundled wheel files just like `virtualenv` and
`ensurepip`.

Related: python-wheel-build#126
Signed-off-by: Christian Heimes <[email protected]>
@shubhbapna
Collaborator

An alternative idea that avoids relying on wheels shipped by fromager or virtualenv:

The docs suggest that for customizing embedded wheels we can patch the module virtualenv.seed.wheels.embed, making sure to provide the function get_embed_wheel (which returns the wheel to use given a distribution/python version).

So we have two possible options for getting the wheels for the packages we want seeded:

  • Define a new option, something like --virtualenv-embed-wheels-index, through which the user can pass a private PyPI index from which we download build, packaging, wheel, setuptools, and pyproject_hooks

or

  • As @tiran suggested here, fromager would do a special bootstrap for the packages we want seeded and build those wheels first.

We can probably combine both: the first time, the special bootstrap runs to build the seed wheels and they are uploaded to the private index. Subsequent bootstraps can then use the private index directly without having to build them.

Then once we have the wheels we can:

  1. Patch the get_embed_wheel function so that it returns the path to our downloaded/built wheels.
  2. Since we set this attr for the packages we want seeded here, virtualenv should pick these packages up here and pass them to get_embed_wheel here, which will return the path of the wheel we downloaded.

Moreover, instead of hardcoding the packages we want seeded, we can make them an option in the global settings and use that.
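The patching idea could look roughly like the sketch below. It is kept independent of virtualenv so it can be read and tested on its own: `make_get_embed_wheel`, `overrides`, `fallback`, and `wheel_from_path` are hypothetical names, and the `(distribution, for_py_version)` signature is an assumption based on the description of `get_embed_wheel` above.

```python
from pathlib import Path


def make_get_embed_wheel(overrides, fallback, wheel_from_path):
    """Build a replacement for get_embed_wheel.

    overrides: mapping of distribution name to a wheel file we built or downloaded.
    fallback: the original lookup, used for distributions we do not override.
    wheel_from_path: callable that turns a path into whatever object the
        caller expects (an assumption; virtualenv has its own wheel type).
    """
    overrides = {name: Path(path) for name, path in overrides.items()}

    def get_embed_wheel(distribution, for_py_version):
        path = overrides.get(distribution)
        if path is not None:
            # Serve our own wheel instead of the bundled one.
            return wheel_from_path(path)
        # Anything else still comes from the original lookup.
        return fallback(distribution, for_py_version)

    return get_embed_wheel
```

Wiring this into `virtualenv.seed.wheels.embed` is left out on purpose because it is untested here. Code that imported the original function by name before the patch was applied would not see the replacement, so the patch would have to happen early.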

@dhellmann
Member

We have a step in our downstream processes where we copy fromager releases and other tools into the tool index. If we use RHEL system packages to download and build those wheels, we limit the exposure we have and can rely on the existing security of the RHEL packages.

We can use fromager to do the bootstrap and give us a build order file, then build those wheels ourselves without fromager one time using a trusted tool chain. At that point we have a version of fromager we trust, and we can use it to build the next version of fromager and any other tools we need.

Then when we run fromager to build product wheels, we can have it pull in trusted tool wheels.

@shubhbapna
Collaborator

> We can use fromager to do the bootstrap and give us a build order file, then build those wheels ourselves without fromager one time using a trusted tool chain. At that point we have a version of fromager we trust, and we can use it to build the next version of fromager and any other tools we need.

Oh, is this an alternative to the special bootstrap for the seed packages?

@tiran
Collaborator Author

tiran commented Sep 20, 2024

At a bare minimum, we have to rely on the pip seed package from virtualenv or venv. Otherwise, we have no way to install anything in the virtual env.

@tiran tiran marked this pull request as draft September 21, 2024 10:56
@tiran tiran requested a review from shubhbapna September 23, 2024 07:37
@tiran
Collaborator Author

tiran commented Oct 15, 2024

We no longer need the seeder plugin. Instead I'm going to install build in the InstructLab env and use its Python API to run build with a different interpreter.
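Running build against a different interpreter through its Python API might look like the sketch below. It assumes the `build` package is installed in the calling environment and uses its `ProjectBuilder` class. `build_wheel_with` is a hypothetical helper, not the code that replaced this PR, and it assumes the target interpreter's environment already has the project's build requirements, since nothing here installs them.

```python
def build_wheel_with(python_executable, source_dir, output_dir):
    """Build a wheel for source_dir using another interpreter.

    Hypothetical helper; returns the path of the built wheel.
    """
    # Imported lazily so this module can be loaded where build is absent.
    from build import ProjectBuilder

    # The backend hooks run under python_executable, not under the
    # interpreter that executes this function.
    builder = ProjectBuilder(source_dir, python_executable=python_executable)
    return builder.build("wheel", output_dir)
```

With this shape, `build` lives once in the outer environment and each new virtual env only supplies the interpreter and the build backend.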

@tiran tiran closed this Oct 15, 2024