feat: uv lock rule instead of genrule #2657

Open
wants to merge 24 commits into
base: main

aignas
Collaborator

@aignas aignas commented Mar 11, 2025

This change implements uv pip compile as a rule.

In order to also make things easier to debug we provide
a runnable rule that has the same arguments and updates
the source tree output file automatically.

The main design is a regular lock rule that returns a custom
provider carrying all of the recipe ingredients needed to
construct an executable rule. Running the debugging rule target
depends on having bash, or bat files on Windows.
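
A minimal sketch of that provider hand-off, for readers less familiar with the pattern. Everything named here (LockInfo, the locker attribute, the generated script) is hypothetical and is not the code in this PR; the locker command line is only illustrative.

```starlark
# Hypothetical names throughout; this only illustrates the shape of the design.
LockInfo = provider(
    doc = "Ingredients needed to re-run the locker outside of a build action.",
    fields = {
        "args": "list[str]: arguments passed to the locker",
        "output": "File: lock file produced by the build action",
        "srcs": "depset[File]: files the locker needs at run time",
    },
)

def _lock_impl(ctx):
    output = ctx.actions.declare_file(ctx.label.name + ".txt")
    args = [src.path for src in ctx.files.srcs]

    # The build action: run the locker and produce the lock file.
    ctx.actions.run(
        executable = ctx.executable.locker,
        arguments = args + ["--output-file", output.path],
        inputs = ctx.files.srcs,
        outputs = [output],
    )
    return [
        DefaultInfo(files = depset([output])),
        # Hand the same recipe to whichever rule wants to build a runnable target.
        LockInfo(args = args, output = output, srcs = depset(ctx.files.srcs)),
    ]

lock = rule(
    implementation = _lock_impl,
    attrs = {
        "locker": attr.label(executable = True, cfg = "exec", mandatory = True),
        "srcs": attr.label_list(allow_files = True),
    },
)

def _lock_run_impl(ctx):
    info = ctx.attr.lock[LockInfo]
    executable = ctx.actions.declare_file(ctx.label.name + ".sh")

    # Placeholder script: a real one would invoke the locker with info.args.
    ctx.actions.write(
        output = executable,
        content = "#!/usr/bin/env bash\necho would run the locker with: {}\n".format(
            " ".join(info.args),
        ),
        is_executable = True,
    )
    return [DefaultInfo(
        executable = executable,
        runfiles = ctx.runfiles(transitive_files = info.srcs),
    )]

lock_run = rule(
    implementation = _lock_run_impl,
    attrs = {"lock": attr.label(providers = [LockInfo], mandatory = True)},
    executable = True,
)
```

The point of the split is that the runnable rule never re-derives the arguments; it only consumes what the lock rule already computed.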

There are integration tests that exercise the locker. However,
things that are untested:

  • Windows support - current CI Windows runners do not support
    running the uv binary. Need help from some Windows users.
  • Running the integration tests within RBE does not seem to work,
    but locking when using RBE still works; there is a native_test
    exercising this.
  • Supporting keyring integration to pull packages from private
    index servers. https://docs.astral.sh/uv/configuration/authentication/
    should be supported.

Work towards #1325
Work towards #1975
Related #2663

Collaborator Author

@aignas aignas left a comment

OK, after doing a self-review I think I want to have:

  • A script for launching uv from bash that would be compatible with UNIX.
  • A script for launching uv from powershell that would be compatible with Windows (generative AI may help with initial translation here).
  • Rewrite the internals a little to drop the Python usage (or at least most of it).

@aignas aignas force-pushed the uv-lock-rule-instead-of-genrule branch from fcc17c3 to 7ddfd28 Compare March 13, 2025 13:52
@aignas aignas changed the title uv lock rule instead of genrule feat: uv lock rule instead of genrule Mar 13, 2025
@aignas aignas force-pushed the uv-lock-rule-instead-of-genrule branch from 928f1e2 to 2c29ae2 Compare March 13, 2025 15:23
@aignas aignas marked this pull request as ready for review March 13, 2025 15:23
@aignas aignas requested a review from rickeylev as a code owner March 13, 2025 15:23
@bazel-contrib bazel-contrib deleted a comment from aignas Mar 13, 2025
@rickeylev
Collaborator

The core of the implementation looks good to me (a rule that looks up toolchains to run a build action; lock_run uses the info that lock provides via a provider). I have a variety of smaller comments about some particulars, but gtg now, so I'll have to wait until, ah, probably the weekend sometime.

The maybe_out behavior is interesting/clever, I just noticed that and will take a closer look.

Collaborator

@rickeylev rickeylev left a comment

got about half way through

Collaborator

@rickeylev rickeylev left a comment

Also, to clarify, this is the sort of interface I imagined:

  1. bazel build //:requirements generates a requirements file using a build action
  2. bazel run //:requirements.update generates a requirements file using a build action and copies it into the local client to update the requirements file
  3. bazel run //:requirements.run -- <args> more of a direct invocation; runs uv directly; this is for debugging or experimenting with settings without having to modify the BUILD file.

The reason for (2) to use the output of (1) is that this is where the magic of bazel happens. By this I mean: build actions have isolated/deterministic/hermetic capabilities, can run remotely, are more amenable to having transitions applied, and are better about not producing different output per user machine.
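
To make (2) concrete, here is a rough sketch of an update rule that reuses the build action's output and only copies it back into the source tree. The rule name, its attributes, and the generated script are assumptions for illustration; BUILD_WORKSPACE_DIRECTORY is the environment variable that bazel run sets to the workspace root.

```starlark
def _lock_update_impl(ctx):
    lock_file = ctx.file.lock  # output of the build action from (1)
    script = ctx.actions.declare_file(ctx.label.name + ".sh")
    ctx.actions.write(
        output = script,
        content = "\n".join([
            "#!/usr/bin/env bash",
            "set -euo pipefail",
            # bazel run starts the script in the runfiles tree, so the
            # short_path of the lock file resolves relative to the cwd.
            'cp -f "{src}" "$BUILD_WORKSPACE_DIRECTORY/{dst}"'.format(
                src = lock_file.short_path,
                dst = ctx.attr.out,
            ),
            "",
        ]),
        is_executable = True,
    )
    return [DefaultInfo(
        executable = script,
        runfiles = ctx.runfiles(files = [lock_file]),
    )]

# Hypothetical rule; `out` is a path relative to the workspace root.
lock_update = rule(
    implementation = _lock_update_impl,
    attrs = {
        "lock": attr.label(allow_single_file = True, mandatory = True),
        "out": attr.string(mandatory = True),
    },
    executable = True,
)
```

With something like this, bazel run //:requirements.update first builds //:requirements as a normal (possibly remote) action, and the only thing that happens on the local machine is the copy.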

@aignas
Collaborator Author

aignas commented Mar 16, 2025

OK, addressed the comments, PTAL.

I see that Windows is failing, because something is not compatible with the CI Windows version. I wonder if this means we should just tell users that Windows may be unsupported?

aignas added 2 commits March 18, 2025 09:25
This change implements the uv pip compile as a rule.

In order to also make things easier to debug we provide
a runnable rule that has the same arguments and updates
the source tree output file automatically.

The main design is to have a regular lock rule and then
it returns a custom provider that has all of the recipe
ingredients to construct an executable rule. The execution
depends on having bash or powershell, however the
powershell script is not yet complete and requires some
help from the community.

Work towards bazel-contrib#1975.
add the upgrade flag back
@aignas aignas force-pushed the uv-lock-rule-instead-of-genrule branch from daab523 to c67cdcd Compare March 18, 2025 09:23
@aignas
Collaborator Author

aignas commented Mar 19, 2025

Open questions:

  • Currently, keyring and other authentication mechanisms have to be provided via the system, and this has not been fully tested. This might be relevant for #2663 (compile_pip_requirements does not use credential helper), where users need to be able to configure credential helpers for pulling packages from internal mirrors.
  • Should we use python_toolchain or a target pointing to the current python interpreter? Passing //python:none could act as a way to indicate that python from the system, i.e. python, is needed. This would mean that we could use //python/bin:interpreter, which could include dependencies used during locking (e.g. keyring and similar). Right now I am not sure how to get that wired through. Maybe we should have a few more tests that set up a mirror that needs the keyring dep to connect; otherwise it is hard not to break the behaviour. Still, I would like to merge this as is for now, because the PR is getting big and it is hard to keep track of everything.

@rickeylev
Collaborator

Sorry, some short notice deadlines came upon me. I'll be able to have another look Weds evening or after

@rickeylev
Collaborator

Should we use python_toolchain or a target to the current python interpreter?
Passing an interpreter means additional deps (keyring) can be captured

Since uv is coming from a toolchain, and we need python to match uv, I think both should come from a toolchain. A behavior unique to toolchains is that a group of toolchains can get resolved with the same config state. Getting the same behavior using labels, or a mix of a label and toolchain, might be tricky (it might be possible using exec groups?).
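
As a rough illustration of what "both from a toolchain" could look like: the toolchain type labels and the ToolchainInfo fields below (binary, interpreter) are placeholders, not the ones this PR or rules_python defines, and the uv flags are only illustrative.

```starlark
# Placeholder labels; substitute the real toolchain types.
_UV_TOOLCHAIN = "//example/uv:toolchain_type"
_PY_TOOLCHAIN = "//example/python:toolchain_type"

def _lock_impl(ctx):
    # Both lookups are answered by the same toolchain resolution, so the uv
    # binary and the interpreter are picked for one execution platform.
    uv = ctx.toolchains[_UV_TOOLCHAIN]
    py = ctx.toolchains[_PY_TOOLCHAIN]

    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.run(
        executable = uv.binary,  # hypothetical ToolchainInfo field
        arguments = [
            "pip",
            "compile",
            "--python",
            py.interpreter.path,  # hypothetical ToolchainInfo field
            "--output-file",
            out.path,
        ] + [src.path for src in ctx.files.srcs],
        inputs = ctx.files.srcs,
        tools = [py.interpreter],
        outputs = [out],
    )
    return [DefaultInfo(files = depset([out]))]

lock = rule(
    implementation = _lock_impl,
    attrs = {"srcs": attr.label_list(allow_files = True)},
    toolchains = [_UV_TOOLCHAIN, _PY_TOOLCHAIN],
)
```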

return [
    DefaultInfo(
        executable = executable,
        runfiles = ctx.runfiles(transitive_files = info.srcs),
Collaborator

I think there's a subtle config mismatch here: info.srcs contains uv in exec config, but here it's going to run in target config.

I'm OK with ignoring this for now, though, to keep progress moving along.

Collaborator Author

I have added a transition, but I am still unsure how to ensure that this will be in the exec configuration.

I can follow this up with a separate PR.

args.run_shell.add("--no-progress")
args.run_shell.add("--quiet")

ctx.actions.run_shell(
Collaborator

I'm having trouble understanding why this step needs the copy step as part of its execution.

Isn't this the same behavior?

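# Pseudocode: `uv` and `existing_file` are assumed to be resolved earlier.
srcs = list(ctx.files.srcs)
if existing_file:
    srcs.append(existing_file)
output = ctx.actions.declare_file(ctx.label.name + ".out")
ctx.actions.run(
    executable = uv,
    arguments = ["--output=" + output.path] + [f.path for f in srcs],
    inputs = srcs,
    outputs = [output],
)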

uv is going to overwrite whatever --output specifies, right?

Collaborator Author

uv expects the previous output to be in the location that is defined by --output={output}. If you pass it as a source then you will have extra log lines via requirements-existing.txt, which is not what you want here.
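
The copy step described in this reply could look roughly like the sketch below. The attribute names (uv, existing_output) are hypothetical and this is not the PR's actual action; it only shows seeding the output location with the previous lock file before uv runs.

```starlark
def _lock_impl(ctx):
    uv = ctx.attr.uv[DefaultInfo].files_to_run
    existing = ctx.file.existing_output  # None when the attribute is unset
    output = ctx.actions.declare_file(ctx.label.name + ".txt")

    command = '"$UV" pip compile --quiet --no-progress --output-file "$OUT" "$@"'
    env = {"UV": uv.executable.path, "OUT": output.path}
    inputs = list(ctx.files.srcs)
    if existing:
        # Seed the output location with the previous lock file so that uv
        # treats it as existing output rather than as an extra source.
        command = 'cp "$EXISTING" "$OUT" && chmod u+w "$OUT" && ' + command
        env["EXISTING"] = existing.path
        inputs.append(existing)

    args = ctx.actions.args()
    args.add_all(ctx.files.srcs)  # forwarded to the command as "$@"

    ctx.actions.run_shell(
        command = command,
        arguments = [args],
        env = env,
        inputs = inputs,
        outputs = [output],
        tools = [uv],
    )
    return [DefaultInfo(files = depset([output]))]

lock = rule(
    implementation = _lock_impl,
    attrs = {
        "existing_output": attr.label(allow_single_file = True),
        "srcs": attr.label_list(allow_files = True),
        "uv": attr.label(executable = True, cfg = "exec", mandatory = True),
    },
)
```

The chmod is there because a copy of a read-only input would otherwise be read-only as well, and uv has to overwrite it.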

Comment on lines +376 to +378
# FIXME @aignas 2025-03-17: should we have one more target that transitions
# the python_version to ensure that if somebody calls `bazel build
# :requirements` that it is locked with the right `python_version`?
Collaborator

A separate target, no. An attribute with rule-level cfg transition, yes.

This also enables tricks like this:

lock(python_version = "3.10", srcs = select({":py310": ["requirements_310.txt"], ...}))

Similarly, because an attr.output is not used, the outputs can be varied to e.g. include the python version (or platform, etc) into the output file name.

Collaborator Author

Thanks for the nudge. I have added the transition to the lock rule and I think the overall design is now simpler.

I have also created an internal expand_template rule that makes the wiring of files easier. I have added the python version to the file name, which may turn out to be either a good or a bad decision; we will see. :)
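
For reference, a rule-level transition driven by an attribute can be sketched as follows. The flag label is an assumption (it is meant to stand for the Python version build setting), the implementation body is a placeholder for the real locking action, and the file naming is just one possible scheme.

```starlark
# Assumed label of the Python version build setting; adjust as needed.
_PYTHON_VERSION_FLAG = "//python/config_settings:python_version"

def _python_version_transition_impl(settings, attr):
    # Keep the incoming value when the attribute is left empty.
    return {
        _PYTHON_VERSION_FLAG: attr.python_version or settings[_PYTHON_VERSION_FLAG],
    }

_python_version_transition = transition(
    implementation = _python_version_transition_impl,
    inputs = [_PYTHON_VERSION_FLAG],
    outputs = [_PYTHON_VERSION_FLAG],
)

def _lock_impl(ctx):
    # No attr.output is used, so the file name can include the version.
    version = ctx.attr.python_version
    suffix = ("_" + version.replace(".", "_")) if version else ""
    out = ctx.actions.declare_file(ctx.label.name + suffix + ".txt")
    ctx.actions.write(out, "# placeholder for the real locking action\n")
    return [DefaultInfo(files = depset([out]))]

lock = rule(
    implementation = _lock_impl,
    cfg = _python_version_transition,
    attrs = {
        "python_version": attr.string(),
        "srcs": attr.label_list(allow_files = True),
        # Required by older Bazel releases for Starlark transitions.
        "_allowlist_function_transition": attr.label(
            default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
        ),
    },
)
```

Because the transition is applied to the rule itself, select() expressions in srcs are evaluated under the transitioned configuration, which is what makes the select() trick suggested in the review work.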

@aignas
Collaborator Author

aignas commented Mar 20, 2025

RBE is failing with an error:

bazel-out/k8-opt-exec-ST-150d2d5f4ddd/bin/python/private/python3: error while loading shared libraries: /b/f/w/bazel-out/k8-opt-exec-ST-150d2d5f4ddd/bin/python/private/../lib/libpython3.11.so.1.0: cannot open shared object file: No such file or directory

Not sure exactly why this is happening because I am passing interpreter.files_to_run to the action.

I think the documentation that says to use .files_to_run may be wrong.

EDIT: this guess was wrong. I am fiddling with the cfg for the uv_toolchain now. I think it should be exec instead of target.

EDIT2: the uv_toolchain has nothing to do with that, because the error is coming from python not finding the .so file, which to me suggests that the files are missing, but that should not be the case.
