@dzbarsky
Contributor

@dzbarsky dzbarsky commented Oct 17, 2025

While here, we tweak the naming to avoid potential name collisions between URLs with the same suffix.

@github-actions github-actions bot added the awaiting-review PR is awaiting review from an assigned reviewer label Oct 17, 2025
@dzbarsky dzbarsky force-pushed the zbarsky/remote-patches branch 2 times, most recently from 808ca9c to d9d4670 Compare October 17, 2025 19:20
name = patch_url.split("/")[-1]
patch_path = ctx.path(_REMOTE_PATCH_DIR).get_child(name)
download_info = ctx.download(
patch_path = ctx.path(_REMOTE_PATCH_DIR).get_child(str(hash(patch_url)))
Collaborator

The hash function is of low quality, so how about we concatenate name and hash?
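
One way to picture the suggestion above: keep the URL's basename for readability and add a hash of the full URL for uniqueness. The sketch below is in Python rather than Starlark so it can be run standalone. `java_string_hash` and `patch_file_name` are hypothetical helpers, and the hash is a stand-in modeled on Java's `String.hashCode()`, which Bazel documents as the algorithm behind Starlark's `hash()` for strings. The hex formatting and the `<hash>_<basename>` order are assumptions for illustration, not necessarily what the PR ended up using.

```python
def java_string_hash(s: str) -> int:
    """Signed 32-bit hash in the style of Java's String.hashCode().

    Matches Java for ASCII input; characters outside the Basic Multilingual
    Plane would need UTF-16 handling that this sketch skips.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # wrap like a 32-bit int
    return h - (1 << 32) if h >= (1 << 31) else h


def patch_file_name(patch_url: str) -> str:
    """Hypothetical naming scheme: '<hash of full URL>_<basename>'."""
    basename = patch_url.split("/")[-1]
    # Format the hash as unsigned hex so the name never starts with '-'.
    digest = format(java_string_hash(patch_url) & 0xFFFFFFFF, "08x")
    return digest + "_" + basename


if __name__ == "__main__":
    for url in ("https://example.com/foo/file.patch",
                "https://example.com/bar/file.patch"):
        print(url, "->", patch_file_name(url))
```

With this scheme the file name stays short and the basename remains visible when inspecting the download directory.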

Contributor Author

fair enough, I also considered patch_url.replace("/", "_") but was worried about Windows path lengths...

Member

Why couldn't we keep using just the name? It should be unique since remote_patches is a dict?

Collaborator

It's just the basename of the URL

Member

Yeah, those basenames should be unique per repo, right? If we just keep them, it'll be a bit easier to debug when the patch doesn't apply. I don't feel strongly against appending a hash to the downloaded file name, but I don't think it's really necessary.

Contributor Author

Previously, if you had patches example.com/foo/file.patch and example.com/bar/file.patch, they would result in the same file.patch filename, but since they were downloaded and applied sequentially, it was fine. Now both downloads would race to write the same file, resulting in non-deterministic patching (or maybe even a corrupted file). It's low probability, but it seems like an annoying enough experience, and one we can trivially avoid.
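
The collision described above is easy to reproduce in isolation. This Python sketch (not Starlark, and not part of the PR) groups URLs by the basename the old code derived via `patch_url.split("/")[-1]`. `find_basename_collisions` is a hypothetical helper, and the URLs are the ones from the comment with an assumed `https://` scheme.

```python
from collections import defaultdict


def find_basename_collisions(patch_urls):
    """Return {basename: [urls]} for basenames shared by more than one URL."""
    by_name = defaultdict(list)
    for url in patch_urls:
        # Same derivation as the old code: last path segment of the URL.
        by_name[url.split("/")[-1]].append(url)
    return {name: urls for name, urls in by_name.items() if len(urls) > 1}


if __name__ == "__main__":
    urls = [
        "https://example.com/foo/file.patch",
        "https://example.com/bar/file.patch",
        "https://example.com/baz/other.patch",
    ]
    print(find_basename_collisions(urls))
```

Each entry in the result is a group of downloads that would target the same path once they run concurrently.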

Member

Ah, I see. I was confusing this with how we specify patches in the BCR; this is a lower-level feature in http_archive, which might have this problem. Sorry for the noise.

@fmeum fmeum requested a review from meteorcloudy October 17, 2025 19:30
@dzbarsky dzbarsky force-pushed the zbarsky/remote-patches branch from d9d4670 to 6e0bbc2 Compare October 20, 2025 13:10
@meteorcloudy
Member

Oh, btw, can you please add a test for the name conflicting case?

@iancha1992 iancha1992 added the team-Remote-Exec Issues and PRs for the Execution (Remote) team label Oct 21, 2025
@meisterT meisterT added awaiting-user-response Awaiting a response from the author team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. and removed team-Remote-Exec Issues and PRs for the Execution (Remote) team labels Oct 27, 2025
@meteorcloudy meteorcloudy removed the awaiting-review PR is awaiting review from an assigned reviewer label Oct 28, 2025

Labels

awaiting-user-response Awaiting a response from the author team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file.

5 participants