Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle trailing slash in ref list prefix filtering #1936

Closed

Conversation

moroten
Copy link
Contributor

@moroten moroten commented Apr 6, 2025

Previously, refs/heads/foo/bar would be listed when running repo.references()?.prefixed("refs/heads/b"). The code identified that the last component was not a directory and started to match it as a filename prefix for all files in all recursive directories, effectively matching refs/heads/**/b*.

This commit fixes that bug but also allows to use a trailing / in the prefix, allowing to filter for refs/heads/foo/ and not get refs/heads/foo-bar as a result.

Fixes #1934.

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for tackling this!

I think what would be needed here is a test that shows the problem this is solving, i.e. the iteration failing to properly deduplicate results from packed-refs and loose refs when a prefix is used.

Such an example should include deep nesting as well.

Prefixed search is necessary to help to quickly bypass a possibly large list of results, and it's essential to do quick listing (without reftable) during fetches in repositories where a lot of references exist.

Edit: I also left a comment here which may be relevant for this PR as well.

Copy link
Member

@EliahKagan EliahKagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current test failure is due to the need to commit a regenerated gix-ref/tests/fixtures/generated-archives/make_ref_repository.tar. This is needed because the script that generates it, gix-ref/tests/fixtures/make_ref_repository.sh, is modified here. To verify that this was the only cause of failure so far, I've regenerated that archive in ek/run-ci/regenerate-archives-for-pr-1936 (b6ab162).

If you like, you can fast-forward this PR to include that commit, though it might not be useful to do so if the further changes for #1936 (review) will include modifying make_ref_repository.sh again.

git fetch --all
git cherry-pick upstream/ek/run-ci/regenerate-archives-for-pr-1936

Please disregard this if it is not convenient or if make_ref_repository.sh shall be changed again.

@moroten moroten force-pushed the fix-recursive-list-refs-prefix branch from 461b374 to 23cdf31 Compare April 7, 2025 19:24
@moroten moroten changed the title List direct files only for ref name prefix Handle trailing slash in ref list prefix filtering Apr 7, 2025
@moroten
Copy link
Contributor Author

moroten commented Apr 7, 2025

Thank you @EliahKagan. It turned out that I missed the binary files when using git add -p.

@moroten
Copy link
Contributor Author

moroten commented Apr 7, 2025

I've rewritten the PR to use a BStr as prefix instead of Path. This allows for trailing slash.

Is there any use case where Windows backslash need to be translated to forward slash that I may have missed?

Previously, `refs/heads/foo/bar` would be listed when running
`repo.references()?.prefixed("refs/heads/b")`. The code identified that
the last component was not a directory and started to match it as a
filename prefix for all files in all recursive directories, effectively
matching `refs/heads/**/b*`.

This commit fixes that bug but also allows to use a trailing `/` in the
prefix, allowing to filter for `refs/heads/foo/` and not get
`refs/heads/foo-bar` as a result.

Fixes GitoxideLabs#1934.
@moroten moroten force-pushed the fix-recursive-list-refs-prefix branch from 23cdf31 to 3ca6811 Compare April 7, 2025 20:14
Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for taking this PR further, it looks great.

I'd have to take it into another round to improve a couple of things that are mentioned below. Most importantly, a test is needed to show that \\ isn't special anymore. Unfortunately, there isn't a breaking that that would have tested for it being treated specifically either, so it might be that the claim was never true.

From there I can promise no more rounds, I'd take it from there with an in-IDE review, fixing whatever small things that come up myself.

If you feel like it's enough, please let me know and I will happily pick it up as soon as I can to get it merged.

pub fn prefixed(&self, prefix: &Path) -> std::io::Result<LooseThenPacked<'_, '_>> {
/// As [`iter(…)`][file::Store::iter()], but filters by `prefix`, i.e. "refs/heads/" or
/// "refs/heads/feature-".
pub fn prefixed(&self, prefix: Cow<'_, BStr>) -> std::io::Result<LooseThenPacked<'_, '_>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fair to have this breaking change, as this prefix is essentially a RepositoryPath, i.e. a slash-separated relative path.
Something I think that should come with it is tests that show that behaviour.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are now more explicitly named to test that.

pub fn prefixed(&self, prefix: &Path) -> std::io::Result<LooseThenPacked<'_, '_>> {
/// As [`iter(…)`][file::Store::iter()], but filters by `prefix`, i.e. "refs/heads/" or
/// "refs/heads/feature-".
pub fn prefixed(&self, prefix: Cow<'_, BStr>) -> std::io::Result<LooseThenPacked<'_, '_>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be &BStr if i twas &Path before.

@@ -93,7 +87,7 @@ impl file::Store {
/// Return an iterator over all loose references that start with the given `prefix`.
///
/// Otherwise it's similar to [`loose_iter()`][file::Store::loose_iter()].
pub fn loose_iter_prefixed(&self, prefix: &Path) -> std::io::Result<LooseThenPacked<'_, '_>> {
pub fn loose_iter_prefixed(&self, prefix: Cow<'_, BStr>) -> std::io::Result<LooseThenPacked<'_, '_>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be &BStr if it was &Path before.
It could also be impl Into<…> if that makes it more convenient to call, even for the test-suite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put impl Into<Cow<...>> in most places. Not sure if some should just be &BStr or not.


cp .git/refs/heads/main .git/refs/remotes/origin/

echo "ref: refs/remotes/origin/main" > .git/refs/remotes/origin/HEAD
echo "ref: refs/heads/main" > .git/refs/heads-packed
echo "ref: refs/heads/main" > .git/refs/heads/sub/dir/packed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these names should be changed so packed isn't included. This is because symbolic refs are never packed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to make sure that both the loose and packed refs are handled correctly and not accidentally testing the wrong thing. Changed the names to refs/feature-suffix and refs/feature/sub/dir/algo to make the test more explicit.

@moroten moroten requested a review from Byron April 9, 2025 09:48
@Byron Byron self-assigned this Apr 13, 2025
@Byron
Copy link
Member

Byron commented Apr 13, 2025

Thanks again for tackling this!

I am closing this PR as I wasn't able to push my changes into it. Instead, I opened #1954.

@Byron Byron closed this Apr 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

refs.references().prefixed() lets too many refs through
3 participants