Don't detect copies/renames in restore-mtime #72
The big benefit of this is in a [blobless clone], where it lets you avoid downloading all the blobs (which you shouldn't need just to know _whether_ a file changed). But it's probably a perf win always: it makes `git log` do a bit less work. (And it should always be at least as correct: for the purposes of mtime restoration, a move/copy should count as a modification -- your build tool should assume the file must be rebuilt in its new location.)

[blobless clone]: https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/

I tested this by:

- make a clone of a large repo (`git clone --filter=blob:none ...`)
- run `git restore-mtime` in the repo; after this change it takes ~10s

Before this change, in a blobless repo, it would loop for a very long time with periodic log lines indicating it was fetching more blobs:

```
remote: Enumerating objects: 2, done.
remote: Counting objects: 100% (1/1), done.
Receiving objects: 100% (2/2), 6.75 KiB | 2.25 MiB/s, done.
remote: Total 2 (delta 0), reused 0 (delta 0), pack-reused 1
```
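For readers who want to see the idea in isolation, here is a minimal sketch. It is not the code changed in this PR. It assumes the tool gets its timestamps by walking `git log` output, and the helper names (`build_log_command`, `parse_log`, `last_change_times`) are hypothetical. The only git options used are `--no-renames`, `--name-only` and `--format`, all documented `git log` options. With rename detection off, a moved or copied file simply appears as a changed path in the commit that moved it, so git has no reason to compare blob contents.

```python
import subprocess


def build_log_command():
    """Hypothetical helper: a `git log` call with rename/copy detection off."""
    return [
        "git", "log",
        "--no-renames",        # do not pair deletions with additions
        "--name-only",         # list the paths touched by each commit
        "--format=commit %ct", # committer timestamp (epoch seconds) per commit
    ]


def parse_log(lines):
    """Map each path to the timestamp of the newest commit that touched it.

    Assumes the output format produced by build_log_command(): a
    'commit <epoch>' header followed by one path per line, newest first.
    Simplification: a path literally starting with 'commit ' would be
    misread, and quoted paths (special characters) are not unquoted.
    """
    mtimes = {}
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("commit "):
            current = int(line.split()[1])
        elif current is not None:
            # First sighting wins because git log lists newest commits first.
            mtimes.setdefault(line, current)
    return mtimes


def last_change_times(repo="."):
    """Run git in `repo` and return {path: epoch_seconds}."""
    result = subprocess.run(
        build_log_command(), cwd=repo,
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout.splitlines())
```

A caller could then apply each timestamp with `os.utime(path, (ts, ts))`. Note that plain `git log` shows no file list for merge commits by default, so this sketch shares that limitation.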
That's a very interesting rationale, one which I've never considered. Actually, I was planning a major change for the next version in the opposite direction: to make So I'm quite on the fence now... should Does any interested party have an opinion on this? How should |
I don't see why making |
FWIW I agree it's less accurate in the sense that it's inconsistent with the behavior of the With that said, I do think the behavior differences are likely very small for most repos; my experience is moves/copies with no changes are a very small fraction of file-changes although I'm sure it depends on the language in question. So the perf win may be a better default even aside from callers who prefer one way or the other. (Although it's possible setting the rename modification threshold to zero suffices for the perf win with the same behavior? I haven't looked into whether that allows git to just compare blob shas without looking at the blob itself.) [1] As far as I can see POSIX doesn't say this explicitly, but it seems to be a de facto standard. |
This The great point Ben made is that perhaps this "real" timestamp, while more "correct", might be not as useful for the intended use and audience of |
This seems to be a very close parallel to the issue I just noted in #77. In the case of merges there may not be a rename or copy. There may or may not also be local changes to the file (e.g. merge conflict resolution) in the merge commit, but it's getting entirely passed over. The result is a less accurate time stamp: it is not the timestamp of the last point the file is known to have changed in history. Whether there is a use case for the other dates I don't know because my use case is decidedly weighted towards coping with build systems. For anybody else with a need for dates targeted at builds, my competitor tool (that I wrote before I heard about this project) |