-
-
Notifications
You must be signed in to change notification settings - Fork 327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gix-date: switch from time
to jiff
#1474
Conversation
This should make the swap from `time` to `jiff` easier. This comment[1] indicates that it's okay for `time` to be a public dependency, but since this patch series is about swapping `time` for `jiff`, it seemed appropriate to take this step first. And in particular, it was *almost* already the case that `time` was a private dependency of `gix-date`. The only thing we really had to button up was the exposure of `time`'s custom formatting description language. Jiff doesn't support `time`'s custom formatting machinery and instead uses a strftime/strptime like API. We could expose that instead, but since nothing (other than a test) was actually utilizing `time`'s custom formatting machinery external to `gix-date`, I figured we might as well completely encapsulate it. [1]: GitoxideLabs#471 (comment)
The bump to Rust 1.70 (released over 1 year ago) is needed since that's Jiff current MSRV. This doesn't bump the MSRV of the broader gitoxide project, however, so this is probably wrong or incomplete.
This swaps out `time` for `jiff`. It doesn't completely remove `time` from the dependency tree. The last remaining use of `time` is in `prodash`, outside of the gitoxide project.
Thanks so much 🙏!
When I saw the email this morning I just thought: Wow, as if he knew I wanted that :D! Thus I am very glad the person best suitable for the job performed the transition, and that it tought
Your patch will be warmly welcomed, I have had plans to migrate off
❤️
Actually, it's not a caveat anymore as I effectively follow Let me make the remaining changes to this PR so I can merge it. |
One more note on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this is still fresh in memory, having read the in-code comments I wonder if it's worth improving the error handling in a follow-up PR.
Some of this error handling was motivated by a fuzzer which triggered panics in time
, and if the test-suite passes even without the special handling, it should all be fine.
And as gix-date
is Google-OSS-fuzzed, I will be curious if it will find new ways of crashing parsing.
// This was an error case in a previous version of this code, where | ||
// it would fail when converting from a negative signed integer | ||
// to an unsigned integer. This preserves that failure case even | ||
// though the code below handles it okay. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember that some tests were indeed quite time
dependent, while some code also tried to workaround panics in time
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I noticed that because I was intensely curious as to why there was a catch_unwind
here deep inside datetime parsing. And yeah indeed, I believe time
's panic comes from its From<SystemTime>
impl, which in turn calls OffsetDateTime
's Add
impl that specifically panics. The Add
impl panicking makes sense, since that's consistent with what std
does too. But the From
impl should probably be a TryFrom
impl. Since it's a From
impl, time
's behavior here is pretty much locked into its API. With that said, I think you probably could have avoided the catch_unwind
here by first converting now
to a time::OffsetDateTime
and then doing your arithmetic, since time
exposes checked operations on time::OffsetDateTime
. With Jiff, you can't mess this up because it doesn't provide a From<SystemTime>
impl. Only TryFrom<SystemTime>
.
// time zone here. If we did, then it would implement "1 day | ||
// ago" correctly, even when it crosses DST transitions. Since | ||
// we're in the UTC time zone here, which has no DST, 1 day is | ||
// in practice always 24 hours. ---AG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love to see this becoming more correct, and likely even more correct than Git itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I definitely don't have the context to be fully confident in such a change here. It depends on your test coverage I think. But the idea here is that you'd do ts.to_zoned(TimeZone::system())
, and then zdt.checked_sub(span)
would become DST aware according to the end user's configured time zone.
"minute" => Span::new().try_minutes(units), | ||
"hour" => Span::new().try_hours(units), | ||
"day" => Span::new().try_days(units), | ||
"week" => Span::new().try_weeks(units), | ||
// TODO months & years? YES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something that would easily be supported now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. It would just be Span::new().try_months
and then Span::new().try_years
. Then everything else should just work.
[hour]:[minute]:[second] \ | ||
[offset_hour sign:mandatory][offset_minute]" | ||
)); | ||
pub const GIT_RFC2822: CustomFormat = CustomFormat("%a, %-d %b %Y %H:%M:%S %z"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This crate finally comes full circle :): initially time
would have a format string like this, then it moved to format_description!
with more verbose declarations and optimized efficiency. Now we go back to simple format strings which are probably just the right choice here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah! Although Jiff's strtime
APIs are quite fast. At least in Jiff's benchmarks, they're faster than time
's format_description!
parsers.
But yeah I don't really like the strtime
APIs myself, but they are handy and it seems like almost everyone is familiar with them.
.unwrap_or(0); | ||
let zdt = jiff::Zoned::now(); | ||
let seconds = zdt.timestamp().as_second(); | ||
let offset = zdt.offset().seconds(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, this cannot fail!
But… how can that be as so much can go wrong even when using the system TZ database?
The answer must be that internally it ignores failures and falls back to UTC automatically (by not having a TZ database if it could not be read). If that's the case, I wonder if there is a way to explicitly initialize the TZ database - gix
(CLI) might be interested in that in case it performs operations for which displaying or using (i.e. when creating commits) the correct local time is paramount.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct that Jiff will automatically fall back to UTC. This is actually what time
does as well, courtesy of libc
. Try, for example, sudo rm /etc/localtime
and then date
on your system. Things don't blow up. They just fall back to UTC. Hah.
In terms of fallible APIs, I think it makes sense for TimeZone::try_system
to exist: BurntSushi/jiff#65
For the tzdb itself, I'd have to think about that.
Note though that even today, Jiff will emit copious log messages when it fails to find a tzdb or the system configured time zone. So if "end user should enable debug logs for gix" is a workflow that exists, then you should already get pretty detailed information when Jiff fails.
let expected = Zoned::try_from(now) | ||
.unwrap() | ||
// account for the loss of precision when creating `Time` with seconds | ||
.round( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this!
#[allow(unsafe_code)] | ||
unsafe { | ||
// SAFETY: we don't manipulate the environment from any thread | ||
time::util::local_offset::set_soundness(time::util::local_offset::Soundness::Unsound); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hiya! Sorry about the late reply. I finally had a chance to pop my stack enough that I could swing back around here. Awesome to see the PR already merged! w00t!
One more note on jiff: What I like about time is its flexible parsing, and as git essentially comes with its own equally flexible (and probably buggy) date parser, that was always something I hoped to get from the time-library I use. Thus far, I didn't try to achieve feature-parity with this, but when that's happening I'd ping you to see if or what of that should be contributed to jiff itself.
@sharkdp also mentioned wanting something like this. Basically a, "just let me throw a datetime at your library and have it try a whole bunch of different formats to parse it." Although maybe that's different than what you're asking for, since you want to specifically target what git
supports. IDK if it should be a goal for Jiff to "support what git
does" specifically though. I think the main problem with this sort of thing is that it could be very difficult to offer it without it having footguns. On the other hand, it's probably better for Jiff to do it if the alternative is ~everyone else just rolling their own ad hoc version of it. Anyway, please do reach out when you work on it.
// This was an error case in a previous version of this code, where | ||
// it would fail when converting from a negative signed integer | ||
// to an unsigned integer. This preserves that failure case even | ||
// though the code below handles it okay. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I noticed that because I was intensely curious as to why there was a catch_unwind
here deep inside datetime parsing. And yeah indeed, I believe time
's panic comes from its From<SystemTime>
impl, which in turn calls OffsetDateTime
's Add
impl that specifically panics. The Add
impl panicking makes sense, since that's consistent with what std
does too. But the From
impl should probably be a TryFrom
impl. Since it's a From
impl, time
's behavior here is pretty much locked into its API. With that said, I think you probably could have avoided the catch_unwind
here by first converting now
to a time::OffsetDateTime
and then doing your arithmetic, since time
exposes checked operations on time::OffsetDateTime
. With Jiff, you can't mess this up because it doesn't provide a From<SystemTime>
impl. Only TryFrom<SystemTime>
.
// time zone here. If we did, then it would implement "1 day | ||
// ago" correctly, even when it crosses DST transitions. Since | ||
// we're in the UTC time zone here, which has no DST, 1 day is | ||
// in practice always 24 hours. ---AG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I definitely don't have the context to be fully confident in such a change here. It depends on your test coverage I think. But the idea here is that you'd do ts.to_zoned(TimeZone::system())
, and then zdt.checked_sub(span)
would become DST aware according to the end user's configured time zone.
"minute" => Span::new().try_minutes(units), | ||
"hour" => Span::new().try_hours(units), | ||
"day" => Span::new().try_days(units), | ||
"week" => Span::new().try_weeks(units), | ||
// TODO months & years? YES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. It would just be Span::new().try_months
and then Span::new().try_years
. Then everything else should just work.
[hour]:[minute]:[second] \ | ||
[offset_hour sign:mandatory][offset_minute]" | ||
)); | ||
pub const GIT_RFC2822: CustomFormat = CustomFormat("%a, %-d %b %Y %H:%M:%S %z"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah! Although Jiff's strtime
APIs are quite fast. At least in Jiff's benchmarks, they're faster than time
's format_description!
parsers.
But yeah I don't really like the strtime
APIs myself, but they are handy and it seems like almost everyone is familiar with them.
.unwrap_or(0); | ||
let zdt = jiff::Zoned::now(); | ||
let seconds = zdt.timestamp().as_second(); | ||
let offset = zdt.offset().seconds(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct that Jiff will automatically fall back to UTC. This is actually what time
does as well, courtesy of libc
. Try, for example, sudo rm /etc/localtime
and then date
on your system. Things don't blow up. They just fall back to UTC. Hah.
In terms of fallible APIs, I think it makes sense for TimeZone::try_system
to exist: BurntSushi/jiff#65
For the tzdb itself, I'd have to think about that.
Note though that even today, Jiff will emit copious log messages when it fails to find a tzdb or the system configured time zone. So if "end user should enable debug logs for gix" is a workflow that exists, then you should already get pretty detailed information when Jiff fails.
Thanks for your responses, much appreciated!
My stance here is that it's good to be more correct than Git if that's an option, especially if that would be less surprising. Test-coverage around this probably isn't great, but if you know what correctness would mean, you'd probably put down a test to validate it. Having that would also make it possible to see if Git is also correct, just for fun - I wouldn't be completely surprised if Git would have managed to roll its own time-handling and handle all the edge-cases.
I should have known just by looking at the author name :D!
Ah, that's great! If And finally, here is a quote of mine:
The truth is that two bugs weren't reproducible anymore and closed automatically since the introduction of |
Wow. |
This extends `gix_date::parse::relative::span` to recognize times with "months" and "years", as in "3 months ago" and "2 years ago". The actual change here is very simple, because it is directly facilitated by GitoxideLabs#1474. The units "seconds", "minutes", "hours", "days", and "weeks" continue to be recognized (as before). Relative times like "1 year, 2 months ago" remain unrecognized. For background, see 43b6c06 (GitoxideLabs#498), c5c6bf6, GitoxideLabs#1474 (comment), and GitoxideLabs#1696 (comment). Note that this change does not fix issue GitoxideLabs#1696, which continues to apply to the intepretation of "days" and "weeks", and now also applies in the same way to the interpretation of "months" and "years". (It continues not to apply to the interpretation of "seconds", "minutes", and "hours".) The tests are updated to cover "months" and "years", as well as to exercise a wider range of relative times, including showing which units (e.g. "days") are affected by GitoxideLabs#1696. A few sub-cases, of those adjusted or added here, test strings from a related test in Git, to allow comparison. But most are not related to cases there. As before, the tests pass on CI, or when the time zone is otherwise UTC, or with local time zones that never have DST adjustments. The sub-cases are reordered to be monotone increasing in the magnitude of the relative intervals tested (and thus monotone decreasing in the associated absolute timestamps), and the original input is kept zipped into the `actual` and `expected` arrays being compared. This way, if the assertion fails (such as due to GitoxideLabs#1696), it is clear and readable which sub-cases failed and exactly how, as well as what all the sub-cases are and what each sub-case tests. Reordering the sub-cases and preserving the inputs they parse in this way avoids the disadvantages of GitoxideLabs#1697 while keeping, and expanding on, its benefits. Failure, where it occurs, now looks like: --- STDERR: gix-date::date time::parse::relative::various --- thread 'time::parse::relative::various' panicked at gix-date\tests\time\parse.rs:252:9: assertion failed: `(left == right)`: relative times differ Diff < left / right > : [ ( "5 seconds ago", 2024-11-24T23:51:49Z, ), ( "5 minutes ago", 2024-11-24T23:46:54Z, ), ( "5 hours ago", 2024-11-24T18:51:54Z, ), ( "5 days ago", 2024-11-19T23:51:54Z, ), ( "3 weeks ago", 2024-11-03T23:51:54Z, ), ( "21 days ago", 2024-11-03T23:51:54Z, ), ( "504 hours ago", 2024-11-03T23:51:54Z, ), ( "30240 minutes ago", 2024-11-03T23:51:54Z, ), ( "2 months ago", < 2024-09-24T23:51:54Z, > 2024-09-24T22:51:54Z, ), ( "1460 hours ago", 2024-09-25T03:51:54Z, ), ( "87600 minutes ago", 2024-09-25T03:51:54Z, ), ( "14 weeks ago", < 2024-08-18T23:51:54Z, > 2024-08-18T22:51:54Z, ), ( "98 days ago", < 2024-08-18T23:51:54Z, > 2024-08-18T22:51:54Z, ), ( "2352 hours ago", 2024-08-18T23:51:54Z, ), ( "141120 minutes ago", 2024-08-18T23:51:54Z, ), ( "5 months ago", < 2024-06-24T23:51:54Z, > 2024-06-24T22:51:54Z, ), ( "3650 hours ago", 2024-06-25T21:51:54Z, ), ( "219000 minutes ago", 2024-06-25T21:51:54Z, ), ( "26 weeks ago", < 2024-05-26T23:51:54Z, > 2024-05-26T22:51:54Z, ), ( "182 days ago", < 2024-05-26T23:51:54Z, > 2024-05-26T22:51:54Z, ), ( "4368 hours ago", 2024-05-26T23:51:54Z, ), ( "262080 minutes ago", 2024-05-26T23:51:54Z, ), ( "8 months ago", < 2024-03-24T23:51:54Z, > 2024-03-24T22:51:54Z, ), ( "5840 hours ago", 2024-03-26T15:51:54Z, ), ( "350400 minutes ago", 2024-03-26T15:51:54Z, ), ( "38 weeks ago", 2024-03-03T23:51:54Z, ), ( "266 days ago", 2024-03-03T23:51:54Z, ), ( "6384 hours ago", 2024-03-03T23:51:54Z, ), ( "383040 minutes ago", 2024-03-03T23:51:54Z, ), ( "11 months ago", 2023-12-24T23:51:54Z, ), ( "8030 hours ago", 2023-12-26T09:51:54Z, ), ( "481800 minutes ago", 2023-12-26T09:51:54Z, ), ( "14 months ago", < 2023-09-24T23:51:54Z, > 2023-09-24T22:51:54Z, ), ( "21 months ago", 2023-02-24T23:51:54Z, ), ( "2 years ago", 2022-11-24T23:51:54Z, ), ( "20 years ago", 2004-11-24T23:51:54Z, ), ( "630720000 seconds ago", 2004-11-29T23:51:54Z, ), ]
This extends `gix_date::parse::relative::span` to recognize times with "months" and "years", as in "3 months ago" and "2 years ago". Those units are supported by Git. The actual change here to the parsing code is very simple, because it is directly facilitated by GitoxideLabs#1474. The units "seconds", "minutes", "hours", "days", and "weeks" continue to be recognized (as before). Relative times like "1 year, 2 months ago" remain unrecognized. For background, see 43b6c06 (GitoxideLabs#498), c5c6bf6, GitoxideLabs#1474 (comment), and GitoxideLabs#1696 (comment). Note that this specific change does not itself fix issue GitoxideLabs#1696, which applies to the intepretation of "days" and "weeks", and now also applies in the same way to the interpretation of "months" and "years". (It continues not to apply to the interpretation of "seconds", "minutes", and "hours".) The tests are updated to cover "months" and "years", as well as to exercise a wider range of relative times, including showing which units (e.g. "days") are affected by GitoxideLabs#1696. A few sub-cases, of those adjusted or added here, test strings from a related test in Git, to allow comparison. But most are not related to cases there. As before, the tests pass on CI, or when the time zone is otherwise UTC, or with local time zones that never have DST adjustments. The sub-cases are reordered to be monotone increasing in the magnitude of the relative intervals tested (and thus monotone decreasing in the associated absolute timestamps), and the original input is kept zipped into the `actual` and `expected` arrays being compared. This way, if the assertion fails (such as due to GitoxideLabs#1696), it is clear and readable which sub-cases failed and exactly how, as well as what all the sub-cases are and what each sub-case tests. Reordering the sub-cases and preserving the inputs they parse in this way avoids the disadvantages of GitoxideLabs#1697 while keeping, and expanding on, its benefits. Failure, where it occurs, now looks like: --- STDERR: gix-date::date time::parse::relative::various --- thread 'time::parse::relative::various' panicked at gix-date\tests\time\parse.rs:252:9: assertion failed: `(left == right)`: relative times differ Diff < left / right > : [ ( "5 seconds ago", 2024-11-24T23:51:49Z, ), ( "5 minutes ago", 2024-11-24T23:46:54Z, ), ( "5 hours ago", 2024-11-24T18:51:54Z, ), ( "5 days ago", 2024-11-19T23:51:54Z, ), ( "3 weeks ago", 2024-11-03T23:51:54Z, ), ( "21 days ago", 2024-11-03T23:51:54Z, ), ( "504 hours ago", 2024-11-03T23:51:54Z, ), ( "30240 minutes ago", 2024-11-03T23:51:54Z, ), ( "2 months ago", < 2024-09-24T23:51:54Z, > 2024-09-24T22:51:54Z, ), ( "1460 hours ago", 2024-09-25T03:51:54Z, ), ( "87600 minutes ago", 2024-09-25T03:51:54Z, ), ( "14 weeks ago", < 2024-08-18T23:51:54Z, > 2024-08-18T22:51:54Z, ), ( "98 days ago", < 2024-08-18T23:51:54Z, > 2024-08-18T22:51:54Z, ), ( "2352 hours ago", 2024-08-18T23:51:54Z, ), ( "141120 minutes ago", 2024-08-18T23:51:54Z, ), ( "5 months ago", < 2024-06-24T23:51:54Z, > 2024-06-24T22:51:54Z, ), ( "3650 hours ago", 2024-06-25T21:51:54Z, ), ( "219000 minutes ago", 2024-06-25T21:51:54Z, ), ( "26 weeks ago", < 2024-05-26T23:51:54Z, > 2024-05-26T22:51:54Z, ), ( "182 days ago", < 2024-05-26T23:51:54Z, > 2024-05-26T22:51:54Z, ), ( "4368 hours ago", 2024-05-26T23:51:54Z, ), ( "262080 minutes ago", 2024-05-26T23:51:54Z, ), ( "8 months ago", < 2024-03-24T23:51:54Z, > 2024-03-24T22:51:54Z, ), ( "5840 hours ago", 2024-03-26T15:51:54Z, ), ( "350400 minutes ago", 2024-03-26T15:51:54Z, ), ( "38 weeks ago", 2024-03-03T23:51:54Z, ), ( "266 days ago", 2024-03-03T23:51:54Z, ), ( "6384 hours ago", 2024-03-03T23:51:54Z, ), ( "383040 minutes ago", 2024-03-03T23:51:54Z, ), ( "11 months ago", 2023-12-24T23:51:54Z, ), ( "8030 hours ago", 2023-12-26T09:51:54Z, ), ( "481800 minutes ago", 2023-12-26T09:51:54Z, ), ( "14 months ago", < 2023-09-24T23:51:54Z, > 2023-09-24T22:51:54Z, ), ( "21 months ago", 2023-02-24T23:51:54Z, ), ( "2 years ago", 2022-11-24T23:51:54Z, ), ( "20 years ago", 2004-11-24T23:51:54Z, ), ( "630720000 seconds ago", 2004-11-29T23:51:54Z, ), ]
Hopefully this PR isn't too forward, but I noticed you mentioned switching to
jiff
, so I figured I'd take a crack at it. The transition went pretty smoothly, although I did need to add support for skipping weekday checking in RFC 2822 style datetimes. So this ended up being a good exercise for Jiff, even if you don't take the PR.This doesn't completely remove
time
from the dependency tree though, sinceprodash
depends on it. However, I do have a patch forprodash
that makes the swap, but wanted to wait for the OK before sending it.I tried to stick to the smallest diff I could to make this transition. While doing it, I noticed at least a couple things that I think could be improved with Jiff. I think I can simplify the parsing, and I think I can also add support for month/year relative datetime parsing, since Jiff supports those out of the box.
Plus, this PR gets rid of some
unsafe
usage. :-)Main caveat: merging this will require bumping the MSRV to at least Rust 1.70, which is Jiff's MSRV. (I do intend for Jiff's MSRV to be about as conservative as
regex
's MSRV, which is usually around N-9, or one year of Rust releases.)