@ysbaddaden
Collaborator

@ysbaddaden ysbaddaden commented Aug 1, 2025

Synchronization primitives, such as mutexes, condition variables, pools, or event channels, could take advantage of a general timeout mechanism.

Crystal has a mechanism in the event loop to suspend the execution of a fiber for a set amount of time (#sleep). It also has a couple of mechanisms to add timeouts: one associated with IO operations, and another tailored to Channel and select to support the timeout branch of select actions.
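For readers less familiar with these mechanisms, here is a minimal sketch of two of them as they exist in current Crystal: #sleep in a producer fiber, and the timeout branch of select on a Channel. The variable names are illustrative only, and IO timeouts (for example a socket's read timeout) are not shown.

```crystal
channel = Channel(Int32).new

# Producer fiber: #sleep suspends it for a fixed span, then it sends a value.
spawn do
  sleep 50.milliseconds
  channel.send 42
end

outcome = "pending"

# The timeout branch of select: whichever becomes ready first wins.
select
when value = channel.receive
  outcome = "received #{value}"
when timeout(1.second)
  outcome = "timed out"
end

puts outcome
```

Note that the timeout here only works because the wait goes through Channel and select; a mutex or condition variable has no equivalent entry point today.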

Adding timeouts to all the synchronization primitives in the stdlib, and possibly to custom ones in shards and applications, shouldn't be much harder than calling #sleep, nor should it require hacking into the private Fiber#timeout_event.

Preview: https://github.com/crystal-lang/rfcs/blob/general-timeouts/text/0014-cancelable-timers.md

@ysbaddaden ysbaddaden self-assigned this Aug 1, 2025
@ysbaddaden ysbaddaden changed the title RFC XXXX: General Timeouts RFC 0014: General Timeouts Aug 1, 2025
Member

@beta-ziliani beta-ziliani left a comment


Looks good, but I think a use case will help in visualizing it further. Do you mind adding it?

@ysbaddaden
Collaborator Author

@beta-ziliani Ah, I pushed a new draft while you were reviewing. I removed the implementation details to focus on the API, and added a technical example to the reference section for the mutex from the guide section.

@ysbaddaden
Collaborator Author

And with a last polish pass to finish the DRAFT.

@ysbaddaden ysbaddaden changed the title RFC 0014: General Timeouts RFC 0014: Cancelable timers Aug 5, 2025
@ysbaddaden
Collaborator Author

@straight-shoota I renamed the RFC to "cancelable timers", clarified the summary/motivation sections, and reorganized the sections. Is it better?

I made the rationale a distinct section, and put it between the guide and reference sections, so we flow from motivation -> explanation with example -> how to implement the timeout -> proposed API -> code example for the explanation (based on how + api).

@ysbaddaden
Collaborator Author

ysbaddaden commented Aug 5, 2025

I included your suggestions 👍

Co-authored-by: Sijawusz Pur Rahnama
@crysbot

crysbot commented Aug 8, 2025

This pull request has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/ambiguous-use-of-time-span-span/8324/1

Member

@beta-ziliani beta-ziliani left a comment


Left a few comments and questions.

Another question: Would this affect the channel's timeout? Do you expect them to share any implementation? Will this have any implications if we ever make select channel-agnostic?

@ysbaddaden
Collaborator Author

@beta-ziliani I indeed plan to try and use it to implement the select timeout action, ideally even stop allocating a SelectContextSharedState for every select since even without a timeout action we might still be able to use the token 🤔

@ysbaddaden
Collaborator Author

ysbaddaden commented Aug 25, 2025

Returning to this with a fresh 🧠 and following @beta-ziliani's remarks, I think we could have:

  • Fiber::CancelationToken instead of TimeoutToken so we can internally reuse the mechanism for select regardless of the presence of a timeout action;

  • Fiber.sleep(Time::Span, & : CancelationToken -> TimeoutResult) instead of .timeout;

  • ::sleep(Time::Span) becomes a shortcut for Fiber.sleep(Time::Span) { }

  • Crystal::EventLoop#sleep(Time::Span, CancelationToken) to replace the existing method instead of introducing a new #timeout method.

I'm leaving aside the absolute timers because they're relative to a specific clock and we shall solve that first.
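To make the shape of this proposal easier to picture, here is a runnable toy model built only on APIs available in current Crystal (Channel, select and Atomic). Everything in the `Sketch` module is hypothetical: `Sketch::Token`, `Sketch::Result`, `try_resolve` and `Sketch.cancelable_sleep` are stand-ins invented for illustration, not the RFC's Fiber::CancelationToken, Fiber::TimeoutResult or Fiber.sleep, and the block's return value is ignored here, unlike in the proposed signature. The only idea it tries to capture is that a single token decides who gets to resume the sleeping fiber.

```crystal
module Sketch
  # Hypothetical stand-in for the result of a cancelable sleep.
  enum Result
    Expired
    Canceled
  end

  # Hypothetical stand-in for a cancelation token: it can be resolved
  # exactly once, by either the timer or another fiber.
  class Token
    @state = Atomic(Int32).new(0)
    getter channel = Channel(Result).new(1)

    # Returns true only for the single caller that wins the right to
    # resume the sleeping fiber.
    def try_resolve(result : Result) : Bool
      _, won = @state.compare_and_set(0, 1)
      @channel.send(result) if won
      won
    end
  end

  # Suspends the current fiber for at most `span`. The block receives the
  # token so that it can be handed to whoever may cancel the sleep early.
  def self.cancelable_sleep(span : Time::Span, & : Token ->) : Result
    token = Token.new
    yield token
    outcome = Result::Expired
    select
    when result = token.channel.receive
      outcome = result
    when timeout(span)
      # If another fiber resolved the token first, report its result instead.
      outcome = token.channel.receive unless token.try_resolve(Result::Expired)
    end
    outcome
  end
end

canceled = Sketch.cancelable_sleep(1.second) do |token|
  spawn { token.try_resolve(Sketch::Result::Canceled) }
end

expired = Sketch.cancelable_sleep(10.milliseconds) { }

puts canceled
puts expired
```

In this model a plain sleep is simply the call with an empty block, which mirrors the third bullet above.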

@ysbaddaden
Collaborator Author

Both the RFC and the implementation PR have been updated to reflect the above proposal.

@crysbot

crysbot commented Aug 28, 2025

This pull request has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/rfc-14-cancelable-timers/8386/1

@dsisnero

Every time I think of timers for loops I go back to this blog post: https://vorpus.org/blog/timeouts-and-cancellation-for-humans/ I think we should abstract timeouts and other reasons to cancel into a cancellation token, since there are other reasons to abort async code besides timeouts.

@crysbot

crysbot commented Aug 28, 2025

This pull request has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/ambiguous-use-of-time-span-for-duration-and-monotonic-clock/8324/14

@RX14
Member

RX14 commented Sep 5, 2025

I love the idea of doing the user API for cancellation with a go-style Context or CancellationToken. That object would not have much to do with the implementation details covered in this RFC though: the "context" could be shared across a large bunch of fibers, whereas this CancellationToken is specific to one fiber sleep. With respect to this RFC I think we need to think about keeping the naming not confusing if we introduce a user-facing cancellation mechanism in the future. I can't see that a "context" object using the cancellation token to resume the fiber would break the cancellation token concept, but you might want to combine the Fiber::TimeoutResult with the status of any context-wide cancellation. Maybe there's something I'm missing though.

@RX14
Member

RX14 commented Sep 5, 2025

Perhaps a bit more succinctly: this RFC is about providing an abstraction which provides clarity for who exactly owns the capability to resume a fiber. This is applicable to all ways of stopping a fiber operation, regardless of the user-facing API. Maybe getting a consensus on using a user-facing cancellation token would enable a better internal API, but this should be quite easy to refactor later regardless.

@ysbaddaden
Collaborator Author

@RX14 Trying to use the mechanism in the sync shard, I realized that Fiber::TimeoutResult::CANCELED is confusing. For example, returning CANCELED for a timed wait makes no sense: the wait either expires or returns normally, and canceled would mean that the call itself was canceled.


7 participants