fix: preserve connection on read timeout in recv_header #101
Open
sbaldis wants to merge 1 commit into
recv_header() is cancel-safe (HeaderState reads one byte at a time and preserves partial progress). When the transport returns ErrorKind::TimedOut, the connection can safely resume on the next call. Previously, all errors from header reads went through handle_rx(), which unconditionally calls close_with() — terminating the connection state machine. This meant a simple read timeout (e.g., from SO_RCVTIMEO) would kill the connection, requiring a full reconnect cycle (TCP + CONNECT + SUBSCRIBE). This change intercepts TimedOut errors in recv_header() before they reach handle_rx(), returning the error without terminating the connection. The fix is scoped to recv_header only — recv_body is NOT cancel-safe and still goes through handle_rx() on all errors. Includes a test verifying that recv_header() returns the timeout error but the connection remains usable for subsequent reads.
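From the caller's side, the change could be pictured roughly as follows. This is only a sketch: `Step`, `next_step`, and `run_loop` are hypothetical names, `std::io::ErrorKind` stands in for whatever `ErrorKind` the crate actually uses, and a single `u8` stands in for a decoded header.

```rust
use std::io::ErrorKind;

/// What a polling loop does after one receive attempt (hypothetical type).
#[derive(Debug, PartialEq)]
enum Step {
    /// A header arrived; a single byte stands in for it here.
    Handle(u8),
    /// Read timeout: the connection is still usable, so just poll again.
    Idle,
    /// Any other error: the connection was closed and must be re-established.
    Reconnect,
}

/// Maps the outcome of one receive attempt to the caller's next action.
fn next_step(result: Result<u8, ErrorKind>) -> Step {
    match result {
        Ok(header) => Step::Handle(header),
        Err(ErrorKind::TimedOut) => Step::Idle,
        Err(_) => Step::Reconnect,
    }
}

/// Drives a hypothetical receive function until it yields `None` and
/// returns the counts of (handled headers, idle ticks, reconnects).
fn run_loop(mut recv: impl FnMut() -> Option<Result<u8, ErrorKind>>) -> (usize, usize, usize) {
    let (mut handled, mut idle, mut reconnects) = (0, 0, 0);
    while let Some(result) = recv() {
        match next_step(result) {
            Step::Handle(_) => handled += 1,
            Step::Idle => idle += 1,
            Step::Reconnect => reconnects += 1,
        }
    }
    (handled, idle, reconnects)
}
```

With the behavior described above, a timeout only costs an idle tick instead of a TCP + CONNECT + SUBSCRIBE cycle. One caveat for std-based transports: `TcpStream::set_read_timeout` typically surfaces an expired timeout as `WouldBlock` on Unix, so a transport would need to map that to `TimedOut` for this path to apply.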
Collaborator
Hey! Thanks for the PR. Regarding the issue, I have a few questions and I'm not sure whether this is actually a problem:
Regarding the implementation:
Problem
recv_header() delegates all errors to handle_rx(), which unconditionally calls close_with(), terminating the connection state machine. This includes Network(ErrorKind::TimedOut), even though a read timeout is a recoverable condition.
In std environments, transports commonly use SO_RCVTIMEO (socket read timeout) to implement bounded polling. When the timeout fires, the transport returns ErrorKind::TimedOut. The current behavior kills the connection, requiring a full reconnect cycle (TCP + CONNECT + SUBSCRIBE) even though no actual I/O failure occurred.
Fix
Intercept ErrorKind::TimedOut in recv_header() before it reaches handle_rx(). The error is still returned to the caller, but the connection remains in NetState::Ok and can resume receiving on the next call.
This is safe because recv_header() is cancel-safe: HeaderState reads one byte at a time and preserves partial progress across interruptions. A read timeout is semantically equivalent to the cancellation that HeaderState already handles.
The fix is scoped to recv_header() only. recv_body() is explicitly not cancel-safe, so its errors still go through handle_rx() on all error kinds.
Test
Adds recv_header_timeout_preserves_connection:
- ErrorKind::TimedOut on the first read
- recv_header() returns RawError::Network(TimedOut)
- recv_header() succeeds
All 104 existing tests continue to pass.
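The interception and the test scenario can be sketched in a self-contained form. This is not the PR's code: the names `recv_header`, `HeaderState`, `close_with`, `NetState::Ok`, and `RawError::Network` come from the description above, while `MockTransport`, `Connection`, `NetState::Closed`, the two-byte header, and the use of `std::io::ErrorKind` are assumptions made for illustration. `handle_rx()` is folded into `close_with()` here for brevity.

```rust
use std::collections::VecDeque;
use std::io::ErrorKind;

/// Stand-in error type; only the `Network(TimedOut)` shape is from the PR text.
#[derive(Debug, PartialEq)]
enum RawError {
    Network(ErrorKind),
}

#[derive(Debug, PartialEq)]
enum NetState {
    Ok,
    Closed, // assumed name for the terminated state
}

/// Hypothetical mock transport: each scripted step is one byte or an error.
struct MockTransport {
    script: VecDeque<Result<u8, ErrorKind>>,
}

impl MockTransport {
    fn read_byte(&mut self) -> Result<u8, ErrorKind> {
        // An exhausted script behaves like a closed peer.
        self.script.pop_front().unwrap_or(Err(ErrorKind::UnexpectedEof))
    }
}

/// Partial header progress kept across calls (2-byte header for illustration).
#[derive(Default)]
struct HeaderState {
    buf: [u8; 2],
    filled: usize,
}

struct Connection {
    transport: MockTransport,
    header: HeaderState,
    state: NetState,
}

impl Connection {
    fn new(script: Vec<Result<u8, ErrorKind>>) -> Self {
        Connection {
            transport: MockTransport { script: script.into() },
            header: HeaderState::default(),
            state: NetState::Ok,
        }
    }

    /// Terminates the connection state machine and returns the error.
    fn close_with(&mut self, kind: ErrorKind) -> RawError {
        self.state = NetState::Closed;
        RawError::Network(kind)
    }

    /// Reads one byte at a time; a timeout leaves `header` and `state` intact.
    fn recv_header(&mut self) -> Result<[u8; 2], RawError> {
        if self.state != NetState::Ok {
            return Err(RawError::Network(ErrorKind::NotConnected));
        }
        while self.header.filled < self.header.buf.len() {
            match self.transport.read_byte() {
                Ok(byte) => {
                    self.header.buf[self.header.filled] = byte;
                    self.header.filled += 1;
                }
                // Recoverable: report the timeout but keep the connection open.
                Err(ErrorKind::TimedOut) => {
                    return Err(RawError::Network(ErrorKind::TimedOut));
                }
                // Every other error still closes the connection.
                Err(kind) => return Err(self.close_with(kind)),
            }
        }
        let header = self.header.buf;
        self.header = HeaderState::default();
        Ok(header)
    }
}

#[test]
fn recv_header_timeout_preserves_connection() {
    // The timeout fires after the first header byte has already been read.
    let mut conn = Connection::new(vec![Ok(0x30), Err(ErrorKind::TimedOut), Ok(0x02)]);
    assert_eq!(conn.recv_header(), Err(RawError::Network(ErrorKind::TimedOut)));
    assert_eq!(conn.state, NetState::Ok);
    // The next call resumes from the preserved partial progress.
    assert_eq!(conn.recv_header(), Ok([0x30, 0x02]));
}
```

Placing the timeout after the first byte exercises the cancel-safety argument directly: the second call only succeeds because the byte read before the timeout was retained.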