fix: preserve connection on read timeout in recv_header #101

Open
sbaldis wants to merge 1 commit into obabec:main from sbaldis:fix/timeout-no-close

Conversation

@sbaldis sbaldis commented Mar 29, 2026

Problem

recv_header() delegates all errors to handle_rx(), which unconditionally calls close_with() — terminating the connection state machine. This includes Network(ErrorKind::TimedOut), even though a read timeout is a recoverable condition.

In std environments, transports commonly use SO_RCVTIMEO (socket read timeout) to implement bounded polling. When the timeout fires, the transport returns ErrorKind::TimedOut. The current behavior kills the connection, requiring a full reconnect cycle (TCP + CONNECT + SUBSCRIBE) even though no actual I/O failure occurred.
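For context, bounded polling over SO_RCVTIMEO with a std transport looks roughly like the sketch below (`poll_once` is a hypothetical helper, not code from this crate). One caveat worth noting: per the `std::net::TcpStream::set_read_timeout` documentation, Unix platforms typically report a read timeout as `ErrorKind::WouldBlock`, while Windows reports `ErrorKind::TimedOut`, so a transport may need to normalize both kinds.

```rust
use std::io::{ErrorKind, Read};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

// Hypothetical helper, not part of the crate: one bounded poll attempt.
fn poll_once(stream: &mut TcpStream) -> std::io::Result<usize> {
    // SO_RCVTIMEO via the std API: a blocking read fails after 100 ms.
    stream.set_read_timeout(Some(Duration::from_millis(100)))?;
    let mut byte = [0u8; 1];
    match stream.read(&mut byte) {
        Ok(n) => Ok(n),
        // Unix reports WouldBlock, Windows TimedOut, for the same event.
        Err(e) if matches!(e.kind(), ErrorKind::TimedOut | ErrorKind::WouldBlock) => Ok(0),
        Err(e) => Err(e),
    }
}

fn main() {
    // A peer that accepts but never writes, so the read must time out.
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    let mut stream = TcpStream::connect(addr).unwrap();
    let _peer = listener.accept().unwrap();
    assert_eq!(poll_once(&mut stream).unwrap(), 0); // timed out, no I/O failure
}
```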

Fix

Intercept ErrorKind::TimedOut in recv_header() before it reaches handle_rx(). The error is still returned to the caller, but the connection remains in NetState::Ok and can resume receiving on the next call.

This is safe because recv_header() is cancel-safe — HeaderState reads one byte at a time and preserves partial progress across interruptions. A read timeout is semantically equivalent to the cancellation that HeaderState already handles.

The fix is scoped to recv_header() only. recv_body() is explicitly not cancel-safe, so its errors still go through handle_rx() on all error kinds.
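A minimal model of the resulting control flow (identifiers such as `NetState`, `RawError`, and `handle_rx` follow the description above; this is a sketch of the error path, not the crate's actual code):

```rust
use std::io::ErrorKind;

#[derive(Debug, PartialEq)]
enum NetState { Ok, Closed }

#[derive(Debug, PartialEq)]
enum RawError { Network(ErrorKind) }

struct Conn { state: NetState }

impl Conn {
    // Models handle_rx(): unconditionally tears the connection down.
    fn handle_rx(&mut self, e: RawError) -> RawError {
        self.state = NetState::Closed; // close_with()
        e
    }

    // Models the error path of recv_header() after this change.
    fn recv_header_err(&mut self, e: RawError) -> RawError {
        // Intercept read timeouts before they reach handle_rx():
        // the error is still returned, but the connection stays Ok.
        if matches!(e, RawError::Network(ErrorKind::TimedOut)) {
            return e;
        }
        self.handle_rx(e)
    }
}

fn main() {
    let mut c = Conn { state: NetState::Ok };

    let e = c.recv_header_err(RawError::Network(ErrorKind::TimedOut));
    assert_eq!(e, RawError::Network(ErrorKind::TimedOut));
    assert_eq!(c.state, NetState::Ok); // recoverable: still connected

    let e = c.recv_header_err(RawError::Network(ErrorKind::ConnectionReset));
    assert_eq!(e, RawError::Network(ErrorKind::ConnectionReset));
    assert_eq!(c.state, NetState::Closed); // hard errors still close
}
```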

Test

Adds recv_header_timeout_preserves_connection:

  1. Mock transport returns ErrorKind::TimedOut on the first read
  2. Verifies recv_header() returns RawError::Network(TimedOut)
  3. Verifies the connection is still usable — a subsequent recv_header() succeeds

All 104 existing tests continue to pass.
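The shape of the mock can be sketched as follows (a hypothetical `MockTransport` against `std::io::Read` for illustration; the real test targets the crate's own transport trait):

```rust
use std::io::{Error, ErrorKind, Read};

// Hypothetical mock, mirroring the test's shape: the first read
// times out, subsequent reads deliver a header byte.
struct MockTransport { fired: bool }

impl Read for MockTransport {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        if !self.fired {
            self.fired = true;
            return Err(Error::new(ErrorKind::TimedOut, "recv timeout"));
        }
        buf[0] = 0x30; // PUBLISH (QoS 0) fixed-header byte
        Ok(1)
    }
}

fn main() {
    let mut t = MockTransport { fired: false };
    let mut buf = [0u8; 1];

    // 1. The first read surfaces the timeout...
    let err = t.read(&mut buf).unwrap_err();
    assert_eq!(err.kind(), ErrorKind::TimedOut);

    // 2. ...and the transport is still usable afterwards.
    assert_eq!(t.read(&mut buf).unwrap(), 1);
    assert_eq!(buf[0], 0x30);
}
```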

recv_header() is cancel-safe (HeaderState reads one byte at a time and
preserves partial progress). When the transport returns
ErrorKind::TimedOut, the connection can safely resume on the next call.

Previously, all errors from header reads went through handle_rx(),
which unconditionally calls close_with() — terminating the connection
state machine. This meant a simple read timeout (e.g., from
SO_RCVTIMEO) would kill the connection, requiring a full reconnect
cycle (TCP + CONNECT + SUBSCRIBE).

This change intercepts TimedOut errors in recv_header() before they
reach handle_rx(), returning the error without terminating the
connection. The fix is scoped to recv_header only — recv_body is NOT
cancel-safe and still goes through handle_rx() on all errors.

Includes a test verifying that recv_header() returns the timeout
error but the connection remains usable for subsequent reads.
@julian-graf julian-graf (Collaborator) commented

Hey! Thanks for the PR.

Regarding the issue, I have a few questions and I'm not sure whether this is actually a problem:

  • Bounded polling can already be implemented by dropping the poll_header future. For timeouts, you can just poll the client along with a timer future and set the socket timeout to a value greater than the timer. Is this something that doesn't work in your environment? (I do agree that this is probably the easiest way to support bounded polling in a sync implementation, which is something that is planned but I can't say when it will be implemented)
  • According to your implementation, all occurrences of a timeout ErrorKind are treated as recoverable on the I/O side, which is something I can't say for sure about all embedded_io_async implementations.
  • The error handling in poll_body still treats a timeout as a hard error. As a user, I think this is surprising. While this is not something that realistically happens because once a header is received, the rest of the packet should follow in time, I think this violates the principle of least surprise. This could probably be worked around in the low level calls of embedded_io_async::Read though.

Regarding the implementation:

  • When a timeout occurs, the client will still return MqttError::Network, which is unrecoverable in the client's error semantics. Perhaps it is worth introducing a new error variant, or at least adjusting the documentation and the MqttError::is_recoverable implementation.
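The suggested adjustment could be sketched like this (variant names are illustrative only; the crate's actual MqttError has different variants, and the final design is up to the maintainers):

```rust
// Hypothetical sketch of the suggestion: a dedicated recoverable
// variant for read timeouts, reflected in is_recoverable().
#[derive(Debug, PartialEq)]
enum MqttError {
    Network,     // hard I/O failure: reconnect required
    ReadTimeout, // bounded poll elapsed: safe to call recv again
}

impl MqttError {
    fn is_recoverable(&self) -> bool {
        matches!(self, MqttError::ReadTimeout)
    }
}

fn main() {
    assert!(MqttError::ReadTimeout.is_recoverable());
    assert!(!MqttError::Network.is_recoverable());
}
```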
