Replies: 1 comment 1 reply
-
@malmetom that's not enough information for us to conclude much. All we can tell is that a quorum queue follower is behind the leader, so it asks the leader for the delta needed to reinstall the follower's Raft log. We don't know anything about what happens on the leader. Without full logs from all cluster nodes or a reasonably reliable way to reproduce, we can only guess as to what's going on, and we do not guess in this community. Guessing is a very expensive way of troubleshooting distributed infrastructure. This could be a different manifestation of #13101, which is a known open issue for quorum queues. #14237 and #14241 are two competing solutions that we'll get to some time after
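Not an official troubleshooting procedure, but as a starting point for gathering that kind of information: a minimal Go sketch that asks the management HTTP API which replica is currently the queue's leader and which members it has. It assumes the management plugin is reachable on the default port 15672; the host, credentials and queue name are placeholders, and the `leader`/`members` fields are what recent releases report for quorum queues.

```go
// Query the management HTTP API for a quorum queue's current leader and
// members. Sketch only: host, credentials and queue name are placeholders.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
)

func main() {
	vhost := url.PathEscape("/") // the default vhost, "%2F" in the API path
	queue := "xxx"

	req, err := http.NewRequest("GET",
		fmt.Sprintf("http://10.20.222.109:15672/api/queues/%s/%s", vhost, queue), nil)
	if err != nil {
		log.Fatal(err)
	}
	req.SetBasicAuth("USER", "PASSWORD")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var q map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&q); err != nil {
		log.Fatal(err)
	}

	// For quorum queues the queue object includes the current Raft leader
	// and the member nodes (field names may differ across versions).
	fmt.Println("leader: ", q["leader"])
	fmt.Println("members:", q["members"])
}
```

Running `rabbitmq-queues quorum_status` for the queue on one of the nodes gives similar per-member Raft state, which is the kind of detail worth collecting alongside full logs from every node.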
-
Getting exceptions from a client trying to consume from a quorum queue. The client error states:
Exception (541) Reason: "INTERNAL_ERROR - timed out consuming from quorum queue 'xxx' in vhost '/': {'%2F_yyy'}
The logs on one RabbitMQ node state:
2025-10-03 22:01:37.917907+02:00 [info] <0.863.0> queue 'xxx' in vhost '/': term mismatch - follower had entry at 658 with term 81 but not with term 82. Asking leader {'%2F_yyy','[email protected]'} to resend from 659
2025-10-03 22:02:07.919957+02:00 [info] <0.863.0> queue 'xxx' in vhost '/': term mismatch - follower had entry at 658 with term 81 but not with term 82. Asking leader {'%2F_yyy','[email protected]'} to resend from 659
2025-10-03 22:02:11.782388+02:00 [error] <0.188296.0> Error on AMQP connection <0.188296.0> (10.20.128.0:60716 -> 10.20.222.109:5672, vhost: '/', user: 'USER', state: running), channel 5:
2025-10-03 22:02:11.782388+02:00 [error] <0.188296.0> operation basic.consume caused a connection exception internal_error: "timed out consuming from quorum queue 'xxx' in vhost '/': {'%2F_yyy',\n '[email protected]'}"
2025-10-03 22:02:11.806517+02:00 [info] <0.188296.0> closing AMQP connection (10.20.128.0:60716 -> 10.20.222.109:5672, vhost: '/', user: 'USER', duration: '2M, 15s')
This happens again and again, and it seems like RabbitMQ cannot recover. It happens after a network partition. Any idea whether this is a potential bug or incorrect use of quorum queues?
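For context, the consumer side is an ordinary basic.consume against the quorum queue. Below is a minimal sketch (not the actual application code) of the consume-with-reconnect loop involved, using the github.com/rabbitmq/amqp091-go client; the URL, queue name and 5-second backoff are placeholders.

```go
// Consume from a quorum queue and re-dial after the connection is dropped,
// e.g. by the 541 internal_error "timed out consuming from quorum queue"
// exception shown above. Sketch only; values are placeholders.
package main

import (
	"log"
	"time"

	amqp "github.com/rabbitmq/amqp091-go"
)

func consumeWithRetry(url, queue string) {
	for {
		conn, err := amqp.Dial(url)
		if err != nil {
			log.Printf("dial failed: %v, retrying in 5s", err)
			time.Sleep(5 * time.Second)
			continue
		}

		ch, err := conn.Channel()
		if err != nil {
			log.Printf("channel open failed: %v", err)
			conn.Close()
			time.Sleep(5 * time.Second)
			continue
		}
		closed := ch.NotifyClose(make(chan *amqp.Error, 1))

		deliveries, err := ch.Consume(queue, "", false, false, false, false, nil)
		if err != nil {
			// This is where the "timed out consuming from quorum queue"
			// exception surfaces on the client; back off and try again.
			log.Printf("basic.consume failed: %v", err)
			conn.Close()
			time.Sleep(5 * time.Second)
			continue
		}

		for d := range deliveries {
			// ... process d.Body ...
			d.Ack(false)
		}

		// The deliveries channel only closes when the AMQP channel or
		// connection goes away; log the reason, then re-dial.
		if amqpErr := <-closed; amqpErr != nil {
			log.Printf("channel/connection closed: %v", amqpErr)
		}
		conn.Close()
		time.Sleep(5 * time.Second)
	}
}

func main() {
	consumeWithRetry("amqp://USER:PASSWORD@10.20.222.109:5672/", "xxx")
}
```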
Reproduction steps
Hard to say; I will add steps once I have found a way to reproduce.
Expected behavior
The cluster should be able to survive a network partition.
Additional context
Running RabbitMQ 4.1.1.