Fix hanging requests with filtered steal #3016
Co-authored-by: meowjesty <[email protected]>
Even moar stuff
Finally been through everything!
I have only nits, docs requests. The refactor seems to make the http stuff simpler, and that sparks joy.
Think there's a place with TPC instead of TCP, other than that 👍
So...
This started as a small refactor, done in the hope of fixing the hanging requests issue. I could not find any bug that could cause the problem, and only later found out that there was no problem. The repro application I was using handled only one connection at a time, and the first HTTP connection was not closed by the k8s proxy (most probably so that it could be reused later). As a result, the second request would hang on intproxy's HTTP handshake attempt. Since we want to be user friendly, this PR introduces reuse of local HTTP connections, which solves the problem. However, since it started as a refactor, it's big. Sorry.
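To illustrate the idea of reusing local connections, here is a minimal std-only sketch of a cache of idle connections with timeout-based cleanup. It is not the code from this PR: `IdleStore`, its methods, the string address key and the generic connection type `C` are all hypothetical names chosen for the example.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical cache of idle connections, keyed by destination address.
/// `C` stands in for whatever connection handle the real code uses.
struct IdleStore<C> {
    idle: HashMap<String, Vec<(C, Instant)>>,
    timeout: Duration,
}

impl<C> IdleStore<C> {
    fn new(timeout: Duration) -> Self {
        Self { idle: HashMap::new(), timeout }
    }

    /// Returns a cached connection to `addr`, if one is still fresh.
    fn take(&mut self, addr: &str, now: Instant) -> Option<C> {
        self.evict_expired(now);
        self.idle.get_mut(addr)?.pop().map(|(conn, _)| conn)
    }

    /// Stores a connection that is no longer in use, so a later request can reuse it.
    fn put(&mut self, addr: &str, conn: C, now: Instant) {
        self.idle.entry(addr.to_string()).or_default().push((conn, now));
    }

    /// Drops every connection that has been idle for `timeout` or longer.
    fn evict_expired(&mut self, now: Instant) {
        let timeout = self.timeout;
        self.idle.retain(|_, conns| {
            conns.retain(|(_, since)| now.duration_since(*since) < timeout);
            !conns.is_empty()
        });
    }
}

/// Usage: a connection is reused within the timeout and dropped after it.
fn example() -> (Option<&'static str>, Option<&'static str>) {
    let mut store = IdleStore::new(Duration::from_secs(3));
    let start = Instant::now();

    store.put("127.0.0.1:80", "conn-1", start);
    let reused = store.take("127.0.0.1:80", start + Duration::from_secs(1));

    store.put("127.0.0.1:80", "conn-2", start);
    let expired = store.take("127.0.0.1:80", start + Duration::from_secs(5));

    (reused, expired)
}
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would more likely run the cleanup from a background task or timer.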
Changes summarized:
- `StreamingBody` was moved from `mirrord-protocol` to `mirrord-intproxy` without any notable changes. There was no need for it to be in the protocol crate.
- `BodyExt` trait in `mirrord-protocol` was renamed to `BatchedBody`. The only notable change is moving from a custom `Future` implementation (`FramesFut`) to using `now_or_never`. I was afraid of it in the past, now I'm not. I tested this with heavy load and did not detect any difference. Using `now_or_never` simplifies things, because some code no longer needs to be async.
- `HttpRequest<StreamingBody>` type, to remove ugly generics and match expressions.
- `HttpRequestFallback` enum, along with lots of conversion code, was removed from `mirrord-protocol`.
- `HttpResponseFallback` type was moved to the agent without any notable changes. There was no need for it to be in the protocol crate.
- `ReversePortForwarder` and its tests were fixed. It was never streaming responses' bodies, because `IncomingProxy` was not notified about the agent protocol version. This change is not related to the issue, but the problem came up in the CI.
- Removed the `h2::Error::is_reset` check and the dependency on `h2` completely. Instead of checking if the HTTP error is transient, we check if it's not transient (using `hyper::Error` methods, e.g. `hyper::Error::is_user`). I think it's simpler and safer, since retrying a request is not harmful.
- Added `BoundTcpSocket` struct to the incoming proxy, which wraps the logic for binding to the same interface as the user socket. Now we can actually see the bound socket address in tracing.
- Added `ClientStore` struct that caches unused local HTTP connections and cleans them up after some timeout.
- `HttpGatewayTask` inside `IncomingProxy`. To reuse connections, they share a `ClientStore` instance.
- `IncomingProxy`. Each connection is handled by its own `TcpProxyTask`. The task knows whether the connection is stolen or mirrored. If it's mirrored, the data is no longer sent to the main `IncomingProxy` task; it is immediately discarded. If it's stolen, the connection is no longer artificially kept alive until silent for a second (this mechanism makes sense only in mirror mode and can introduce weird behavior in steal mode).
- `Interceptor` task was removed completely; now we have two separate tasks: `HttpGatewayTask` and `TcpProxyTask`.
- `MetadataStore` was moved to its own module without any notable changes.
- `IncomingProxy` now optimizes the HTTP response variant. If the whole response body is available when the response head is received, we no longer send the chunked response variant. Instead we respond with the framed variant. This allows us to use only one `mirrord_protocol` message.
- `IncomingProxy` now does subscription checks when receiving a new connection/request. If we receive a connection/request on a remote port that we no longer subscribe to, we unsubscribe immediately, without attempting to connect to the user application.
- `IncomingProxy`, e.g. added time spent on polling response frames.
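
For readers unfamiliar with `now_or_never` (provided by the `futures` crate as `FutureExt::now_or_never`), the sketch below shows the underlying idea using only the standard library: poll a future exactly once and keep the result only if it was ready immediately. This is an illustration, not the code from this PR and not the `futures` implementation; `poll_once`, `NoopWake` and `example` are hypothetical names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing: the future is polled once and never woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical std-only stand-in for `now_or_never`: polls `fut` once and
/// returns its output only if it was ready immediately.
fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(output) => Some(output),
        // Not ready yet: the future is dropped without being driven further.
        Poll::Pending => None,
    }
}

/// Usage: a ready future yields its value, a pending one yields `None`.
fn example() -> (Option<u32>, Option<u32>) {
    let ready = poll_once(std::future::ready(7));
    let pending = poll_once(std::future::pending::<u32>());
    (ready, pending)
}
```

Because `poll_once` is a plain function, callers that only want the data that is already available do not have to be `async` themselves, which matches the simplification described in the `BatchedBody` item above.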