@makasim makasim commented Nov 20, 2025

A specially crafted remote-write request can declare an extremely large decoded length while providing only a small encoded payload. Prometheus allocates memory based on the declared decoded size, so a single request can trigger an allocation of ~2.5 GB. A few such requests are enough to crash the process with OOM.

Here's the script that can be used to reproduce the issue:

echo "97eab4890a170a085f5f6e616d655f5f120b746573745f6d6574726963121009000000000000f03f10d48fc9b2a333" \
  | xxd -r -p \
  | curl -X POST \
      "http://127.0.0.1:9090/api/v1/write" \
      -H "Content-Type: application/x-protobuf" \
      -H "Content-Encoding: snappy" \
      -H "X-Prometheus-Remote-Write-Version: 0.1.0" \
      --data-binary @-

This change adds a hard limit: the requested decoded length must be less than 32 MB. Requests exceeding the limit are rejected with HTTP 400 before any allocation occurs.

@makasim makasim force-pushed the fix-snappy-bug branch 2 times, most recently from c5b4a57 to 18310e4 Compare November 20, 2025 13:38
Member

@kakkoyun kakkoyun left a comment

Thanks a lot for the contribution.

We need some documentation.

}
}

var maxDecodedSize = 32 * 1024 * 1024
Member

Should we document how we came up with this number?
Is there another source of truth we need to be in sync with?

Author

@makasim makasim Nov 21, 2025

I used the maximum remote-write request size that VictoriaMetrics uses as the constant's value. In VictoriaMetrics, users can override it via the -maxInsertRequestSize flag if needed. Prometheus previously had no limit, so enforcing 32 MB might break some setups.

I need some guidance here: should we raise or lower the value, or make it explicitly configurable?

Author

makasim commented Nov 21, 2025

We need some documentation.

I can take a look. Can you please point out where I can find the docs' sources?

Member

@bwplotka bwplotka left a comment

Thanks!

We need some documentation.

I think @kakkoyun meant just a clear commentary.

We also need a unit test, please. Thank you very much for adding this 💪🏽 It's important!

@kakkoyun
Member

I think @kakkoyun meant just a clear commentary.

Sorry if that was not clear. Yes, that is what I meant.

Author

makasim commented Nov 25, 2025

Added a description for the constant and middleware tests. Please review.

@makasim makasim requested review from bwplotka and kakkoyun November 25, 2025 16:57
Member

@kakkoyun kakkoyun left a comment

LGTM!

Thanks for addressing the comments.

Could you add a draft entry to the CHANGELOG? We should mark this as a breaking change, I believe.

Also, you need to sign the DCO.

A specially crafted remote-write request can declare an extremely large
decoded length while providing only a small encoded payload. Prometheus
allocates memory based on the declared decoded size, so a single request
can trigger an allocation of ~2.5 GB. A few such requests are enough to
crash the process with OOM.

Here's the script that can be used to reproduce the issue:

echo "97eab4890a170a085f5f6e616d655f5f120b746573745f6d6574726963121009000000000000f03f10d48fc9b2a333" \
  | xxd -r -p \
  | curl -X POST \
      "http://127.0.0.1:9090/api/v1/write" \
      -H "Content-Type: application/x-protobuf" \
      -H "Content-Encoding: snappy" \
      -H "X-Prometheus-Remote-Write-Version: 0.1.0" \
      --data-binary @-

This change adds a hard limit: the requested decoded length must be less
than 32 MB. Requests exceeding the limit are rejected with HTTP 400
before any allocation occurs.

Signed-off-by: Max Kotliar <[email protected]>
Author

makasim commented Nov 26, 2025

Added a changelog entry; rebased and squashed.

Author

makasim commented Nov 27, 2025

@kakkoyun kind reminder
