
Conversation

sducouedic
Collaborator

Consecutive prefill operations are interleaved with a decode step to minimize interruptions of currently running requests. This mitigates peaks in inter-token latency (ITL). A prefill is skipped if the previous step was also a prefill.
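A minimal sketch of the scheduling rule, for illustration only (the names below are hypothetical, not the actual implementation):

# Hypothetical sketch: prefills alternate with decodes once requests are running.
def can_schedule_prefill(last_step_was_prefill: bool,
                         num_running_requests: int) -> bool:
    # With no running requests there is no decode to protect, so allow the prefill.
    if num_running_requests == 0:
        return True
    # Otherwise skip the prefill if the previous step was also a prefill,
    # so every prefill is followed by at least one decode step.
    return not last_step_was_prefill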

@github-actions

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: make sure your code passes all the linting checks, otherwise your PR can't be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

max_prompt_batch_size = 1
max_context_len = self.scheduler_config.max_model_len

# two consecutive prefill steps are now allowed
Collaborator


nit: I guess you mean "not allowed", not "now allowed", in the comment

@maxdebayser
Collaborator

I think it makes sense to rate limit the prefills, but shouldn't this be regulated by how many requests are in the batch? Let's say the current batch is completely empty and I send N requests: then it will take 2*N steps to ramp up to full compute utilization.
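Back-of-the-envelope illustration of that ramp-up, under the simplifying assumption that the batch starts empty and each prefill step admits exactly one request (names here are hypothetical):

# Simple simulation of the alternation rule (assumptions as above).
def steps_to_admit_all(n_requests: int) -> int:
    steps = 0
    admitted = 0
    last_was_prefill = False
    while admitted < n_requests:
        if admitted == 0 or not last_was_prefill:
            admitted += 1              # prefill step: admit one new request
            last_was_prefill = True
        else:
            last_was_prefill = False   # decode step for the already-running requests
        steps += 1
    return steps

print(steps_to_admit_all(8))  # 15 steps, i.e. roughly 2*N before all N requests are running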
