[Frontend][RFC] Rust front-end integration#40848
[Frontend][RFC] Rust front-end integration#40848njhill wants to merge 15 commits intovllm-project:mainfrom
Conversation
Signed-off-by: Nick Hill <nickhill123@gmail.com> Co-authored-by: Bugen Zhao <i@bugenzhao.com> Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
Documentation preview: https://vllm--40848.org.readthedocs.build/en/40848/ |
There was a problem hiding this comment.
Code Review
This pull request introduces a Rust-based frontend to vLLM, adding a git submodule and updating the build system to support native extensions via setuptools-rust. It includes a new process manager for the Rust binary and environment variables for configuration. Feedback identifies a future-dated toolchain version, fragile environment variable parsing, and an inconsistent path to the Rust manifest file in the setup configuration.
| @@ -0,0 +1,2 @@ | |||
| [toolchain] | |||
| channel = "nightly-2026-03-10" | |||
There was a problem hiding this comment.
BTW, why is the nightly toolchain required?
There was a problem hiding this comment.
It's because we're heavily using coroutines which are still unstable (in grammar) features yet for async stream.
There was a problem hiding this comment.
I'm concerned about the use of nightlies and experimental coroutine_trait language feature. It comes with a risk and will inflict pain on downstream rebuilds. I would prefer to have vLLM use a stable MSRV instead.
There was a problem hiding this comment.
Regarding coroutine-style async streams, I think there are several alternatives that work under stable toolchain, at the cost of a bit performance overhead, which should be acceptable.
I would agree that there would be much less concern if we can find an alternative to get rid of unstable features and switch to a stable toolchain.
There was a problem hiding this comment.
Thank you for considering to use Rust stable feature! It would help our downstream testing and rebuilds a lot.
There was a problem hiding this comment.
Updates: I've replaced all (actually only a few) unstable language features used by the Rust frontend to stable alternative libraries / desugared syntax, so it now builds with stable toolchain!
| When enabled, resolves VLLM_RUST_FRONTEND_PATH ("auto" by default) | ||
| to the actual binary path. | ||
| """ | ||
| use_rust = bool(int(os.environ.get("VLLM_USE_RUST_FRONTEND", "0"))) |
There was a problem hiding this comment.
The boolean conversion bool(int(os.environ.get("VLLM_USE_RUST_FRONTEND", "0"))) is fragile. It will raise a ValueError if the environment variable is set to common boolean strings like "true" or "false". Consider using a more robust check similar to how VLLM_USE_PRECOMPILED_RUST is handled at line 572.
| use_rust = bool(int(os.environ.get("VLLM_USE_RUST_FRONTEND", "0"))) | |
| use_rust = os.environ.get("VLLM_USE_RUST_FRONTEND", "").strip().lower() in ("1", "true") |
| rust_extensions = [ | ||
| RustExtension( | ||
| target="vllm.vllm-rs", | ||
| path="rust/src/cmd/Cargo.toml", |
There was a problem hiding this comment.
The path to the Cargo.toml manifest for the RustExtension is inconsistent with the build script. build_rust.sh uses the manifest at the root of the submodule (rust/Cargo.toml), while this points to a sub-directory. This may lead to build failures or missing workspace dependencies if the submodule is structured as a workspace.
| path="rust/src/cmd/Cargo.toml", | |
| path="rust/Cargo.toml", |
There was a problem hiding this comment.
IIRC we cannot pass a virtual (workspace) manifest here, so this comment seems invalid.
Signed-off-by: Nick Hill <nickhill123@gmail.com>
| @@ -0,0 +1,3 @@ | |||
| [submodule "rust"] | |||
| path = rust | |||
There was a problem hiding this comment.
Nit: what about "rsrc" to match "csrc"? Or just putting it as a subdir to avoid adding another top level dir, like "csrc/rust/"
There was a problem hiding this comment.
Yeah we can decide on the best name. I considered this but it seemed to be non-standard. But now I realize csrc isn't really standard either so maybe you're right and rsrc would be better. I'm not keen on putting it under csrc.
There was a problem hiding this comment.
Personally I don't find it a common convention to use rsrc for the directory name of rust sources, also putting it under csrc will be more confusing. I would still incline to rust for this.
Signed-off-by: Nick Hill <nickhill123@gmail.com>
There was a problem hiding this comment.
Should it be moved under the vLLM organization? Would it be donated from the Inferact organization to the vLLM organization? Also, the vllm-frontend-rs repository does not have a formal LICENSE file.
There was a problem hiding this comment.
@chaunceyjiang yes we are just staging it in the Inferact org initially, where we had been experimenting.
It seems like the general preference (including mine) is to have the code live within a subdir of this main vllm repo (i.e. the submodule in this PR will contain the actual code that's currently in the vllm-frontend-rs repo). In which case there's no need to have a new repo under the vllm org anyhow.
Good point re the license, will add that asap.
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
| # checkout does not recurse submodules, and the Dockerfile only sees what's in | ||
| # the build context, so initialize the submodule here before building. | ||
| git submodule sync --recursive | ||
| git submodule update --init --recursive |
There was a problem hiding this comment.
until this point we've tried to avoid using git submodules as the UX can be a bit rough; we've preferred to use use cmakes FetchContent. I think it would be worth looking into that I think. At the very least i guess its probably worth discussing the use of git submodules given this would be the first (im personally Im generally ok with it but I know there's varying opinions in the community)
cc @tlrmchlsmth
There was a problem hiding this comment.
I do have pretty strong preferences against using submodules. At NeuralMagic in the DeepSparse days we found stale submodules to be a very annoying footgun.
If we're moving the code to in-tree it seems like we could avoid this. Or use FetchContent like Lucas suggests
There was a problem hiding this comment.
@LucasWilkinson I also agree that we should avoid using a submodule. This was only done here to stage things initially so that we could keep/show the integration scaffolding in this PR and still allow folks to pull it and build/try it.
We still need to have some final decision on where the rust code should live, which I think so far is leaning towards living in a subdir of the main vllm repo since it's tightly coupled and considered primarily an "internal" component (see discussion related to this in #40846)
One thing that @BugenZhao is a concerned about is how we retain the commit history though if we do move the code here, since there's already a fair amount of them in https://github.com/Inferact/vllm-frontend-rs. There's quite a lot of code now and it is useful to be able to see the provenance of different parts of it.
If we do want to try to keep it, the commits can be recreated in this repo with updated paths, but it would mean having a whole bunch of new commits on main in one go, or possibly doing a merge commit (rather than squash-merge) which we haven't used in the past. Would welcome any opinions/thoughts about this.
# Conflicts: # docker/Dockerfile Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
See corresponding RFC for introducing a rust-based alternative front-end process in vLLM #40846.
For now we have staged the poc implementation in https://github.com/Inferact/vllm-frontend-rs. This PR contains the logic to integrate it into vLLM.
You can try it out by building this branch:
Co-authored with @BugenZhao