
feat: implement --served-model-name CLI option (#158)

Merged
BugenZhao merged 1 commit into Inferact:main from ericcurtin:feat/served-model-name on May 6, 2026

Conversation

@ericcurtin (Contributor) commented on May 5, 2026

Summary

  • Add `--served-model-name` CLI argument (zero or more values) to expose model aliases via the OpenAI API, matching vllm's behavior (see the CLI sketch after this list)
  • `GET /v1/models` now returns one entry per served name; all endpoints accept any served name in the `model` request field and echo back the first (primary) name in responses
  • Removed `served_model_name` from `UnsupportedArgs` now that it is fully implemented
  • `AppState::new` asserts (not just debug-asserts) that the served names list is non-empty, so misconfiguration fails fast in release builds too
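
For context, a minimal sketch of what such an argument can look like with clap's derive API; the `ServeArgs` struct, its field names, and the `served_names` helper are illustrative assumptions, not the actual code in src/cmd/src/cli.rs:

```rust
use clap::Parser;

/// Illustrative CLI definition (names are assumptions, not the PR's code).
#[derive(Parser)]
struct ServeArgs {
    /// Backend model path or ID, e.g. "Qwen/Qwen3-0.6B".
    model: String,

    /// Zero or more aliases to expose via the OpenAI API.
    /// The first value is the primary name echoed in responses.
    #[arg(long = "served-model-name", num_args = 0..)]
    served_model_name: Vec<String>,
}

impl ServeArgs {
    /// Resolve the served names, falling back to the backend model
    /// path when no aliases were given (the default behavior below).
    fn served_names(&self) -> Vec<String> {
        if self.served_model_name.is_empty() {
            vec![self.model.clone()]
        } else {
            self.served_model_name.clone()
        }
    }
}
```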

Behavior

When no `--served-model-name` is given, the backend model path is used as the single served name (no change in default behavior).

```
vllm-rs serve Qwen/Qwen3-0.6B --served-model-name qwen3 my-alias
```

  • `GET /v1/models` → returns `qwen3` and `my-alias`
  • `POST /v1/chat/completions` with `"model": "my-alias"` → accepted; the response contains `"model": "qwen3"` (see the validation sketch below)
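
A hedged sketch of the acceptance/echo semantics above; the `AppState` shape here is an assumption (the real struct lives in src/server/src/state.rs and may differ), but it shows the invariant the hard assert protects:

```rust
/// Illustrative server state; field and method names are assumptions.
struct AppState {
    /// Non-empty list; index 0 is the primary name echoed in responses.
    served_model_names: Vec<String>,
}

impl AppState {
    fn new(served_model_names: Vec<String>) -> Self {
        // A hard assert! rather than debug_assert!, so an empty list
        // also fails fast in release builds.
        assert!(
            !served_model_names.is_empty(),
            "served model names must be non-empty"
        );
        Self { served_model_names }
    }

    /// Accept a request only if `model` matches any served name.
    fn validate_model(&self, requested: &str) -> Result<(), String> {
        if self.served_model_names.iter().any(|n| n == requested) {
            Ok(())
        } else {
            Err(format!("model '{requested}' is not served"))
        }
    }

    /// The primary (first) name, echoed back in the `model` field
    /// of responses regardless of which alias the request used.
    fn primary_model_name(&self) -> &str {
        &self.served_model_names[0]
    }
}
```

With the example invocation above, `validate_model("my-alias")` passes and `primary_model_name()` returns `"qwen3"`.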

Review notes

`EngineCoreClientConfig::model_name` intentionally keeps the backend model path (`config.model`) rather than the first served alias: it is used for the engine protocol handshake, not for API labeling.
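
A sketch of that separation, assuming `EngineCoreClientConfig` has the `model_name` field described here (its other fields are omitted and the construction site is hypothetical):

```rust
/// Trimmed to the one field under discussion; the real config has more.
struct EngineCoreClientConfig {
    /// Identifies the model in the engine protocol handshake.
    model_name: String,
}

fn build_engine_client_config(backend_model_path: &str) -> EngineCoreClientConfig {
    EngineCoreClientConfig {
        // Deliberately the backend path (`config.model`), never the first
        // served alias: this name labels the engine handshake, while the
        // served names only label the OpenAI-facing API.
        model_name: backend_model_path.to_owned(),
    }
}
```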

@gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces support for multiple served model names in the API, allowing the server to respond to several identifiers while designating the first as the primary ID for responses. Changes include adding the `--served-model-name` CLI argument, updating the server configuration and state to handle a list of names, and modifying request-validation logic across the HTTP and gRPC routes. Additionally, the `/v1/models` endpoint now returns all configured names. Feedback was provided regarding the use of `debug_assert!` for validating the non-empty invariant of served model names, suggesting a more robust check for production builds.

Comment thread on src/server/src/state.rs (outdated)
@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 13e0d4e359

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread on src/server/src/lib.rs
Comment thread on src/server/src/state.rs (outdated)
@ericcurtin force-pushed the feat/served-model-name branch from 13e0d4e to e290867 on May 5, 2026 14:18
@ericcurtin (Contributor, Author) commented:

@BugenZhao PTAL

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e29086775a


Comment thread on src/cmd/src/cli.rs
@BugenZhao requested review from BugenZhao and njhill on May 5, 2026 16:11
@BugenZhao (Member) left a comment

LGTM. Thanks!

Commit message:

```
Add --served-model-name support, matching vllm's behavior:

- Accept zero or more alias names via --served-model-name
- GET /v1/models returns one entry per served name
- POST completions/chat endpoints accept any served name in the
  model field and echo back the first (primary) name in responses
- gRPC and /inference/v1/generate validate against all served names
- Falls back to the backend model path when no names are specified
- Removed served_model_name from UnsupportedArgs now that it is
  fully implemented
```
@ericcurtin force-pushed the feat/served-model-name branch from e290867 to 26449b1 on May 6, 2026 08:17
@BugenZhao enabled auto-merge (squash) on May 6, 2026 13:28
@BugenZhao merged commit b7bc63d into Inferact:main on May 6, 2026
3 checks passed
@ericcurtin deleted the feat/served-model-name branch on May 8, 2026 10:38