
Add vLLM NIXL PD smoke compatibility #313

Merged
cui36 merged 2 commits into ovg-project:main from AAbouzeid:fix/pd-disagg-nixl-connector-minimal
Jun 1, 2026

Conversation

AAbouzeid commented Apr 21, 2026

Contributor

Summary

  • add an eager vLLM NixlConnector compatibility patch for kvcached PD disaggregation
  • align NIXL registration with kvcached KV tensor block counts and keep kvcached NIXL on the NHD path
  • add a hard guard for KVCACHED_CONTIGUOUS_LAYOUT=true with NIXL to avoid silent KV corruption
  • add a GPU smoke script that compares plain vLLM+NIXL against kvcached+NIXL
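
The hard guard from the third bullet could look roughly like the sketch below. It is illustrative only: the function name `check_nixl_layout_compat`, the helper `_env_flag`, and the set of accepted truthy spellings are assumptions made for this example, not the code merged in this PR. Only the `KVCACHED_CONTIGUOUS_LAYOUT` variable and the `NixlConnector` name come from the summary above.

```python
import os
from typing import Optional


def _env_flag(name: str) -> bool:
    """Return True if the environment variable holds a truthy spelling.

    The accepted spellings are an assumption made for this sketch.
    """
    return os.environ.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


def check_nixl_layout_compat(kv_connector: Optional[str]) -> None:
    """Fail fast instead of risking silent KV corruption (hypothetical helper)."""
    if kv_connector != "NixlConnector":
        return  # The guard only matters when NIXL moves the KV blocks.
    if _env_flag("KVCACHED_CONTIGUOUS_LAYOUT"):
        raise RuntimeError(
            "KVCACHED_CONTIGUOUS_LAYOUT=true is not supported with NixlConnector; "
            "unset it or set it to false for PD disaggregation."
        )
```

Raising at startup, rather than logging a warning, means a misconfigured deployment stops before any KV transfer happens.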

Validation

  • python -m pytest tests/test_vllm_nixl_compat.py -q
  • bash -n tools/run_vllm_nixl_pd_smoke.sh
  • pod smoke: INSTALL_VLLM=0 ./tools/run_vllm_nixl_pd_smoke.sh
  • pod semantic smoke prompts: arithmetic retrieval, ticket lookup, checksum extraction; each passed with and without kvcached

cui36 commented Apr 25, 2026

Collaborator

/gemini review

gemini-code-assist (bot) left a comment

Contributor

Code Review

This pull request implements monkey-patches for vLLM's NixlConnector to ensure compatibility with kvcached during PD disaggregation. The changes force an NHD layout and synchronize the physical block count in NixlConnectorWorker with kvcached's internal allocation. A review comment identifies that the current block count synchronization only handles over-allocation and suggests updating the logic to handle any mismatch to avoid assertion failures.
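
The reviewer's point about handling any mismatch, not only over-allocation, can be illustrated with a small sketch. `ConnectorWorkerStub` and `sync_num_blocks` are hypothetical stand-ins written for this example; they are not vLLM's `NixlConnectorWorker` or the code in `autopatch.py`, and the real attribute names may differ.

```python
from dataclasses import dataclass


@dataclass
class ConnectorWorkerStub:
    """Stand-in for the connector worker; only the field this sketch touches."""

    num_blocks: int


def sync_num_blocks(worker: ConnectorWorkerStub, kv_tensor_num_blocks: int) -> bool:
    """Make the worker's block count match the KV tensors.

    Returns True if the worker's value was changed.
    """
    if kv_tensor_num_blocks <= 0:
        raise ValueError("KV tensor block count must be positive")
    # Compare with != rather than > or <, so a mismatch in either
    # direction is reconciled before any later equality assertion runs.
    if worker.num_blocks != kv_tensor_num_blocks:
        worker.num_blocks = kv_tensor_num_blocks
        return True
    return False
```

With a one-sided comparison, a mismatch in the unhandled direction would pass through untouched and could still trip a downstream equality check.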

Comment thread kvcached/integration/vllm/autopatch.py Outdated
@AAbouzeid

Contributor Author

/gemini review

gemini-code-assist (bot) left a comment

Contributor

Code Review

This pull request introduces patches for vLLM's NixlConnector to ensure compatibility with kvcached PD disaggregation. It addresses layout incompatibilities by forcing NHD and resolves block count assertion failures by dynamically updating the connector's block count using a new tracking variable. Review feedback suggests that using a module-level global variable for this state is brittle and may cause issues in multi-engine or multi-tenant scenarios, recommending a more explicit configuration management approach instead.
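
One way to read the suggestion is to key the tracked block count by engine instead of keeping a single module-level value. The sketch below is an assumption about what such state management could look like; `KVBlockInfo`, `KVBlockRegistry`, and the engine IDs are hypothetical names, and this page does not show how the PR finally resolved the comment.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class KVBlockInfo:
    """Immutable record of one engine's KV block count (hypothetical)."""

    num_blocks: int


class KVBlockRegistry:
    """Per-engine block counts, so two engines cannot overwrite each other."""

    def __init__(self) -> None:
        self._by_engine: Dict[str, KVBlockInfo] = {}

    def register(self, engine_id: str, num_blocks: int) -> None:
        self._by_engine[engine_id] = KVBlockInfo(num_blocks)

    def num_blocks_for(self, engine_id: str) -> int:
        try:
            return self._by_engine[engine_id].num_blocks
        except KeyError:
            raise LookupError(
                f"no KV block count registered for engine {engine_id!r}"
            ) from None


# The registry would be owned by whoever constructs the engines and passed
# to the patch explicitly; the IDs and counts below are made-up examples.
registry = KVBlockRegistry()
registry.register("prefill-0", 4096)
registry.register("decode-0", 2048)
```

A single global would hold only the most recently written count, which is the multi-engine hazard the review describes; keyed state keeps each engine's value separate.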

Comment thread kvcached/integration/vllm/interfaces.py Outdated
Comment thread kvcached/integration/vllm/autopatch.py Outdated
Comment thread kvcached/integration/vllm/autopatch.py Outdated
Patch vLLM's NixlConnector path for kvcached by using the NHD layout, reconciling registered KV block counts with kvcached tensors, and rejecting kvcached's contiguous layout for NIXL because vLLM registers per-layer K/V regions as block-contiguous memory.

Add focused unit coverage and an end-to-end smoke script that compares plain vLLM+NIXL against kvcached+NIXL.
AAbouzeid force-pushed the fix/pd-disagg-nixl-connector-minimal branch from 53ba7ba to eae4a9f on May 24, 2026 19:53
AAbouzeid changed the title from "Patch NixlConnector for kvcached PD disaggregation (closes #302)" to "Add vLLM NIXL PD smoke compatibility" on May 24, 2026
cui36 merged commit 74887c9 into ovg-project:main on Jun 1, 2026
6 checks passed

3 participants