Add vLLM NIXL PD smoke compatibility #313
/gemini review
Code Review
This pull request implements monkey-patches for vLLM's NixlConnector to ensure compatibility with kvcached during PD disaggregation. The changes force an NHD layout and synchronize the physical block count in NixlConnectorWorker with kvcached's internal allocation. A review comment identifies that the current block count synchronization only handles over-allocation and suggests updating the logic to handle any mismatch to avoid assertion failures.
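The difference between the two synchronization strategies mentioned in this review can be sketched as follows. This is a minimal illustration, not the patch itself: the function names and the plain-integer interface are hypothetical, and the real NixlConnectorWorker state is not modeled here.

```python
def sync_over_allocation_only(registered_blocks: int, actual_blocks: int) -> int:
    """Hypothetical helper: only corrects the count when the connector
    believes it has MORE blocks than kvcached actually allocated."""
    if registered_blocks > actual_blocks:
        return actual_blocks
    # An under-count is left untouched, so a later equality assertion
    # against the real tensor shape could still fail.
    return registered_blocks


def sync_any_mismatch(registered_blocks: int, actual_blocks: int) -> int:
    """Hypothetical helper: adopts the actual count on any mismatch,
    whether the connector over- or under-counted."""
    if actual_blocks <= 0:
        raise ValueError("actual_blocks must be positive")
    if registered_blocks != actual_blocks:
        return actual_blocks
    return registered_blocks
```

With `registered_blocks=100` and `actual_blocks=120`, the first helper leaves the stale value in place, while the second adopts the allocated count.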
/gemini review
Code Review
This pull request introduces patches for vLLM's NixlConnector to ensure compatibility with kvcached PD disaggregation. It addresses layout incompatibilities by forcing NHD and resolves block count assertion failures by dynamically updating the connector's block count using a new tracking variable. Review feedback suggests that using a module-level global variable for this state is brittle and may cause issues in multi-engine or multi-tenant scenarios, recommending a more explicit configuration management approach instead.
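One way to picture the "more explicit" alternative to a module-level global is a small registry keyed by engine, so that two engines in the same process cannot overwrite each other's block count. The sketch below is an assumption about what such a design might look like; `BlockCountRegistry`, `record`, `lookup`, and the string engine IDs are all hypothetical names and do not come from vLLM or kvcached.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BlockCountRegistry:
    """Hypothetical per-engine store for the physical KV block count."""

    _counts: Dict[str, int] = field(default_factory=dict)

    def record(self, engine_id: str, num_blocks: int) -> None:
        # Reject nonsensical values early instead of at registration time.
        if num_blocks <= 0:
            raise ValueError("num_blocks must be positive")
        self._counts[engine_id] = num_blocks

    def lookup(self, engine_id: str) -> int:
        # Fail loudly if nothing was recorded for this engine, rather
        # than silently reusing another engine's value.
        if engine_id not in self._counts:
            raise KeyError(f"no block count recorded for engine {engine_id!r}")
        return self._counts[engine_id]


registry = BlockCountRegistry()
registry.record("prefill-0", 4096)
registry.record("decode-0", 8192)
```

With a single global, the second `record` call would have replaced the first value; here each engine keeps its own count.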
Patch vLLM's NixlConnector path for kvcached by using the NHD layout, reconciling registered KV block counts with kvcached tensors, and rejecting kvcached's contiguous layout for NIXL because vLLM registers per-layer K/V regions as block-contiguous memory. Add focused unit coverage and an end-to-end smoke script that compares plain vLLM+NIXL against kvcached+NIXL.
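The rejection of the contiguous layout described above could be expressed as a startup guard along these lines. This is only a sketch: `check_nixl_layout`, the `uses_nixl` flag, and the set of accepted truthy strings are assumptions for illustration, and how kvcached actually parses `KVCACHED_CONTIGUOUS_LAYOUT` is not shown in this PR text.

```python
from typing import Mapping

# Assumed set of values treated as "enabled"; the real parser may differ.
_TRUTHY = {"1", "true", "yes", "on"}


def check_nixl_layout(env: Mapping[str, str], uses_nixl: bool) -> None:
    """Hypothetical guard: refuse to start when the contiguous layout is
    requested together with NIXL, instead of risking corrupted KV transfers."""
    raw = env.get("KVCACHED_CONTIGUOUS_LAYOUT", "false")
    contiguous = raw.strip().lower() in _TRUTHY
    if uses_nixl and contiguous:
        raise RuntimeError(
            "KVCACHED_CONTIGUOUS_LAYOUT=true is not supported with NIXL: "
            "per-layer K/V regions are registered as block-contiguous memory"
        )
```

Passing the environment as a mapping rather than reading `os.environ` directly keeps the check easy to unit-test; a caller would typically pass `os.environ`.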
53ba7ba to eae4a9f
Summary
- Add NixlConnector compatibility patch for kvcached PD disaggregation
- Reject KVCACHED_CONTIGUOUS_LAYOUT=true with NIXL to avoid silent KV corruption
Validation
- python -m pytest tests/test_vllm_nixl_compat.py -q
- bash -n tools/run_vllm_nixl_pd_smoke.sh
- INSTALL_VLLM=0 ./tools/run_vllm_nixl_pd_smoke.sh