Commit e37b98c

Lijiachen1018 and lijiachen authored
[feat] modify monkey patch for vllm-0.9.2 with cuda (#358)
monkey patch Co-authored-by: lijiachen <[email protected]>
1 parent c94b793 commit e37b98c

12 files changed: +3467 -16 lines changed

docs/source/getting-started/installation_gpu.md

Lines changed: 1 addition & 10 deletions
@@ -44,16 +44,7 @@ export PLATFORM=cuda
 pip install -v -e . --no-build-isolation
 ```
 
-After installation, please apply patch to ensure uc_connector can be used:
-
-```bash
-cd $(pip show vllm | grep Location | awk '{print $2}')
-git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-pc.patch
-git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-aggre.patch
-git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-sparse.patch
-```
-
-Refer to this [issue](https://github.com/vllm-project/vllm/issues/21702) to see details of this patch's changes.
+**Note:** Patches are now applied automatically via dynamic patching when you import the unified-cache-management package. You no longer need to manually apply patches using `git apply`. The patches are automatically applied when you use the `UnifiedCacheConnectorV1` connector.
 
 ## Setup from docker
 Download the pre-built `vllm/vllm-openai:v0.9.2` docker image and build unified-cache-management docker image by commands below:
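
The note added in this hunk says the patches are now applied dynamically at import time instead of with `git apply`. As a rough illustration of that general technique (not the UCM patch code itself), the sketch below replaces a function on a stand-in module at runtime. The names `fake_vllm`, `schedule`, and `apply_patch` are hypothetical and exist only for this example.

```python
import types

# Stand-in for a third-party module; a real patch would target vLLM internals.
fake_vllm = types.ModuleType("fake_vllm")


def _original_schedule(tokens):
    return list(tokens)


fake_vllm.schedule = _original_schedule


def apply_patch():
    """Replace fake_vllm.schedule once; repeated calls are no-ops."""
    if getattr(fake_vllm, "_patched", False):
        return
    original = fake_vllm.schedule

    def schedule_with_marker(tokens):
        # Delegate to the original function, then add the patched behavior.
        return original(tokens) + ["[cache-hit]"]

    fake_vllm.schedule = schedule_with_marker
    fake_vllm._patched = True


apply_patch()
apply_patch()  # second call does nothing, so the wrapper is not stacked twice
result = fake_vllm.schedule(["a", "b"])
```

Keeping a "patched" flag next to the patched object is one simple way to make the operation idempotent, which matters when the patch entry point can be reached from more than one import path.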

docs/source/getting-started/installation_npu.md

Lines changed: 5 additions & 6 deletions
@@ -39,16 +39,15 @@ docker run --rm \
 -v /root/.cache:/root/.cache \
 -it $IMAGE bash
 ```
-Codes of vLLM and vLLM Ascend are placed in /vllm-workspace, you can refer to [vLLM-Ascend Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more information. After installation, please apply patches to ensure uc_connector can be used:
-```bash
-cd /vllm-workspace/vllm
-git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-pc.patch
-git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-aggre.patch
-git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-adapt-sparse.patch
+Codes of vLLM and vLLM Ascend are placed in /vllm-workspace, you can refer to [vLLM-Ascend Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more information.
 
+**Note:** For vLLM patches, they are now applied automatically via dynamic patching when you import the unified-cache-management package. However, for vLLM-Ascend, you still need to manually apply the vLLM-Ascend specific patch:
+
+```bash
 cd /vllm-workspace/vllm-ascend
 git apply /vllm-workspace/unified-cache-management/ucm/integration/vllm/patch/0.9.2/vllm-ascend-adapt.patch
 ```
+
 Refer to these issues [vllm-issue](https://github.com/vllm-project/vllm/issues/21702) and [vllm-ascend-issue](https://github.com/vllm-project/vllm-ascend/issues/2057) to see details of patches' changes.
 
 ### Build from source code

ucm/__init__.py

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+#
+# MIT License
+#
+# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+
+"""
+vLLM integration module for Unified Cache Management.
+
+This module automatically applies patches to vLLM when imported,
+eliminating the need for manual `git apply` commands.
+"""
+
+# Auto-apply patches when this module is imported
+try:
+    from ucm.integration.vllm.patch.apply_patch import ensure_patches_applied
+
+    ensure_patches_applied()
+except Exception as e:
+    # Don't fail if patches can't be applied - might be running in environment without vLLM
+    import warnings
+
+    warnings.warn(
+        f"Failed to apply vLLM patches: {e}. "
+        f"If you're using vLLM, ensure it's installed and patches are compatible."
+    )
+
+from ucm.integration.vllm.uc_connector import UnifiedCacheConnectorV1
+
+__all__ = ["UnifiedCacheConnectorV1"]
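
The `try`/`except` block in `ucm/__init__.py` turns a patching failure into a warning instead of an import error. A minimal, self-contained sketch of that pattern follows; `run_optional_setup` and `failing_setup` are hypothetical helpers for illustration and are not part of UCM.

```python
import warnings


def run_optional_setup(setup):
    """Run `setup`; on any failure emit a warning and return False."""
    try:
        setup()
    except Exception as exc:
        warnings.warn(f"Optional setup failed: {exc}", RuntimeWarning, stacklevel=2)
        return False
    return True


def failing_setup():
    # Simulates the optional dependency being absent.
    raise ImportError("No module named 'vllm'")


# Capture the warning so the outcome can be inspected programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = run_optional_setup(failing_setup)
```

Note that in the diff the later import of `UnifiedCacheConnectorV1` sits outside the `try` block, so whether `import ucm` succeeds without vLLM also depends on what that module imports.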

ucm/integration/vllm/patch/__init__.py

Whitespace-only changes.
Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
+#
+# MIT License
+#
+# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+"""
+Monkey patching module for vLLM to apply UCM patches automatically.
+This replaces the need for manual `git apply` commands.
+"""
+
+import sys
+from typing import Optional
+
+from ucm.logger import init_logger
+
+logger = init_logger(__name__)
+
+import os
+
+PLATFORM = os.getenv("PLATFORM")
+
+
+def _patch_ascend() -> bool:
+    return PLATFORM == "ascend"
+
+
+# Track if patches have been applied
+_patches_applied = False
+_import_hook_installed = False
+_vllm_version: Optional[str] = None
+_vllm_import_hook = None
+
+
+def get_vllm_version() -> Optional[str]:
+    """Detect vLLM version."""
+    global _vllm_version
+    if _vllm_version is not None:
+        return _vllm_version
+
+    try:
+        # Try to get version from vllm module
+        import vllm as vllm_pkg
+
+        _vllm_version = vllm_pkg.__version__
+        return _vllm_version
+    except ImportError:
+        logger.warning("vLLM is not installed")
+        return None
+    except Exception as e:
+        logger.warning(f"Failed to detect vLLM version: {e}")
+        return None
+
+
+def get_supported_versions() -> list[str]:
+    """Get list of supported vLLM versions."""
+    return ["0.9.2"]
+
+
+def apply_all_patches() -> None:
+    """Apply all vLLM patches based on detected version."""
+    global _patches_applied
+    if _patches_applied:
+        return
+
+    try:
+        version = get_vllm_version()
+        if version is None:
+            raise ValueError("Could not detect vLLM version")
+
+        supported_versions = get_supported_versions()
+        if version not in supported_versions:
+            logger.warning(
+                f"vLLM version {version} is not explicitly supported. "
+                f"Supported versions: {', '.join(supported_versions)}. "
+                "Patches will not be applied."
+            )
+            raise ValueError(f"vLLM version {version} is not explicitly supported")
+
+        # Apply version-specific patches
+        if version == "0.9.1":
+            _apply_patches_v091()
+        elif version == "0.9.2":
+            _apply_patches_v092()
+        else:
+            raise ValueError(f"Unsupported vLLM version: {version}")
+
+        _patches_applied = True
+        logger.info(f"All vLLM patches applied successfully for version {version}")
+    except Exception as e:
+        logger.error(f"Failed to apply vLLM patches: {e}", exc_info=True)
+        raise
+
+
+def _apply_patches_v091() -> None:
+    """Apply patches for vLLM 0.9.1."""
+    from .patch_funcs.v091.vllm_adapt import _apply_adapt_patch
+
+    _apply_adapt_patch()  # apply vllm-adapt-pc.patch
+    if _patch_ascend():
+        from .patch_funcs.v091.vllm_ascend_adapt import _apply_ascend_patch
+
+        _apply_ascend_patch()  # apply vllm-ascend-adapt.patch
+
+
+def _apply_patches_v092() -> None:
+    """Apply patches for vLLM 0.9.2."""
+    from .patch_funcs.v092.vllm_adapt import _apply_adapt_patches
+
+    _apply_adapt_patches()
+
+    if _patch_ascend():
+        from .patch_funcs.v092.vllm_ascend_adapt import _apply_ascend_patch
+
+        _apply_ascend_patch()  # apply vllm-ascend-adapt.patch
+
+
+def install_import_hook() -> None:
+    """Install an import hook to automatically apply patches when vLLM is imported."""
+    global _import_hook_installed, _vllm_import_hook
+
+    if _import_hook_installed:
+        return
+
+    try:
+        # Check if vLLM is already imported
+        if "vllm" in sys.modules:
+            # vLLM already imported, apply patches immediately
+            apply_all_patches()
+            _import_hook_installed = True
+        else:
+            # Install import hook by wrapping the builtin __import__ function
+            # This intercepts all imports and applies patches when vLLM is imported
+            import builtins
+
+            original_import = builtins.__import__
+
+            def import_hook(name, globals=None, locals=None, fromlist=(), level=0):
+                # Call original import
+                module = original_import(name, globals, locals, fromlist, level)
+
+                # If the main vLLM module is being imported, apply patches
+                # We only check for 'vllm' (not submodules) to avoid multiple patch attempts
+                if name == "vllm" and not _patches_applied:
+                    try:
+                        apply_all_patches()
+                    except Exception as e:
+                        logger.warning(f"Failed to apply patches during import: {e}")
+
+                return module
+
+            # Replace builtin __import__
+            builtins.__import__ = import_hook
+            _vllm_import_hook = import_hook
+            _import_hook_installed = True
+            logger.debug("Import hook installed to intercept vLLM imports")
+
+    except Exception as e:
+        logger.warning(f"Failed to install import hook: {e}")
+
+
+def ensure_patches_applied() -> None:
+    """Ensure patches are applied, installing import hook if needed."""
+    if not _patches_applied:
+        # Try to apply patches immediately
+        try:
+            apply_all_patches()
+        except Exception:
+            # If it fails (vLLM not imported yet), install hook
+            install_import_hook()
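
`install_import_hook` defers patching by wrapping `builtins.__import__` and reacting once the target module name is imported. The sketch below reproduces that mechanism in isolation, under a few assumptions: the standard-library module `colorsys` stands in for vLLM, and `install_post_import_hook` and `uninstall` are hypothetical names. Unlike the diff, this version also returns a function that restores the original `__import__`.

```python
import builtins


def install_post_import_hook(target, callback):
    """Wrap builtins.__import__ so `callback` runs once after `target` is imported.

    Returns a function that restores the original __import__.
    """
    original_import = builtins.__import__
    state = {"fired": False}

    def import_hook(name, globals=None, locals=None, fromlist=(), level=0):
        # Perform the real import first so the module exists before the callback.
        module = original_import(name, globals, locals, fromlist, level)
        # Match only the top-level name, and fire the callback a single time.
        if name == target and not state["fired"]:
            state["fired"] = True
            callback(module)
        return module

    builtins.__import__ = import_hook

    def uninstall():
        builtins.__import__ = original_import

    return uninstall


seen = []
uninstall = install_post_import_hook("colorsys", lambda m: seen.append(m.__name__))
try:
    import colorsys  # first import statement triggers the callback
    import colorsys  # second import statement does not trigger it again
finally:
    uninstall()
```

Wrapping `builtins.__import__` affects every import in the process, so restoring it when it is no longer needed keeps the side effects contained.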

ucm/integration/vllm/patch/patch_funcs/__init__.py

Whitespace-only changes.

ucm/integration/vllm/patch/patch_funcs/v091/__init__.py

Whitespace-only changes.
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+#
+# MIT License
+#
+# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+
+
+def _apply_adapt_patch() -> None:
+    """Apply patches for vLLM 0.9.1."""
+    raise NotImplementedError("vLLM 0.9.1 is not supported for Ascend")
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+#
+# MIT License
+#
+# Copyright (c) 2025 Huawei Technologies Co., Ltd. All rights reserved.
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+
+
+def _apply_ascend_patch() -> None:
+    """Apply patches for vLLM 0.9.1."""
+    raise NotImplementedError("vLLM 0.9.1 is not supported for Ascend")
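
The 0.9.1 stubs above pair with the version check in `apply_all_patches`, which gates on a supported-version list before dispatching to a per-version function. A compact sketch of that gate-then-dispatch idea, using a lookup table instead of an `if`/`elif` chain, is shown here. All names (`PATCHERS`, `SUPPORTED`, `apply_for_version`) are hypothetical, and the return values are placeholders so the result can be inspected.

```python
from typing import Callable, Dict


def _apply_v091() -> str:
    # Placeholder stub, analogous to the 0.9.1 functions that only raise.
    raise NotImplementedError("0.9.1 patches are not implemented")


def _apply_v092() -> str:
    return "patched-0.9.2"


# Table mapping each known version string to its patch entry point.
PATCHERS: Dict[str, Callable[[], str]] = {
    "0.9.1": _apply_v091,
    "0.9.2": _apply_v092,
}
SUPPORTED = ("0.9.2",)


def apply_for_version(version: str) -> str:
    """Reject unsupported versions first, then dispatch to the matching patcher."""
    if version not in SUPPORTED:
        raise ValueError(f"Unsupported version: {version}")
    return PATCHERS[version]()


outcome = apply_for_version("0.9.2")
```

Because "0.9.1" is absent from the supported list, the gate raises before the stub is ever called, which mirrors how the 0.9.1 branch in the diff is unreachable while `get_supported_versions` returns only "0.9.2".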

ucm/integration/vllm/patch/patch_funcs/v092/__init__.py

Whitespace-only changes.
