[CI] Fix ngram & suffix test oom #4755
Code Review
This pull request aims to fix an Out-Of-Memory (OOM) error in CI for ngram and suffix tests. The changes involve replacing direct LLM instantiation with the VllmRunner context manager, which ensures proper resource cleanup. Additionally, the multiprocessing start method is set to 'spawn' to avoid issues with NPU context inheritance. The test test_ngram_correctness is also re-enabled.
The changes are logical and address the OOM issue effectively. My main feedback is regarding how the environment variable is set. Using os.environ at the module level can introduce side effects to other tests. I've suggested a more robust approach using pytest's monkeypatch fixture to scope the change correctly.
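The cleanup guarantee described in the summary can be sketched with a plain context manager. This is a minimal illustration only: `FakeRunner`, its `live_instances` counter, and `run_test` are hypothetical names invented here, and the real `VllmRunner` in `tests.e2e.conftest` may release resources differently.

```python
import gc


class FakeRunner:
    """Hypothetical stand-in for a test runner that holds device memory."""

    live_instances = 0  # counts runners whose resources are still held

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name
        FakeRunner.live_instances += 1

    def generate(self, prompt: str) -> str:
        return f"{self.model_name}: {prompt}"

    def close(self) -> None:
        # A real runner would release the engine and device memory here.
        FakeRunner.live_instances -= 1

    def __enter__(self) -> "FakeRunner":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Runs even if the test body raises, so resources are always released.
        self.close()
        gc.collect()


def run_test(fail: bool) -> None:
    # The `with` block guarantees close() is called on every exit path.
    with FakeRunner("hypothetical-model") as runner:
        runner.generate("hello")
        if fail:
            raise AssertionError("simulated test failure")
```

With a bare constructor call, a failing assertion would skip any manual cleanup that follows it, so a later test in the same process could start while the earlier model still occupies memory.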
    from tests.e2e.conftest import VllmRunner

    os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"
Modifying os.environ directly at the module level can lead to side effects in other tests, as it's a global state change that persists throughout the pytest session. This can make tests flaky and hard to debug.
A safer and more idiomatic pytest approach is to manage the environment variable through pytest's MonkeyPatch helper. This ensures that the change is properly scoped and cleaned up after the tests in this module are done.
You could define a module-scoped autouse fixture like this, which would replace the direct modification of os.environ. Because the built-in monkeypatch fixture is function-scoped and cannot be requested from a module-scoped fixture, the sketch uses pytest.MonkeyPatch.context() instead:

    import pytest

    @pytest.fixture(scope="module", autouse=True)
    def set_spawn_method():
        # Applied once per module and undone when the module's tests finish.
        with pytest.MonkeyPatch.context() as mp:
            mp.setenv("VLLM_WORKER_MULTIPROC_METHOD", "spawn")
            yield

This would make the test suite more robust against side effects.
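The difference between a persistent and a scoped environment change can be shown without pytest at all. The sketch below swaps in the standard library's `unittest.mock.patch.dict` as a stand-in for the scoping technique discussed in this comment; the helper names `leaky_setup` and `scoped_value` are hypothetical and are not part of the PR.

```python
import os
from unittest import mock

VAR = "VLLM_WORKER_MULTIPROC_METHOD"


def leaky_setup() -> None:
    # Module-level style assignment: stays set for the rest of the process.
    os.environ[VAR] = "spawn"


def scoped_value() -> str:
    # patch.dict restores os.environ to its previous state when the block exits.
    with mock.patch.dict(os.environ, {VAR: "spawn"}):
        return os.environ[VAR]
```

After `scoped_value()` returns, the variable is back to whatever it was before the call, whereas `leaky_setup()` leaves it set for every test that runs afterwards in the same session.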
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
6e64010 to 44dd0b0
wangxiyuan
left a comment
Hope the test works now.
Signed-off-by: fluctlux <[email protected]>
44ec0db to ad3444f
really nice change. Thanks very much!!
### What this PR does / why we need it?
Avoid OOM during CI by using `with VllmRunner` instead of `LLM()`, and enable `test_ngram_correctness`.

### How was this patch tested?
CI passed.

- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e

---------

Signed-off-by: fluctlux <[email protected]>
Co-authored-by: wangxiyuan <[email protected]>
Signed-off-by: yuxingcyx <[email protected]>
What this PR does / why we need it?
Avoid OOM during CI by using `with VllmRunner` instead of `LLM()`, and enable `test_ngram_correctness`.
How was this patch tested?
Before:


After:
CI passed.