# Dryrun implementation for generating command line file #723
The script-generator module gains a `shutil_copy` helper and a `dry_run_dir` parameter on `ScriptGenerator` (hunk `@@ -1,15 +1,49 @@`):

```python
# SPDX-License-Identifier: Apache-2.0
import os
import shutil
import sys
import time
from pathlib import Path


def shutil_copy(source_file, destination_dir):
    try:
        src_path = Path(source_file)
        dst_dir_path = Path(destination_dir)

        dst_path = dst_dir_path / src_path.name

        # Ensure the destination directory exists
        dst_dir_path.mkdir(parents=True, exist_ok=True)

        shutil.copy(src_path, dst_path)
        print(f"[Info] File '{source_file}' saved at '{dst_path}'")

    except FileNotFoundError:
        print(f"Error: The source file '{source_file}' was not found.")
    except PermissionError:
        print(f"Error: Permission denied. Cannot access '{source_file}' or write to '{destination_dir}'.")
    except shutil.SameFileError:
        print("Error: Source and destination files are the same.")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")


class ScriptGenerator:

    def __init__(self,
                 template_script_path,
                 output_script_path,
                 variables,
                 log_dir="logs",
                 dry_run_dir="/local/",
                 varlist_conf_path=None):
        self.template_script_path = template_script_path
        self.varlist_conf_path = varlist_conf_path
        self.output_script_path = output_script_path
        self.variables = variables
        self.log_dir = log_dir
        self.dry_run_dir = dry_run_dir
        self.log_file = os.path.join(
            self.log_dir,
            f"{os.path.splitext(os.path.basename(self.output_script_path))[0]}.log")
```
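As a quick sanity check, the copy helper can be exercised on its own. A minimal sketch using throwaway files (the file name and contents here are illustrative, not from the PR):

```python
import tempfile
from pathlib import Path

# Simulate the dry-run copy: a generated script lands in a host-mounted dir.
with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as host_dir:
    script = Path(src_dir) / "vllm-server.sh"
    script.write_text("#!/bin/bash\necho hello\n")
    shutil_copy(str(script), host_dir)  # prints "[Info] File ... saved at ..."
    assert (Path(host_dir) / "vllm-server.sh").exists()
```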
In `create_and_run` (hunk `@@ -56,5 +90,21 @@`), setting `DRY_RUN=1` now copies the generated script to `dry_run_dir` and keeps the container alive until interrupted, instead of exec'ing the script with bash:

```python
    def create_and_run(self):
        ...
        # Run the generated script and redirect output to log file
        print(f"Starting script, logging to {self.log_file}")
        try:
            os.makedirs(self.log_dir, exist_ok=True)
        except Exception:
            print(f"Error: could not create {self.log_dir}.")

        if os.environ.get("DRY_RUN") == '1':
            # Dry run: save the command line file instead of executing it
            shutil_copy(self.output_script_path, self.dry_run_dir)

            print(f"[INFO] This is a dry run to save the command line file {self.output_script_path}.")
            try:
                while True:
                    print("[INFO] Press Ctrl+C to exit.")
                    time.sleep(60)
            except KeyboardInterrupt:
                print("Exiting the DRY_RUN execution.")
                sys.exit(0)
        else:
            os.execvp("bash", ["bash", self.output_script_path])
```
The user guide is extended to document the feature (hunk `@@ -137,6 +137,70 @@`):

This method provides full flexibility over how the vLLM server is executed within the container.
## Dry Run to create vLLM server and client command line

Set the environment variable **DRY_RUN=1**.
When `DRY_RUN` is set to `1`, a copy of the `vllm-server.sh` or `vllm-benchmark.sh` command line file is created on the host machine, without launching the server or the client.

Example - Docker Compose
```bash
MODEL="Qwen/Qwen2.5-14B-Instruct" \
HF_TOKEN="<your huggingface token>" \
DOCKER_IMAGE="vault.habana.ai/gaudi-docker/{{ VERSION }}/ubuntu24.04/habanalabs/vllm-installer-{{ PT_VERSION }}:latest" \
TENSOR_PARALLEL_SIZE=1 \
MAX_MODEL_LEN=2048 \
DRY_RUN=1 \
docker compose up
```
Example - Docker Run

```bash
docker run -it --rm \
-e MODEL=$MODEL \
-e HF_TOKEN=$HF_TOKEN \
-e DRY_RUN=1 \
-e http_proxy=$http_proxy \
-e https_proxy=$https_proxy \
-e no_proxy=$no_proxy \
--cap-add=sys_nice \
--ipc=host \
--runtime=habana \
-e HABANA_VISIBLE_DEVICES=all \
-p 8000:8000 \
-v ${PWD}:/local \
--name vllm-server \
<docker image name>
```

> **Contributor:** DRY_RUN env is missing.
>
> **Author:** Updated the command line.
!!! note
    While launching the vLLM server using the Docker Run command for a dry run, make sure to mount the present working directory as `-v ${PWD}:/local`.
## To save vLLM server and client log files

If the vLLM server is launched using the Docker Compose command, the log files are saved at `vllm-gaudi/.cd/logs/` by default.

If the vLLM server is launched using the Docker Run command, you can save the log files by creating a directory named `logs` and mounting it as `-v ${PWD}/logs:/root/scripts/logs`.
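For instance, reusing the flags from the Docker Run example above, the mount would look like this (a sketch; every flag other than the `logs` mount is taken unchanged from the earlier example):

```bash
mkdir -p logs
docker run -it --rm \
-e MODEL=$MODEL \
-e HF_TOKEN=$HF_TOKEN \
--cap-add=sys_nice \
--ipc=host \
--runtime=habana \
-e HABANA_VISIBLE_DEVICES=all \
-p 8000:8000 \
-v ${PWD}/logs:/root/scripts/logs \
--name vllm-server \
<docker image name>
```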
## To create multiple vLLM services using Docker Compose

Set environment variables **HOST_PORT** and **COMPOSE_PROJECT_NAME**.

Example
```bash
MODEL="Qwen/Qwen2.5-14B-Instruct" \
HF_TOKEN="<your huggingface token>" \
DOCKER_IMAGE="vault.habana.ai/gaudi-docker/{{ VERSION }}/ubuntu24.04/habanalabs/vllm-installer-{{ PT_VERSION }}:latest" \
TENSOR_PARALLEL_SIZE=1 \
MAX_MODEL_LEN=2048 \
HOST_PORT=9000 \
COMPOSE_PROJECT_NAME=serv1 \
docker compose up
```
!!! note
    The default values, when these variables are not set, are `HOST_PORT=8000` and `COMPOSE_PROJECT_NAME=cd`.
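Each service is then reachable on its own host port. Assuming the server exposes vLLM's standard OpenAI-compatible API (an assumption, not stated in this PR), a quick check might look like:

```bash
# Query the service started with HOST_PORT=9000
curl http://localhost:9000/v1/models
```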
## Pinning CPU Cores for Memory Access Coherence

To improve memory-access coherence and release CPUs to other CPU-only workloads, such as vLLM serving with Llama3 8B, you can pin CPU cores based on different CPU Non-Uniform Memory Access (NUMA) nodes using the automatically generated `docker-compose.override.yml` file. The following procedure explains the process.
> **Reviewer:** If we change this to "on-failure", we may not need the dry-run Ctrl+C code.
>
> **Author:** The restart condition "on-failure" is working; tested with a bad-model-name failure. The dry run does not need Ctrl+C anymore.
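For reference, the restart behavior under discussion can also be reproduced with plain Docker (the thread concerns the Compose file's restart condition; this one-liner is only an illustrative equivalent):

```bash
# Retry the container only when it exits with a non-zero status
docker run -d --restart on-failure --name vllm-server <docker image name>
```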