
Commit 0abb889

[FEAT] [DML] Update the requirements-directml.txt to the latest version (#42)
* Update the `requirements-directml.txt` to the latest version
* Update the README for the DirectML section
1 parent ff6536c commit 0abb889

File tree

- README.md
- requirements-directml.txt

2 files changed, 25 insertions(+), 11 deletions(-)


README.md

Lines changed: 23 additions & 9 deletions
@@ -7,7 +7,7 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E
 | Model architectures | Gemma <br/> Llama \* <br/> Mistral + <br/>Phi <br/> | | |
 | Platform | Linux <br/> Windows | | |
 | Architecture | x86 <br/> x64 <br/> | Arm64 | |
-| Hardware Acceleration | CUDA<br/>DirectML<br/>IpexLLM<br/>OpenVINO | QNN <br/> ROCm | |
+| Hardware Acceleration | CUDA<br/>DirectML<br/>IpexLLM | QNN <br/> ROCm | OpenVINO |

 \* The Llama model architecture supports similar model families such as CodeLlama, Vicuna, Yi, and more.

@@ -21,6 +21,8 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E
 ## Table Content

 - [Supported Models](#supported-models-quick-start)
+  - [Onnxruntime Models](./docs/model/onnxruntime_models.md)
+  - [Ipex-LLM Models](./docs/model/ipex_models.md)
 - [Getting Started](#getting-started)
   - [Installation From Source](#installation)
   - [Launch OpenAI API Compatible Server](#launch-openai-api-compatible-server)
@@ -31,10 +33,22 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E
 - [Acknowledgements](#acknowledgements)

 ## Supported Models (Quick Start)
-* Onnxruntime DirectML Models [Link](./docs/model/onnxruntime_directml_models.md)
-* Onnxruntime CPU Models [Link](./docs/model/onnxruntime_cpu_models.md)
-* Ipex-LLM Models [Link](./docs/model/ipex_models.md)
-* OpenVINO-LLM Models [Link](./docs/model/openvino_models.md)
+
+| Models | Parameters | Context Length | Link |
+| --- | --- | --- | --- |
+| Gemma-2b-Instruct v1 | 2B | 8192 | [EmbeddedLLM/gemma-2b-it-onnx](https://huggingface.co/EmbeddedLLM/gemma-2b-it-onnx) |
+| Llama-2-7b-chat | 7B | 4096 | [EmbeddedLLM/llama-2-7b-chat-int4-onnx-directml](https://huggingface.co/EmbeddedLLM/llama-2-7b-chat-int4-onnx-directml) |
+| Llama-2-13b-chat | 13B | 4096 | [EmbeddedLLM/llama-2-13b-chat-int4-onnx-directml](https://huggingface.co/EmbeddedLLM/llama-2-13b-chat-int4-onnx-directml) |
+| Llama-3-8b-chat | 8B | 8192 | [EmbeddedLLM/mistral-7b-instruct-v0.3-onnx](https://huggingface.co/EmbeddedLLM/mistral-7b-instruct-v0.3-onnx) |
+| Mistral-7b-v0.3-instruct | 7B | 32768 | [EmbeddedLLM/mistral-7b-instruct-v0.3-onnx](https://huggingface.co/EmbeddedLLM/mistral-7b-instruct-v0.3-onnx) |
+| Phi-3-mini-4k-instruct-062024 | 3.8B | 4096 | [EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx](https://huggingface.co/EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx/tree/main/onnx/directml/Phi-3-mini-4k-instruct-062024-int4) |
+| Phi3-mini-4k-instruct | 3.8B | 4096 | [microsoft/Phi-3-mini-4k-instruct-onnx](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) |
+| Phi3-mini-128k-instruct | 3.8B | 128k | [microsoft/Phi-3-mini-128k-instruct-onnx](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx) |
+| Phi3-medium-4k-instruct | 17B | 4096 | [microsoft/Phi-3-medium-4k-instruct-onnx-directml](https://huggingface.co/microsoft/Phi-3-medium-4k-instruct-onnx-directml) |
+| Phi3-medium-128k-instruct | 17B | 128k | [microsoft/Phi-3-medium-128k-instruct-onnx-directml](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct-onnx-directml) |
+| Openchat-3.6-8b | 8B | 8192 | [EmbeddedLLM/openchat-3.6-8b-20240522-onnx](https://huggingface.co/EmbeddedLLM/openchat-3.6-8b-20240522-onnx) |
+| Yi-1.5-6b-chat | 6B | 32k | [EmbeddedLLM/01-ai_Yi-1.5-6B-Chat-onnx](https://huggingface.co/EmbeddedLLM/01-ai_Yi-1.5-6B-Chat-onnx) |
+| Phi-3-vision-128k-instruct | | 128k | [EmbeddedLLM/Phi-3-vision-128k-instruct-onnx](https://huggingface.co/EmbeddedLLM/Phi-3-vision-128k-instruct-onnx/tree/main/onnx/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4) |

 ## Getting Started
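The links in the new table point at Hugging Face repositories, and several of them keep the DirectML weights in a subfolder rather than at the repo root. As a minimal sketch, one way to pull a model down locally is `huggingface_hub` (the repo id and folder filter below are illustrative, taken from the Phi-3-mini row above; they are not part of this commit):

```python
# Sketch: fetch one of the ONNX DirectML models listed in the table.
# Assumes `pip install huggingface_hub`; check the model card for the
# actual folder layout before relying on the filter below.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx",
    allow_patterns=["onnx/directml/Phi-3-mini-4k-instruct-062024-int4/*"],
)
# The returned path (plus the subfolder) is the kind of local path that
# ellm_server accepts as --model_path.
print(local_dir)
```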

@@ -46,7 +60,7 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E

 1. Custom Setup:

-   - **IPEX(XPU)**: Requires anaconda environment. `conda create -n ellm python=3.11 libuv; conda activate ellm`.
+   - **IPEX(XPU)**: Requires anaconda environment. `conda create -n ellm python=3.10 libuv; conda activate ellm`.
    - **DirectML**: If you are using Conda Environment. Install additional dependencies: `conda install conda-forge::vs2015_runtime`.

 2. Install embeddedllm package. `$env:ELLM_TARGET_DEVICE='directml'; pip install -e .`. Note: currently support `cpu`, `directml` and `cuda`.
@@ -67,7 +81,7 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E

 1. Custom Setup:

-   - **IPEX(XPU)**: Requires anaconda environment. `conda create -n ellm python=3.11 libuv; conda activate ellm`.
+   - **IPEX(XPU)**: Requires anaconda environment. `conda create -n ellm python=3.10 libuv; conda activate ellm`.
    - **DirectML**: If you are using Conda Environment. Install additional dependencies: `conda install conda-forge::vs2015_runtime`.

 2. Install embeddedllm package. `ELLM_TARGET_DEVICE='directml' pip install -e .`. Note: currently support `cpu`, `directml` and `cuda`.
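Both the Windows and Linux install paths end with the DirectML wheel of onnxruntime. A quick sanity check after `pip install -e .` is to confirm that the DirectML execution provider is actually visible; a minimal sketch, assuming the standard provider name exposed by the onnxruntime-directml wheel:

```python
# Sketch: verify that the DirectML build of onnxruntime is installed.
import onnxruntime as ort

providers = ort.get_available_providers()
print(providers)
# The DirectML wheel registers "DmlExecutionProvider"; if only
# "CPUExecutionProvider" appears, the plain CPU wheel is installed instead.
assert "DmlExecutionProvider" in providers, "DirectML provider not available"
```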
@@ -107,7 +121,7 @@ Run local LLMs on iGPU, APU and CPU (AMD , Intel, and Qualcomm (Coming Soon)). E

 ### Launch Chatbot Web UI

-1. `ellm_chatbot --port 7788 --host localhost --server_port <ellm_server_port> --server_host localhost --model_name <served_model_name>`. **Note:** To find out more of the supported arguments. `ellm_chatbot --help`.
+1. `ellm_chatbot --port 7788 --host localhost --server_port <ellm_server_port> --server_host localhost`. **Note:** To find out more of the supported arguments. `ellm_chatbot --help`.

 ![asset/ellm_chatbot_vid.webp](asset/ellm_chatbot_vid.webp)
@@ -135,7 +149,7 @@ It is an interface that allows you to download and deploy OpenAI API compatible
 ellm_server --model_path <path/to/model/weight>

 # DirectML
-ellm_server --model_path 'EmbeddedLLM_Phi-3-mini-4k-instruct-062024-onnx\onnx\directml\Phi-3-mini-4k-instruct-062024-int4' --port 5555
+ellm_server --model_path 'EmbeddedLLM/Phi-3-mini-4k-instruct-onnx-directml' --port 5555

 # IPEX-LLM
 ellm_server --model_path '.\meta-llama_Meta-Llama-3.1-8B-Instruct\' --backend 'ipex' --device 'xpu' --port 5555 --served_model_name 'meta-llama_Meta/Llama-3.1-8B-Instruct'
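Because `ellm_server` serves an OpenAI API compatible endpoint, the stock clients work against it. A minimal sketch with the official `openai` package, assuming the DirectML server above is listening on port 5555 and follows the usual `/v1` route conventions (the model name here is hypothetical; use whatever name the server actually reports):

```python
# Sketch: query the OpenAI-compatible ellm_server endpoint.
# Assumes `pip install openai` and a server on localhost:5555; the
# route and model name follow OpenAI conventions, not this commit.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5555/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Phi-3-mini-4k-instruct",  # hypothetical served model name
    messages=[{"role": "user", "content": "Hello from an iGPU!"}],
)
print(resp.choices[0].message.content)
```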

requirements-directml.txt

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 torch==2.3.1
 torchvision~=0.18.1
-onnxruntime-directml~=1.18.0
-onnxruntime-genai-directml~=0.3.0
+onnxruntime-directml~=1.19.2
+onnxruntime-genai-directml~=0.4.0
 transformers>=4.43.3
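After upgrading, the resolved versions can be checked with the standard library alone; note that the `~=` specifiers accept compatible patch releases, so the exact installed versions may differ slightly from the pins:

```python
# Sketch: confirm the DirectML packages resolved within the pinned ranges.
from importlib.metadata import version

for pkg in ("onnxruntime-directml", "onnxruntime-genai-directml"):
    print(pkg, version(pkg))
# Expected with this commit: onnxruntime-directml 1.19.x and
# onnxruntime-genai-directml 0.4.x.
```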
