Commit 8b69018

[Feature] Add New Phi-3 Weight; Add Windows Compilation Steps (#4)
* add pyinstaller spec, update top_k default value; update readme
* add modelui; update documentation
* Update README.md
* add windows executable compilation steps; update onnxruntime-genai-directml version

Co-authored-by: tjtanaa <[email protected]>
1 parent: 11b609f

File tree

4 files changed: +22 −6 lines changed


.gitignore

Lines changed: 3 additions & 1 deletion

```diff
@@ -8,4 +8,6 @@ test_phi3*
 **.egg-info
 
 scripts/*.ps1
-scripts/*.sh
+scripts/*.sh
+**/dist
+**/build
```

README.md

Lines changed: 9 additions & 0 deletions

````diff
@@ -28,6 +28,7 @@ Easiest way to launch OpenAI API Compatible Server on Windows, Linux and MacOS
 | Llama-2-13b-chat | 13B | 4096 | [EmbeddedLLM/llama-2-13b-chat-int4-onnx-directml](https://huggingface.co/EmbeddedLLM/llama-2-13b-chat-int4-onnx-directml) |
 | Llama-3-8b-chat | 8B | 8192 | [EmbeddedLLM/mistral-7b-instruct-v0.3-onnx](https://huggingface.co/EmbeddedLLM/mistral-7b-instruct-v0.3-onnx) |
 | Mistral-7b-v0.3-instruct | 7B | 32768 | [EmbeddedLLM/mistral-7b-instruct-v0.3-onnx](https://huggingface.co/EmbeddedLLM/mistral-7b-instruct-v0.3-onnx) |
+| Phi-3-mini-4k-instruct-062024 | 3.8B | 4096 | [EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx](https://huggingface.co/EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx/tree/main/onnx/directml/Phi-3-mini-4k-instruct-062024-int4) |
 | Phi3-mini-4k-instruct | 3.8B | 4096 | [microsoft/Phi-3-mini-4k-instruct-onnx](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) |
 | Phi3-mini-128k-instruct | 3.8B | 128k | [microsoft/Phi-3-mini-128k-instruct-onnx](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx) |
 | Phi3-medium-4k-instruct | 17B | 4096 | [microsoft/Phi-3-medium-4k-instruct-onnx-directml](https://huggingface.co/microsoft/Phi-3-medium-4k-instruct-onnx-directml) |
@@ -65,6 +66,9 @@ Easiest way to launch OpenAI API Compatible Server on Windows, Linux and MacOS
 - **CPU:** `ELLM_TARGET_DEVICE='cpu' pip install -e .[cpu, webui]`
 - **CUDA:** `ELLM_TARGET_DEVICE='cuda' pip install -e .[cuda, webui]`
 
+**Note**
+1. If you are using a Conda environment, install additional dependencies: `conda install conda-forge::vs2015_runtime`.
+
 ### Launch OpenAI API Compatible Server
 
 ```
@@ -102,6 +106,11 @@ You can find out the disk space required to download the model in the UI.
 ![Model Management UI](asset/ellm_modelui.png)
 
 
+## Compile OpenAI-API Compatible Server into Windows Executable
+1. Install `embeddedllm`.
+2. Install PyInstaller: `pip install pyinstaller`.
+3. Compile the Windows executable: `pyinstaller .\ellm_api_server.spec`.
+4. You can find the executable in `dist\ellm_api_server`.
 
 ## Acknowledgements
 
````
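The compilation steps added to the README rely on a PyInstaller spec file that this commit also adds. For orientation, here is a minimal sketch of what a one-folder spec like `ellm_api_server.spec` typically contains; the entry-point path and the hidden import are assumptions for illustration, and the spec actually shipped in the repository may differ:

```
# -*- mode: python ; coding: utf-8 -*-
# Hypothetical minimal one-folder spec (illustrative; not the repo's real spec).

a = Analysis(
    ["src/embeddedllm/entrypoints/api_server.py"],  # assumed entry point
    pathex=[],
    binaries=[],
    datas=[],
    hiddenimports=["onnxruntime_genai"],  # packages with native DLLs often need explicit listing
)
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    exclude_binaries=True,
    name="ellm_api_server",
    console=True,
)
coll = COLLECT(
    exe,
    a.binaries,
    a.datas,
    name="ellm_api_server",  # output lands in dist\ellm_api_server
)
```

`Analysis`, `PYZ`, `EXE`, and `COLLECT` are injected into the spec's namespace by PyInstaller itself, which is why running `pyinstaller .\ellm_api_server.spec` works but executing the file directly does not.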

requirements-directml.txt

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,2 +1,2 @@
 onnxruntime-directml~=1.18.0
-onnxruntime-genai-directml~=0.2.0
+onnxruntime-genai-directml~=0.3.0
```
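The version bump above matters because `~=` is PEP 440's compatible-release operator: `~=0.3.0` accepts any `0.3.x` patch release but excludes `0.4.0`, so a `~=0.2.0` pin could never resolve to the 0.3 series. A pure-Python sketch of the rule (illustrative only; real resolvers use `packaging.specifiers.SpecifierSet`, and this simplified version assumes plain numeric dotted versions):

```python
def satisfies_compatible_release(version: str, spec: str) -> bool:
    """Return True if `version` satisfies `~=spec` (PEP 440 compatible release).

    `~=0.3.0` is shorthand for `>= 0.3.0, == 0.3.*`: only the final release
    segment may vary upward; everything before it must match exactly.
    """
    v = tuple(int(part) for part in version.split("."))
    s = tuple(int(part) for part in spec.split("."))
    if v < s:  # lower bound: version must be >= spec
        return False
    return v[: len(s) - 1] == s[:-1]  # upper bound: leading segments must match

print(satisfies_compatible_release("0.3.2", "0.3.0"))  # True: patch bump allowed
print(satisfies_compatible_release("0.4.0", "0.3.0"))  # False: outside the 0.3 series
```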

src/embeddedllm/entrypoints/modelui.py

Lines changed: 9 additions & 4 deletions

```diff
@@ -42,9 +42,6 @@ class Config(BaseSettings):
 
 
 config = Config()
-import subprocess
-
-from pydantic import BaseModel, Field
 
 
 class DeployedModel(BaseModel):
@@ -77,6 +74,14 @@ class ModelCard(BaseModel):
         repo_type="model",
         context_length=4096,
     ),
+    "EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx": ModelCard(
+        hf_url="https://huggingface.co/EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx/tree/main/onnx/directml/Phi-3-mini-4k-instruct-062024-int4",
+        repo_id="EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx",
+        model_name="Phi-3-mini-4k-instruct-062024-onnx",
+        subfolder="onnx/directml/Phi-3-mini-4k-instruct-062024-int4",
+        repo_type="model",
+        context_length=4096,
+    ),
     "EmbeddedLLM/mistralai_Mistral-7B-Instruct-v0.3-int4": ModelCard(
         hf_url="https://huggingface.co/EmbeddedLLM/mistral-7b-instruct-v0.3-onnx/tree/main/onnx/directml/mistralai_Mistral-7B-Instruct-v0.3-int4",
         repo_id="EmbeddedLLM/mistral-7b-instruct-v0.3-onnx",
@@ -433,7 +438,7 @@ def main():
         <p style="font-size: 24px; font-weight: bold; color: #007bff;">Backend: {backend}</p>
     </div>
     """
-    big_block = gr.HTML(html_content)
+    gr.HTML(html_content)
 
     with gr.Accordion("See More Model Details", open=False):
         model_info_pandas_frame = gr.Dataframe(value=None)
```
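The new registry entry follows the `ModelCard` schema visible in the diff. A stdlib `dataclass` stand-in sketches that shape; the real class is a pydantic `BaseModel`, the field types are inferred from the diff, and the defaults here are assumptions for illustration:

```python
from dataclasses import dataclass


@dataclass
class ModelCard:
    # Field names mirror the keyword arguments in the diff; types are inferred.
    hf_url: str
    repo_id: str
    model_name: str
    subfolder: str
    repo_type: str = "model"       # assumed default
    context_length: int = 4096     # assumed default

# Reconstruct the Phi-3-mini-4k-instruct-062024 entry added by this commit.
card = ModelCard(
    hf_url="https://huggingface.co/EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx/tree/main/onnx/directml/Phi-3-mini-4k-instruct-062024-int4",
    repo_id="EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx",
    model_name="Phi-3-mini-4k-instruct-062024-onnx",
    subfolder="onnx/directml/Phi-3-mini-4k-instruct-062024-int4",
)
print(card.model_name)  # Phi-3-mini-4k-instruct-062024-onnx
```

Note that `subfolder` points at the int4 DirectML variant inside the repo, which is why the README's table link targets that subdirectory rather than the repo root.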
