-
Notifications
You must be signed in to change notification settings - Fork 76
test(mlx): add MLX backend E2E tests + macos-latest CI workflow #1398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
f7952f6
7a7fbf4
35840de
f7dfb28
39d0a58
d84ae03
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,129 @@ | ||
| name: PR Test (MLX) | ||
|
|
||
| on: | ||
| push: | ||
| branches: [main] | ||
| paths: | ||
| - "crates/grpc_client/proto/mlx_engine.proto" | ||
| - "crates/grpc_client/src/mlx_engine.rs" | ||
| - "crates/grpc_client/python/**" | ||
| - "grpc_servicer/smg_grpc_servicer/mlx/**" | ||
| - "grpc_servicer/pyproject.toml" | ||
| - "e2e_test/mlx/test_mlx_backend.py" | ||
| - "e2e_test/infra/__init__.py" | ||
| - "e2e_test/infra/model_specs.py" | ||
| - "e2e_test/infra/worker.py" | ||
| - "e2e_test/infra/constants.py" | ||
| - ".github/workflows/pr-test-mlx.yml" | ||
|
coderabbitai[bot] marked this conversation as resolved.
|
||
| pull_request: | ||
| branches: [main] | ||
| types: [opened, synchronize, reopened] | ||
| paths: | ||
| - "crates/grpc_client/proto/mlx_engine.proto" | ||
| - "crates/grpc_client/src/mlx_engine.rs" | ||
| - "crates/grpc_client/python/**" | ||
| - "grpc_servicer/smg_grpc_servicer/mlx/**" | ||
|
Comment on lines
+22
to
+25
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
When a PR only changes the MLX branches in the router, e.g. Useful? React with 👍 / 👎. |
||
| - "grpc_servicer/pyproject.toml" | ||
| - "e2e_test/mlx/test_mlx_backend.py" | ||
| - "e2e_test/infra/__init__.py" | ||
| - "e2e_test/infra/model_specs.py" | ||
| - "e2e_test/infra/worker.py" | ||
| - "e2e_test/infra/constants.py" | ||
|
Comment on lines
+27
to
+31
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The MLX test module is run through Useful? React with 👍 / 👎. |
||
| - ".github/workflows/pr-test-mlx.yml" | ||
| workflow_dispatch: | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| concurrency: | ||
| group: mlx-tests-${{ github.ref }} | ||
| cancel-in-progress: ${{ github.event_name == 'pull_request' }} | ||
|
|
||
| jobs: | ||
| e2e-mlx: | ||
| name: E2E (MLX on Apple Silicon) | ||
| runs-on: macos-latest | ||
| timeout-minutes: 30 | ||
| permissions: | ||
| contents: read | ||
| env: | ||
| E2E_RUNTIME: mlx | ||
| E2E_ENGINE: mlx | ||
| PYTHONUNBUFFERED: "1" | ||
| steps: | ||
| - name: Checkout code | ||
| uses: actions/checkout@v6 | ||
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v6 | ||
| with: | ||
| python-version: "3.12" | ||
|
|
||
| - name: Install Rust toolchain | ||
| uses: dtolnay/rust-toolchain@stable | ||
|
|
||
| - name: Install protoc | ||
| run: brew install protobuf | ||
|
|
||
| - name: Cache Cargo registry and target | ||
| uses: Swatinem/rust-cache@v2 | ||
| with: | ||
| shared-key: mlx-pr-test | ||
| cache-on-failure: true | ||
|
|
||
| - name: Install uv (for openapi codegen) | ||
| run: pip install uv | ||
|
|
||
| - name: Build smg-grpc-proto Python package (proto codegen) | ||
| run: pip install -e ./crates/grpc_client/python | ||
|
|
||
| - name: Install grpc_servicer with MLX extra (mlx + mlx-lm) | ||
| run: pip install -e "./grpc_servicer[mlx]" | ||
|
|
||
| - name: Build and install SMG Python bindings (ci profile) | ||
| working-directory: bindings/python | ||
| run: | | ||
| pip install maturin | ||
| # `ci` profile (opt-level=2, thin LTO, 16 codegen-units) — faster | ||
| # to compile than release, runtime still plenty fast for a | ||
| # correctness E2E test. | ||
| # Use `maturin build` + `pip install` (not `maturin develop`) | ||
| # because the GitHub-hosted runner's Python is not in a virtualenv. | ||
| maturin build --profile ci --out dist | ||
| pip install dist/*.whl | ||
|
|
||
| - name: Generate Python client types (required by e2e_test/conftest.py) | ||
| run: make generate-python-types | ||
|
|
||
| - name: Install smg-client | ||
| run: pip install ./clients/python | ||
|
|
||
| - name: Install E2E test dependencies | ||
| run: pip install ./e2e_test | ||
|
|
||
| - name: Verify imports | ||
| run: | | ||
| python -c "from smg_grpc_proto import mlx_engine_pb2, mlx_engine_pb2_grpc; print('proto OK')" | ||
| python -c "from smg_grpc_servicer.mlx.servicer import MlxEngineServicer; print('servicer OK')" | ||
| python -c "import smg; print('smg OK')" | ||
| python -c "from smg_client import SmgClient; print('smg_client OK')" | ||
| python -c "import mlx_lm; print('mlx-lm OK')" | ||
|
|
||
| - name: Run MLX E2E tests | ||
| env: | ||
| SHOW_WORKER_LOGS: "1" | ||
| SHOW_ROUTER_LOGS: "1" | ||
| E2E_LOG_DIR: e2e-logs | ||
| run: | | ||
| pytest e2e_test/mlx/test_mlx_backend.py \ | ||
| -s -vv \ | ||
| --reruns 1 --reruns-delay 5 | ||
|
|
||
|
key4ng marked this conversation as resolved.
|
||
| - name: Upload logs on failure | ||
| if: failure() || cancelled() | ||
| uses: actions/upload-artifact@v7 | ||
| with: | ||
| name: e2e-mlx-logs | ||
| path: e2e-logs/ | ||
| retention-days: 7 | ||
| if-no-files-found: ignore | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -178,6 +178,8 @@ def _build_cmd(self) -> list[str]: | |
| return self._build_vllm_http_cmd(model_path, tp_size, spec) | ||
| elif self.engine == "trtllm": | ||
| return self._build_trtllm_cmd(model_path, tp_size, spec) | ||
| elif self.engine == "mlx": | ||
| return self._build_mlx_cmd(model_path, spec) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 Nit: if self.engine != "mlx":
env["CUDA_VISIBLE_DEVICES"] = ",".join(map(str, self.gpu_ids))Not blocking — just a readability thing for the next person who reads |
||
| else: | ||
|
key4ng marked this conversation as resolved.
|
||
| raise ValueError(f"Unsupported engine: {self.engine}") | ||
|
|
||
|
|
@@ -261,6 +263,30 @@ def _build_vllm_base_cmd( | |
| cmd.extend(extra) | ||
| return cmd | ||
|
|
||
| def _build_mlx_cmd(self, model_path: str, spec: dict) -> list[str]: | ||
| """Build MLX gRPC server command (Apple Silicon only). | ||
|
|
||
| MLX backend only supports gRPC mode (no HTTP variant) since the | ||
| servicer wraps mlx-lm's BatchGenerator behind the MlxEngine proto. | ||
| """ | ||
| if self.mode != ConnectionMode.GRPC: | ||
| raise ValueError("MLX backend only supports gRPC mode") | ||
| cmd = [ | ||
| "python3", | ||
| "-m", | ||
| "smg_grpc_servicer.mlx.server", | ||
| "--model", | ||
| model_path, | ||
| "--host", | ||
| DEFAULT_HOST, | ||
| "--port", | ||
| str(self.port), | ||
| ] | ||
| extra = spec.get("mlx_args", []) | ||
| if extra: | ||
| cmd.extend(extra) | ||
| return cmd | ||
|
|
||
| def _build_trtllm_cmd(self, model_path: str, tp_size: int, spec: dict) -> list[str]: | ||
| """Build TensorRT-LLM gRPC server command.""" | ||
| # Create config file to enable xgrammar guided decoding | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new workflow watches several MLX-related E2E infra files but omits
e2e_test/infra/__init__.py, even though this commit changes that module ande2e_test/conftest.pyimports frominfraat startup. A future PR that only updatesinfra/__init__.pycan break MLX test startup/import resolution without triggering this workflow, so regressions can merge untested.Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added