
bump mlx-lm to 0.31.3 and mlx-vlm to latest main#675

Open
Chedrian07 wants to merge 1 commit into jundot:main from Chedrian07:chore/bump-mlx-lm-vlm

Conversation

@Chedrian07

Summary

  • Bump mlx-lm pin from dcbf6e3 to d9c63ff (v0.31.3 patch bump, #1124)
  • Bump mlx-vlm pin from 23e1dff to 3472132 (5 upstream fixes, non-breaking)
  • Keep pyproject.toml dependencies, [tool.uv] override-dependencies, and packaging/venvstacks.toml in lockstep
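As a sketch of what "in lockstep" means here, the same two git pins would appear verbatim in each location (commit hashes are the ones from this PR; the repository URLs and exact table layout are assumptions, not taken from the diff):

```toml
# pyproject.toml — illustrative; both pin locations must carry identical commits
[project]
dependencies = [
    "mlx-lm @ git+https://github.com/ml-explore/mlx-lm@d9c63ff",
    "mlx-vlm @ git+https://github.com/Blaizzy/mlx-vlm@3472132",
]

[tool.uv]
override-dependencies = [
    "mlx-lm @ git+https://github.com/ml-explore/mlx-lm@d9c63ff",
    "mlx-vlm @ git+https://github.com/Blaizzy/mlx-vlm@3472132",
]
```

packaging/venvstacks.toml would repeat the same pins; drift between any of the three is exactly what this PR's lockstep rule prevents.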


Both compare ranges (dcbf6e3..d9c63ff and 23e1dff..3472132) contain only bug fixes and a patch version bump. No public API, signature, or required-argument changes — so existing oMLX adapters (VLMModelAdapter, BatchedEngine, scheduler integration with BatchGenerator, Gemma 4 tool-call paths) should be fully compatible.

Rationale for picking HEAD of both main branches:

  • mlx-vlm@3472132 includes the Gemma 4 hyphenated-tool-name fix, which improves OpenAI-spec compliance for oMLX's Gemma 4 tool-calling path.
  • mlx-vlm@3472132 also fixes a TurboQuant kernel race that can affect quantized VLMs under oMLX continuous batching.
  • mlx-lm@d9c63ff is just a patch-version bump; staying in lockstep keeps the override free of surprises.

Test plan

  • pip install -e . resolves against the new git pins in a clean venv
  • pytest -m "not slow" passes
  • Smoke test: load a Gemma 4 VLM + hyphenated tool name → parsed correctly
  • Smoke test: load a quantized VLM under --max-concurrent-requests 8 → no TurboQuant race
  • packaging/build.py --skip-venv still builds the app bundle without resolver errors
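The hyphenated-tool-name smoke test targets a common parser bug class: a tool-name pattern like `\w+` silently truncates names such as `get-weather`, while the OpenAI spec allows `-` in function names. A minimal standalone illustration of that failure mode (not the actual mlx-vlm parser code):

```python
import re

# Naive pattern: \w matches [A-Za-z0-9_] only, so a hyphen ends the match.
NAIVE = re.compile(r"^\w+")
# Hyphen-aware pattern, in line with OpenAI's allowed name charset [a-zA-Z0-9_-].
FIXED = re.compile(r"^[A-Za-z0-9_-]+")

def parse_tool_name(raw: str, pattern: re.Pattern) -> str:
    """Extract the leading tool name from a raw tool-call string."""
    m = pattern.match(raw)
    return m.group(0) if m else ""

print(parse_tool_name("get-weather(city='Seoul')", NAIVE))  # -> get
print(parse_tool_name("get-weather(city='Seoul')", FIXED))  # -> get-weather
```

The smoke test in the plan above checks the end-to-end version of this: a Gemma 4 tool call with a hyphenated name must round-trip through the parser without truncation.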

mlx-lm: dcbf6e3 -> d9c63ff (v0.31.3 patch bump, #1124)

mlx-vlm: 23e1dff -> 3472132
  - Fix Gemma 4 tool parser to accept hyphenated function names (#963)
  - Fix Gemma 4 audio: mel preprocessing, weight loading, feature extractor (#931)
  - Fix race condition in TurboQuant fused fast-quantize kernels (#967)
  - Fix Gemma 4 quantized per-layer projection loading (#935)
  - Snapshot cache.offset to prevent alias mutation under batched caches (#966)

Both ranges contain only bug fixes and the patch version bump; no API
or signature changes. Updates pyproject.toml dependencies, the uv
override-dependencies, and packaging/venvstacks.toml in lockstep.
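The cache.offset snapshot fix (#966) guards against a classic aliasing pitfall: reading a mutable cache's field later, instead of copying its value at capture time. A simplified standalone illustration (the real mlx-vlm cache objects are more involved; this class is hypothetical):

```python
class KVCache:
    """Toy stand-in for a per-request KV cache (illustration only)."""
    def __init__(self) -> None:
        self.offset = 0

cache = KVCache()

# Buggy pattern: keep a reference to the cache and read .offset later —
# any batch step that advances the shared cache mutates what we observe.
aliased = cache

# Fixed pattern: snapshot the integer value at capture time.
snapshot = cache.offset

cache.offset += 8  # another request in the batch advances the shared cache

assert snapshot == 0        # the snapshot is unaffected by the mutation
assert aliased.offset == 8  # the alias observes the mutation
```

Snapshotting turns a read that races with batched updates into a stable value, which is the behaviour the fix restores under batched caches.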
@kyr0

kyr0 commented Apr 10, 2026

I'm encountering issues with thread-local vs. thread-global behaviour from [email protected] onward with Gemma 4. Is this PR well-tested? Do we have e2e integration tests?

I'm not going into details here because I'm working on custom mlx-lm and mlx forks. I'm having a hard time pinpointing the exact root cause, but mlx 0.31.2 seems to introduce changes to thread locality that oMLX isn't prepared for.

Ref tracking: ml-explore/mlx#3078

