bug: VLM structured output quality degrades under concurrent requests (conc >= 2) #691

@mdevk

Summary

JSON validity of VLM structured extraction drops under concurrent requests on oMLX v0.3.4: from 100% at concurrency=1 to 45-65% at concurrency=2 and 70% at concurrency=16. The degradation at concurrency=2 is consistent across three runs.

Data

Three separate runs at each concurrency level, 20 photos per run, Qwen3-VL-30B-A3B-Instruct-4bit, structured JSON extraction prompt, temperature: 0.1. Only concurrency=2 has all three runs listed in the table; the other levels show a single run.

Concurrency Sweep (unpatched v0.3.4)

Concurrency   Run 1          Run 2          Run 3         Avg
1             100% (20/20)   -              -             100%
2             65% (13/20)    60% (12/20)    45% (9/20)    57%
4             100% (20/20)   -              -             100%
8             100% (20/20)   -              -             100%
16            70% (14/20)    -              -             70%
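
For anyone trying to reproduce the sweep, the measurement can be sketched roughly as below. This is not the exact harness used for the numbers above. `send` is a hypothetical async function that wraps whatever client talks to the server and returns the raw content string (or None), and "valid" is assumed here to mean "parses as a JSON object".

```python
import asyncio
import json
from typing import Awaitable, Callable, Optional

# Hypothetical type: an async function that takes a photo path and returns the
# model's raw text content, or None if the server returned no content.
SendFn = Callable[[str], Awaitable[Optional[str]]]


def is_valid_json_object(content: Optional[str]) -> bool:
    """Assumed validity criterion: content parses as a JSON object."""
    if content is None:
        return False
    try:
        return isinstance(json.loads(content), dict)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


async def run_sweep(send: SendFn, photos: list[str], concurrency: int) -> tuple[int, int]:
    """Send every photo with at most `concurrency` requests in flight.

    Returns (valid_count, total_count).
    """
    semaphore = asyncio.Semaphore(concurrency)

    async def one(photo: str) -> bool:
        async with semaphore:
            try:
                content = await send(photo)
            except Exception:
                # Transport errors are counted as failures in this sketch.
                return False
            return is_valid_json_object(content)

    results = await asyncio.gather(*(one(p) for p in photos))
    return sum(results), len(results)


def validity_percent(valid: int, total: int) -> int:
    """Validity as a whole-number percentage."""
    return round(100 * valid / total)
```

Usage would look like `valid, total = asyncio.run(run_sweep(send, photos, concurrency=2))`. Pooling the three conc=2 runs this way (13 + 12 + 9 = 34 valid out of 60) gives `validity_percent(34, 60)`, which matches the 57% average in the table.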

Failure Types

conc=2: Output contains corrupted or merged JSON. Example (truncated): {"description": "A young girl in a pink, "description": "A y. Two separate responses appear to be concatenated.

conc=16: The response has content: null with completion_tokens: 0, meaning the model generated no content. 5 of the 6 failures were of this type.
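
The two failure types can be separated automatically with something like the sketch below. The label names and the repeated-key heuristic are our own assumptions, not part of oMLX. The heuristic only runs on output that already failed to parse, and it can misfire on legitimately repeated keys (for example, arrays of objects), so flagged responses still need a manual look.

```python
import json
import re
from collections import Counter
from typing import Optional

# Matches simple JSON keys such as "description": (identifier-like names only).
KEY_PATTERN = re.compile(r'"([A-Za-z_]\w*)"\s*:')


def classify_response(content: Optional[str], completion_tokens: int) -> str:
    """Return one of: "valid", "empty", "merged_suspected", "invalid"."""
    if content is None:
        # The conc=16 pattern: no content and zero completion tokens.
        return "empty" if completion_tokens == 0 else "invalid"
    try:
        parsed = json.loads(content)
    except ValueError:
        # Heuristic for the conc=2 pattern: the same key appearing more than
        # once in unparseable output hints at two responses glued together.
        key_counts = Counter(KEY_PATTERN.findall(content))
        if any(count > 1 for count in key_counts.values()):
            return "merged_suspected"
        return "invalid"
    return "valid" if isinstance(parsed, dict) else "invalid"
```

For the conc=2 example above, `classify_response('{"description": "A young girl in a pink, "description": "A y', 12)` returns `"merged_suspected"` (the token count of 12 is arbitrary), and `classify_response(None, 0)` returns `"empty"`.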

Key Observations

  1. conc=2 is consistently degraded (45-65% across 3 runs). This is NOT noise.
  2. conc=4 and conc=8 show 100%, so the degradation is NOT monotonic with concurrency. This suggests a specific scheduling/batching edge case at low concurrency.
  3. conc=16 failures are mostly content: null (5 of 6): the model produces no output at all.
  4. The corruption at conc=2 looks like cross-request contamination (merged JSON from different requests).
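
One possible way to test the contamination hypothesis in observation 4 is to tag each request with a unique ID and look for foreign IDs in the responses. This is only a sketch and was not part of the runs above: it changes the prompt, it relies on the model echoing the ID, and a missing ID proves nothing either way. All function names here are hypothetical.

```python
import uuid
from typing import Optional


def new_request_id() -> str:
    """Short random ID, unlikely to occur in model output by chance."""
    return uuid.uuid4().hex[:12]


def tag_prompt(prompt: str, request_id: str) -> str:
    """Ask the model to echo the ID so each response is attributable."""
    return f'{prompt}\nInclude the field "request_id": "{request_id}" in the JSON.'


def find_contaminated(responses: dict[str, Optional[str]]) -> dict[str, list[str]]:
    """Map each request ID to the foreign request IDs found in its response.

    `responses` maps the ID sent with a request to the raw content returned.
    Requests with empty or missing content are skipped.
    """
    contaminated: dict[str, list[str]] = {}
    for request_id, content in responses.items():
        if not content:
            continue
        foreign = [other for other in responses if other != request_id and other in content]
        if foreign:
            contaminated[request_id] = foreign
    return contaminated
```

A non-empty result from `find_contaminated` would be direct evidence that text from one request ended up in another request's output, rather than the model simply producing malformed JSON on its own.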

Correction

We previously reported 100% validity at conc=2 (in our review of PR #648). That measurement may have been taken under different conditions or with a different methodology. The data here is from a clean, unpatched v0.3.4 install with 3 repeated runs.

Environment

  • oMLX v0.3.4 (Homebrew, unpatched)
  • Mac Studio M3 Ultra 96GB
  • Qwen3-VL-30B-A3B-Instruct-4bit
  • Structured JSON extraction prompt
  • 20 diverse photos per run

Not related to #648

We tested the IOKit underflow fix from #648 and saw no change in this behavior. The deferred-clear race is a different issue.
