feat(compose): optional gpu-coordinator-proxy + ollama services for VRAM contention by nnnet · Pull Request #650 · builderz-labs/mission-control

nnnet · 2026-05-05T21:50:36Z

Summary

Adds two optional docker compose services for operators running both LMStudio and Ollama on a single GPU:

- gpu-coordinator-proxy: fronts LMStudio (:1234) and Ollama (:11434) on alternative ports (:1235, :11435) with a shared VRAM lock. Before forwarding to either backend it unloads everything from the other runtime; with GPU_FREE_STRATEGY=wipe-all it can also evict everything from the requested runtime itself for cold-start determinism.
- ollama: official ollama/ollama:latest image with a named volume for model storage, exposing :11434/v1.

The proxy lives in its own repo at https://github.com/nnnet/gpu-coordinator-proxy and is cloned next to MC as ./gpu-coordinator-proxy-src/ (gitignored — the sibling clone is not part of MC's tree). The build.context in the compose file points at that sibling clone.
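For orientation, the two services described above might look roughly like the fragment below. This is a sketch, not the PR's actual diff: the build context, image, host ports, and the mc-ollama container name come from this description, while the container-side proxy ports and the volume name are assumptions. The /root/.ollama mount point is where the official Ollama image stores models.

```yaml
# Sketch only: keys not quoted in the PR text are assumptions.
services:
  gpu-coordinator-proxy:
    build:
      context: ./gpu-coordinator-proxy-src   # sibling clone, gitignored
    ports:
      - "1235:1235"     # LMStudio front; container-side port is assumed
      - "11435:11435"   # Ollama front; container-side port is assumed

  ollama:
    image: ollama/ollama:latest
    container_name: mc-ollama            # name used in the test plan
    volumes:
      - ollama-models:/root/.ollama      # volume name is hypothetical
    ports:
      - "11434:11434"

volumes:
  ollama-models:
```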

Why this is useful upstream

Operators running both runtimes on a single dev machine would otherwise have to manually unload models between provider switches. With the proxy, MC and the gateway only need to set lmstudio.baseUrl=http://host.docker.internal:1235/v1 and ollama.baseUrl=http://host.docker.internal:11435/v1; the contention is then handled transparently. The 5/6 swap matrix tested cleanly on a 24 GB RTX 5090.

The change is purely additive — omit the two service blocks in docker-compose-openclaw.yml to keep the existing direct-LMStudio / direct-Ollama paths unchanged.

Test plan

1. Clone the proxy: git clone https://github.com/nnnet/gpu-coordinator-proxy gpu-coordinator-proxy-src (next to mission-control/).
2. Run make up gpu-coordinator-proxy ollama and confirm both services report healthy.
3. Pull a model: docker exec mc-ollama ollama pull gpt-oss:20b.
4. Point an MC agent at http://127.0.0.1:1235/v1 (LMStudio) and another at http://127.0.0.1:11435/v1 (Ollama).
5. Dispatch tasks alternating between the two; observe in the proxy logs that the other runtime gets evicted on each switch.

Dependencies

Touches docker-compose-openclaw.yml, the same file as #649, feat(openclaw): additive Docker integration with env-driven hardening + doctor cleanup (OpenClaw integration). The two PRs can land in either order; whichever lands second rebases.

Provenance

Squashes our fork's commit:

3249ee7 feat(compose): add gpu-coordinator-proxy service from sibling repo clone


0xbrainkid

Summary

Blocking: the optional GPU proxy is included as an always-on service with a required local build context, and the compose file also carries a machine-specific OpenClaw state path.

Strengths

The proxy/ports/eviction-strategy comments are clear and explain the intended LMStudio/Ollama topology well.
Keeping Ollama model data in a named volume is the right operational default.

Issues

[BLOCKING] gpu-coordinator-proxy is part of the default compose graph but builds from ./gpu-coordinator-proxy-src, which is intentionally gitignored and absent on a normal clone. docker compose up / make up will fail unless the operator manually clones the sibling repo or edits the compose file, even though the comments call the proxy optional. Please put this service behind a profile (for example profiles: ["gpu-proxy"]) or split it into an optional override compose file so the default OpenClaw stack starts without the external repo.
[BLOCKING] Same portability issue as the OpenClaw integration compose: OPENCLAW_STATE_DIR, OPENCLAW_PLUGIN_STAGE_DIR, OPENCLAW_CONFIG_PATH, and the state bind mount hard-code /mnt/9/gt/rig_PlatformsAI/mayor/rig/beads/discovered/mission-control/.openclaw-data. That path will be wrong for other clones/hosts. Please make it configurable/generated locally rather than committed.
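One possible shape for both fixes, as a sketch rather than a required form: gate the proxy behind a Compose profile and take the host state path from the environment instead of committing it. The profile name follows the example above; the openclaw service name, the OPENCLAW_HOST_STATE_DIR variable, and the /openclaw-data container path are hypothetical placeholders, not names taken from the PR.

```yaml
# Sketch only: service name, host variable, and container path are hypothetical.
services:
  gpu-coordinator-proxy:
    profiles: ["gpu-proxy"]     # not started unless the profile is enabled
    build:
      context: ./gpu-coordinator-proxy-src

  openclaw:
    environment:
      OPENCLAW_STATE_DIR: /openclaw-data
    volumes:
      # Host side comes from the operator's shell or .env file and falls
      # back to a directory next to the compose file.
      - ${OPENCLAW_HOST_STATE_DIR:-./.openclaw-data}:/openclaw-data
```

With this layout, a plain docker compose -f docker-compose-openclaw.yml up -d would skip the proxy, and operators who have cloned the sibling repo would opt in with docker compose -f docker-compose-openclaw.yml --profile gpu-proxy up -d.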

Questions

Should Ollama also be profile-gated, or is it intended to be part of the default OpenClaw stack even for cloud-only users?

Verification

gh pr checks 650: no checks reported.
docker compose -f docker-compose-openclaw.yml config --quiet: passed syntax/config rendering, but does not validate that the optional build context exists or that the committed absolute path is portable.

0xNyk · 2026-05-07T04:21:13Z

Thanks — holding this one until #649 lands. They both modify docker-compose-openclaw.yml (this PR adds the gpu-coordinator-proxy and ollama services to the file that #649 introduces), so the merge order matters and the diff is currently hard to evaluate in isolation.

Once #649 lands, please rebase this onto the new main so the diff shows just the GPU-coordinator delta (the actual ~600 lines you're adding), not the full file. That'll make this trivially reviewable.

One question while you're rebasing:

The proxy service description mentions GPU_FREE_STRATEGY=wipe-all for cold-start determinism — that strategy evicts everything from the requested runtime as well, which on shared LMStudio/Ollama setups would kill any other concurrent workload. Could you add a note in the compose comment that operators on multi-user GPU hosts should leave it on the default wipe-other strategy? Just so the trade-off is visible at the spot where someone enables the flag.
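Something along these lines would do, placed next to the variable in the compose file. The wording is only a suggestion, and the wipe-other value name is the one I used above; I have not checked it against the proxy repo.

```yaml
services:
  gpu-coordinator-proxy:
    environment:
      # GPU_FREE_STRATEGY controls what is unloaded before a request is forwarded.
      #   wipe-other  evict models from the other runtime only.
      #   wipe-all    also evict everything from the requested runtime,
      #               giving cold-start determinism.
      # On multi-user GPU hosts, leave this on wipe-other: wipe-all unloads
      # models that other concurrent workloads may still be using.
      GPU_FREE_STRATEGY: wipe-other
```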

Tagging as blocked on #649 for now. The technical content here looks sensible — VRAM contention is a real problem and a coordinator proxy is the right shape for it.

nnnet requested a review from 0xNyk as a code owner May 5, 2026 21:50

0xbrainkid suggested changes May 6, 2026


0xNyk mentioned this pull request May 7, 2026

feat(openclaw): additive Docker integration with env-driven hardening + doctor cleanup #649

Open

nnnet wants to merge 1 commit into builderz-labs:main from nnnet:pr/gpu-coordinator-proxy


3 participants
