feat(compose): optional gpu-coordinator-proxy + ollama services for VRAM contention #650
nnnet wants to merge 1 commit
## Summary

Adds two **optional** docker compose services for operators running both LMStudio and Ollama on a single GPU:

- **`gpu-coordinator-proxy`**: fronts LMStudio (`:1234`) and Ollama (`:11434`) on alternative ports (`:1235`, `:11435`) with a shared VRAM lock. Before forwarding to either backend it unloads everything from the *other* runtime; with `GPU_FREE_STRATEGY=wipe-all` it can also evict everything from the requested runtime itself for cold-start determinism.
- **`ollama`**: official `ollama/ollama:latest` image with a named volume for model storage, exposing `:11434/v1`.

The proxy lives in its own repo at https://github.com/nnnet/gpu-coordinator-proxy and is cloned next to MC as `./gpu-coordinator-proxy-src/` (gitignored; the sibling clone is not part of MC's tree). The `build.context` in the compose file points at that sibling clone.

## Why this is useful upstream

Operators running both runtimes on a single dev machine would otherwise have to manually unload models between provider switches. With the proxy, MC and the gateway just point `lmstudio.baseUrl=http://host.docker.internal:1235/v1` and `ollama.baseUrl=http://host.docker.internal:11435/v1` and the contention is invisible. The 5/6 swap matrix tested cleanly on a 24 GB RTX 5090.

The change is purely additive: omit the two service blocks in `docker-compose-openclaw.yml` to keep the existing direct-LMStudio / direct-Ollama paths unchanged.

## Test plan

1. Clone the proxy: `git clone https://github.com/nnnet/gpu-coordinator-proxy gpu-coordinator-proxy-src` (next to `mission-control/`).
2. `make up gpu-coordinator-proxy ollama`: both services healthy.
3. Pull a model: `docker exec mc-ollama ollama pull gpt-oss:20b`.
4. Point an MC agent at `http://127.0.0.1:1235/v1` (LMStudio) and another at `http://127.0.0.1:11435/v1` (Ollama).
5. Dispatch tasks alternating between the two; observe in proxy logs that the other runtime gets evicted on each switch.

## Dependencies

- Touches `docker-compose-openclaw.yml`, the same file as builderz-labs#649 (OpenClaw integration). Reviewers can land in either order; later one rebases.

## Provenance

Squashes our fork's commit:

- `3249ee7 feat(compose): add gpu-coordinator-proxy service from sibling repo clone`
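For orientation, the two service blocks described in the summary might look roughly like the fragment below. This is a sketch under assumptions, not the PR's actual diff: the `LMSTUDIO_UPSTREAM` and `OLLAMA_UPSTREAM` variable names, the `ollama-models` volume name, the assumption that LMStudio runs on the host, and the GPU reservation block are illustrative. Only the ports, the image, the build context, the `mc-ollama` container name, and `GPU_FREE_STRATEGY` come from the description.

```yaml
# Sketch only; names marked "hypothetical" do not come from the PR.
services:
  gpu-coordinator-proxy:
    build:
      context: ./gpu-coordinator-proxy-src   # sibling clone, gitignored
    environment:
      GPU_FREE_STRATEGY: wipe-all            # optional strategy named in the summary
      # Hypothetical variable names for the two upstream backends:
      LMSTUDIO_UPSTREAM: http://host.docker.internal:1234   # assumes LMStudio runs on the host
      OLLAMA_UPSTREAM: http://ollama:11434
    ports:
      - "1235:1235"     # proxied LMStudio
      - "11435:11435"   # proxied Ollama
    extra_hosts:
      - "host.docker.internal:host-gateway"  # lets the container reach the host on Linux

  ollama:
    image: ollama/ollama:latest
    container_name: mc-ollama
    volumes:
      - ollama-models:/root/.ollama          # named volume for model storage
    ports:
      - "11434:11434"
    deploy:                                  # assumed: NVIDIA GPU passthrough
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama-models:                             # hypothetical volume name
```

With a layout like this, clients that should go through the VRAM lock use `:1235` and `:11435`, while `:1234` and `:11434` remain the direct, uncoordinated paths.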
0xbrainkid
left a comment
Summary
- Blocking: the optional GPU proxy is included as an always-on service with a required local build context, and the compose file also carries a machine-specific OpenClaw state path.
Strengths
- The proxy/ports/eviction-strategy comments are clear and explain the intended LMStudio/Ollama topology well.
- Keeping Ollama model data in a named volume is the right operational default.
Issues
- [BLOCKING] `gpu-coordinator-proxy` is part of the default compose graph but builds from `./gpu-coordinator-proxy-src`, which is intentionally gitignored and absent on a normal clone. `docker compose up` / `make up` will fail unless the operator manually clones the sibling repo or edits the compose file, even though the comments call the proxy optional. Please put this service behind a profile (for example `profiles: ["gpu-proxy"]`) or split it into an optional override compose file so the default OpenClaw stack starts without the external repo.
- [BLOCKING] Same portability issue as the OpenClaw integration compose: `OPENCLAW_STATE_DIR`, `OPENCLAW_PLUGIN_STAGE_DIR`, `OPENCLAW_CONFIG_PATH`, and the state bind mount hard-code `/mnt/9/gt/rig_PlatformsAI/mayor/rig/beads/discovered/mission-control/.openclaw-data`. That path will be wrong for other clones/hosts. Please make it configurable/generated locally rather than committed.
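One possible shape for both fixes, as a sketch rather than a required layout. The `openclaw` service name, the `/state` container path, and the `OPENCLAW_STATE_HOST_DIR` variable are hypothetical placeholders; only the `gpu-proxy` profile name, the build context, and the `OPENCLAW_STATE_DIR` variable come from this thread.

```yaml
# Sketch only; adjust names to match the real compose file.
services:
  gpu-coordinator-proxy:
    profiles: ["gpu-proxy"]        # not started by a plain `docker compose up`
    build:
      context: ./gpu-coordinator-proxy-src

  ollama:
    profiles: ["gpu-proxy"]        # only if Ollama should be opt-in as well
    image: ollama/ollama:latest

  openclaw:                        # hypothetical service name
    environment:
      OPENCLAW_STATE_DIR: /state   # hypothetical in-container path
    volumes:
      # Host path comes from the shell environment or a local .env file,
      # falling back to a repo-relative directory instead of an absolute path.
      - ${OPENCLAW_STATE_HOST_DIR:-./.openclaw-data}:/state
```

Operators who want the proxy would then opt in with `docker compose -f docker-compose-openclaw.yml --profile gpu-proxy up -d`, while the default stack starts without the sibling clone.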
Questions
- Should Ollama also be profile-gated, or is it intended to be part of the default OpenClaw stack even for cloud-only users?
Verification
- `gh pr checks 650`: no checks reported.
- `docker compose -f docker-compose-openclaw.yml config --quiet`: passed syntax/config rendering, but does not validate that the optional build context exists or that the committed absolute path is portable.
Thanks, holding this one until #649 lands. They both modify […]. Once #649 lands, please rebase this onto the new […]. One question while you're rebasing: the proxy service description mentions […]. Tagging as […].