Fix unforked llamacpp #546

doringeman · 2026-01-08T12:53:56Z

Some fixes for #541.

fix(llamacpp): add missing cmake flags

make -C llamacpp build

MODEL_RUNNER_PORT=8080 make run LOCAL_LLAMA=1

style(llamacpp): use spaces for consistent help output alignment

$ make -C llamacpp help
Available targets:
  build        - Build llama.cpp (macOS only for now)
  install-deps - Install build dependencies
  build-dir    - Print build directory path
  install-dir  - Print install directory path
  clean        - Clean build artifacts
  help         - Show this help

fix(llamacpp): install renamed binary correctly

WARN[0066] Initialization for llama.cpp backend runner with model sha256:436bb282b41968a83638482999980267ca8d7e8b5574604460efa9efff11cf59 in completion mode failed: llama.cpp terminated unexpectedly: llama.cpp exit status: exit status 71
with output: sandbox-exec: execvp() of '/Users/dorin/workspace/model-runner/llamacpp/install/bin/com.docker.llama-server' failed: No such file or directory

fix(llamacpp): fix rpath for macOS binary

Failed to generate a response: error response: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp exit status: signal: abort trap
with output: dyld[2388]: Library not loaded: @rpath/libmtmd.0.dylib
  Referenced from: <CDC77272-5667-3542-BD3F-DE2849AA607F> /Users/dorin/workspace/model-runner/llamacpp/install/bin/com.docker.llama-server
  Reason: tried: '/Users/dorin/workspace/model-runner/llamacpp/build/bin/libmtmd.0.dylib' (file system sandbox blocked mmap() of '/Users/dorin/workspace/model-runner/llamacpp/build/bin/libmtmd.0.0.7356.dylib'), '/System/Volumes/Preboot/Cryptexes/OS/Users/dorin/workspace/model-runner/llamacpp/build/bin/libmtmd.0.dylib' (no such file), '/Users/dorin/workspace/model-runner/llamacpp/build/bin/libmtmd.0.dylib' (file system sandbox blocked mmap() of '/Users/dorin/workspace/model-runner/llamacpp/build/bin/libmtmd.0.0.7356.dylib'), '/System/Volumes/Preboot/Cryptexes/OS/Users/dorin/workspace/model-runner/llamacpp/build/bin/libmtmd.0.dylib' (no such file)

fix(scheduling): use /health endpoint for backend readiness checks
- addresses a review comment on Unfork llama.cpp #472

chore(llamacpp): use $(CURDIR) instead of $(shell pwd)
- thanks Gemini! See the review comment on this PR (#546)

Signed-off-by: Dorin Geman <[email protected]>

gemini-code-assist · 2026-01-08T12:54:08Z

Summary of Changes

Hello @doringeman, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the llamacpp build configuration by incorporating necessary CMake flags to enable the compilation of core components, including the common library, server, and associated tools. Additionally, it includes a minor aesthetic improvement to the Makefile's help message for better readability.

Highlights

- Missing CMake Flags Added: Introduced critical CMake flags (-DLLAMA_BUILD_COMMON, -DLLAMA_BUILD_SERVER, -DLLAMA_BUILD_TOOLS) to the llamacpp build process, ensuring that the common library, server, and tools components are enabled during compilation.
- Help Output Alignment: Adjusted the help target in llamacpp/Makefile to use spaces instead of tabs, so the available target descriptions align consistently.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub. Gemini can make mistakes, so double-check it and use code with caution.

sourcery-ai

Hey - I've reviewed your changes and they look great!

gemini-code-assist

Code Review

This pull request adds missing CMake flags to the llamacpp/Makefile to ensure the necessary components of llama.cpp are built correctly. It also includes a stylistic change to improve the alignment of the help command's output. My review focuses on the alignment fix, suggesting a more robust method using printf to guarantee consistent formatting and improve maintainability.
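The printf-based suggestion itself is not reproduced in this thread. As a rough sketch of the idea only, assuming hypothetical helper names (`print_target`, `print_help`) and reusing the target names from the help output in the PR description, fixed-width padding could look like this:

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): pad the target name to a fixed
# width so descriptions line up regardless of tabs or name length.
# 12 is the length of the longest target name, "install-deps".
print_target() {
    printf '  %-12s - %s\n' "$1" "$2"
}

# Hypothetical wrapper that prints the same entries as `make -C llamacpp help`.
print_help() {
    echo "Available targets:"
    print_target build        "Build llama.cpp (macOS only for now)"
    print_target install-deps "Install build dependencies"
    print_target build-dir    "Print build directory path"
    print_target install-dir  "Print install directory path"
    print_target clean        "Clean build artifacts"
    print_target help         "Show this help"
}

print_help
```

The same `printf` format string could be used directly in a Makefile recipe; the merged change kept `echo` with spaces instead.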

llamacpp/Makefile

fix(scheduling): use /health endpoint for backend readiness checks

After PR docker#541 removed the custom llama.cpp server fork, the /v1/models endpoint returns 200 during model loading (upstream allows it through middleware). This caused the readiness check to pass prematurely, before the model was actually ready for inference. Switch to /health, which properly returns 503 during loading and 200 only when the backend is fully ready.

Signed-off-by: Dorin Geman <[email protected]>
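The readiness behaviour described in this commit message can be illustrated with a small polling loop. This is a minimal sketch, not the scheduler's actual implementation: the function names `probe_status` and `wait_until_ready`, the retry count, and the delay are all assumptions made for illustration. It relies only on the status codes stated above (503 while loading, 200 when ready).

```shell
#!/bin/sh
# Hypothetical probe: print the HTTP status code returned for a URL.
# curl prints 000 when the connection itself fails.
probe_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Poll BASE_URL/health until it answers 200 or the attempts run out.
# Usage: wait_until_ready BASE_URL [ATTEMPTS] [DELAY_SECONDS]
wait_until_ready() {
    base_url=$1
    attempts=${2:-30}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # "|| true" keeps the loop alive if curl exits non-zero.
        status=$(probe_status "$base_url/health") || true
        if [ "$status" = "200" ]; then
            return 0
        fi
        # 503 means the model is still loading; any other code is retried too.
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

A call such as `wait_until_ready http://localhost:8080 60 1` would wait up to about a minute; the URL here is a placeholder, not the address the backend actually listens on.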

sourcery-ai

Hey - I've reviewed your changes and they look great!

doringeman · 2026-01-08T14:22:46Z

/gemini review

gemini-code-assist

Code Review

This pull request provides a set of fixes for the llamacpp backend. The changes include adding necessary CMake flags, correcting the rpath for macOS binaries, fixing the installation logic for the server binary, and improving the readiness check to use a dedicated /health endpoint. These changes appear to be correct and well-motivated, enhancing the reliability of the build and execution of the llamacpp backend. I have one minor suggestion to improve the Makefile by using a built-in variable for better efficiency and adherence to best practices.

gemini-code-assist · 2026-01-08T14:26:06Z

llamacpp/Makefile

 	rm -rf $(INSTALL_DIR)/lib/pkgconfig
 	rm -rf $(INSTALL_DIR)/include
+	@echo "Fixing rpath..."
+	install_name_tool -delete_rpath "$(shell pwd)/$(BUILD_DIR)/bin" $(INSTALL_DIR)/bin/com.docker.llama-server


Using the built-in $(CURDIR) variable is preferred over $(shell pwd). It's more efficient as it avoids forking a shell process and is a standard practice in GNU Makefiles.

install_name_tool -delete_rpath "$(CURDIR)/$(BUILD_DIR)/bin" $(INSTALL_DIR)/bin/com.docker.llama-server

doringeman

Done in chore(llamacpp): use $(CURDIR) instead of $(shell pwd) (9bf9df1).

ericcurtin

LGTM, the CURDIR nit is probably worth picking up, assuming it works

doringeman added 2 commits January 8, 2026 14:42

fix(llamacpp): add missing cmake flags

6c3db6b

Signed-off-by: Dorin Geman <[email protected]>

style(llamacpp): use spaces for consistent help output alignment

a8b3c91

Signed-off-by: Dorin Geman <[email protected]>

sourcery-ai bot reviewed Jan 8, 2026

doringeman marked this pull request as draft January 8, 2026 12:54

gemini-code-assist bot reviewed Jan 8, 2026

fix(llamacpp): install renamed binary correctly

aa3c859

Signed-off-by: Dorin Geman <[email protected]>

doringeman mentioned this pull request Jan 8, 2026

Remove llama.cpp server fork and build upstream #541

Merged

doringeman added 2 commits January 8, 2026 15:32

fix(llamacpp): fix rpath for macOS binary

591986a

Signed-off-by: Dorin Geman <[email protected]>

doringeman marked this pull request as ready for review January 8, 2026 14:13

sourcery-ai bot reviewed Jan 8, 2026

doringeman changed the title from "fix(llamacpp): add missing cmake flags" to "Fix unforked llamacpp" Jan 8, 2026

gemini-code-assist bot reviewed Jan 8, 2026

ericcurtin approved these changes Jan 8, 2026

chore(llamacpp): use $(CURDIR) instead of $(shell pwd)

9bf9df1

More efficient as it avoids forking a shell process.

Signed-off-by: Dorin Geman <[email protected]>

doringeman merged commit fe10622 into docker:main Jan 8, 2026
5 checks passed
