
Conversation

@doringeman (Contributor)

If the runner is defunct, then loader.load will evict it, which means it'll also remove the recorded OpenAI requests/responses for that model. Hence, record the request after loader.load returns so it doesn't get removed in the loading process.
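
Conceptually, the patch just moves the request recording to after `loader.load` returns. Below is a minimal sketch of that ordering; all names (`loader`, `recorder`, `openAIRequest`, `handle`) are hypothetical stand-ins and do not reflect the actual model-runner types or API.

```go
// Hypothetical sketch of the reordering; none of these types match the
// real model-runner code.
package sketch

import "context"

type openAIRequest struct{ Model string }
type openAIResponse struct{ Status int }

// loader.load may evict a defunct runner, which (per the description above)
// also drops that model's recorded request/response pairs.
type loader struct{}

func (l *loader) load(ctx context.Context, model string) error { return nil }

// recorder keeps the per-model request/response records served at
// /engines/requests.
type recorder struct{}

func (r *recorder) recordRequest(model string, req *openAIRequest)    {}
func (r *recorder) recordResponse(model string, resp *openAIResponse) {}

func handle(ctx context.Context, ld *loader, rec *recorder, req *openAIRequest) (resp *openAIResponse, err error) {
	// Before the patch the request was recorded *here*. If load evicted a
	// defunct runner, the fresh record was wiped along with it, and the
	// deferred response recording then found no matching request.
	if err = ld.load(ctx, req.Model); err != nil {
		return nil, err
	}

	// After the patch: record only once load (and any eviction) has finished,
	// so the deferred response recording always finds its matching request.
	rec.recordRequest(req.Model, req)
	defer func() {
		if resp != nil {
			rec.recordResponse(req.Model, resp)
		}
	}()

	resp = &openAIResponse{Status: 200}
	return resp, nil
}
```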

Without this patch, following these steps results in a lost request-response pair: the first one issued after the runner exited (or was killed) and had to be evicted because it was defunct.

$ MODEL_RUNNER_PORT=8080 make run # in a separate terminal

$ MODEL_RUNNER_HOST=http://localhost:8080 docker model run ai/smollm2 hi

$ curl -s http://localhost:8080/engines/requests\?model\=ai/smollm2 # all good, 1 record

$ pkill com.docker.llama-server

$ MODEL_RUNNER_HOST=http://localhost:8080 docker model run ai/smollm2 hi

$ # See `ERRO[0011] Model ai/smollm2 not found in records - 200` in the terminal where model-runner is running.
# This is because the request was recorded, then `loader.load` removed it,
# and then a deferred function attempted to record the response but found no matching request.
# Hence, we lost the first request-response pair.

$ curl -s http://localhost:8080/engines/requests\?model\=ai/smollm2
No records found for model 'ai/smollm2'

@doringeman requested a review from a team on June 26, 2025, 12:28

@p1-0tr left a comment:


LGTM

@doringeman merged commit fcf45f4 into docker:main on Jun 27, 2025
1 check passed
doringeman added a commit to doringeman/model-runner that referenced this pull request Oct 2, 2025
context: Use cli's current context instead of "desktop-linux"