Fix TRL vLLM server: implement health check and completion types#345
Fix TRL vLLM server: implement health check and completion types#345Forgingalex wants to merge 1 commit intoNousResearch:mainfrom
Conversation
|
Key changes I made include fixing Issue #183 (ensuring server.completion() returns an object with a .text attribute), implementing a server health check via the /generate endpoint, and clarifying that logprobs are not currently supported by TRL's server. Also verified everything with a new test suite. gNous |
|
Hey there! Can you give an example of how you used this so we can verify on our end? |
|
Quick heads up: the current tests don't hit the actual failure path from #183. They only instantiate OpenAI types directly, so they'd pass even if the wrapper still returns the wrong shape. I added a regression test locally that calls TrlVllmServer._completion_wrapper() with a mocked /generate response and asserts choices[0].text exists. That directly locks the bug. Happy to open a PR if y'all want. |
I fixed the TrlVllmServer implementation to ensure correct integration with TRL's vLLM server.
Also implemented the missing health check by directly pinging the generation endpoint since TRL lacks a standard health route. Made corrections to the completion wrappers to return valid Completion objects with the required
text attribute, preventing runtime errors.
Finally, I handled the missing logprobs functionality by raising a specific error that explains the limitation and added a new test suite to verify these changes.