PoC: InferenceClient is also a MCPClient
#2986
base: main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

mcp_client.py (outdated)

```python
self.exit_stack = AsyncExitStack()
self.available_tools: List[ChatCompletionInputTool] = []

async def add_mcp_server(self, command: str, args: List[str]):
```
Why not name this method `connect_to_server`? We can't add multiple MCP servers to a single instance of the client, can we?
yes we can
I mean, we would need to store a map of sessions, but there's nothing preventing us from doing it, conceptually.
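A minimal sketch of the "map of sessions" idea, with made-up names (`Session` here is a stand-in for the real MCP `ClientSession`, and none of this is the PR's actual code): the client keeps one session per server and indexes each tool back to the session that owns it, so tool calls can be routed correctly.

```python
# Hypothetical sketch: one MCPClient holding several server sessions keyed by
# server name. `Session` is a placeholder, not the real MCP ClientSession.
from typing import Dict, List


class Session:
    """Stand-in for an MCP ClientSession (assumption for illustration)."""

    def __init__(self, tools: List[str]):
        self.tools = tools


class MCPClient:
    def __init__(self):
        self.sessions: Dict[str, Session] = {}          # server name -> session
        self.tool_to_session: Dict[str, Session] = {}   # tool name -> owning session

    def add_mcp_server(self, name: str, session: Session):
        # Register the session and index its tools so a tool call can later be
        # dispatched to the server that actually exposes that tool.
        self.sessions[name] = session
        for tool in session.tools:
            self.tool_to_session[tool] = session


client = MCPClient()
client.add_mcp_server("files", Session(tools=["read_file"]))
client.add_mcp_server("web", Session(tools=["fetch_url"]))
print(sorted(client.tool_to_session))  # one routing table across both servers
```

The tool-name index is the only extra state needed beyond the naive single-session version; name collisions between servers would need a policy (last-wins here).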
Let me know if the following question is out of scope.

Question: how would I connect to multiple MCP servers? Would it look like option 1 or option 2?

Option 1:

```python
client1 = MCPClient()
client1.add_mcp_server()
client2 = MCPClient()
client2.add_mcp_server()
```

or Option 2:

```python
client = MCPClient()
client.add_mcp_server(server1)
client.add_mcp_server(server2)
```
Another design question: `class MCPClient(AsyncInferenceClient)` vs `class AsyncInferenceClient(...args, mcp_clients: MCPClient[])`
Sorry to chime in unannounced, but from a very removed, external-user standpoint, I find this all very confusing: I just don't think what you coded should be called `MCPClient` 😄

When I came to this PR I was fully expecting `MCPClient` to be passed as a parameter to `InferenceClient`, though I hear @Wauplin above, so why not a wrapper. But the end result is really more of an `InferenceClientWithEmbeddedMCP` to me, not an `MCPClient`.

That being said, it's just about semantics, but I'm kind of a semantics extremist, sorry about that (and feel free to completely disregard this message, as is very likely XD)
> I was fully expecting `MCPClient` to be passed as a parameter to `InferenceClient`

What do you mean as a parameter? Do you have an example signature?
second option of #2986 (comment)
ah yes, sure, we can probably add this I guess
Ah, actually, with the async/await stuff I'm not so sure.
A few comments off the top of my head: SSE support would not be so hard to add and could really be a nice addition.

```python
self.exit_stack = AsyncExitStack()
self.available_tools: List[ChatCompletionInputTool] = []

async def add_mcp_server(self, command: str, args: List[str], env: Dict[str, str]):
```
You would need to lighten the requirements on your args a bit if you want to make it work with SSE. Or is the intent just to support stdio? The rest seems to focus on stdio, so maybe it's by design.
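One way to loosen the `command`/`args`/`env` requirement would be to accept a per-transport server spec instead of stdio-shaped positional arguments. A sketch under that assumption (the dataclass names are illustrative, not the PR's or the `mcp` package's API):

```python
# Sketch: a union of server specs so add_mcp_server is not hard-wired to
# stdio-only arguments. StdioServer/SSEServer are hypothetical names.
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class StdioServer:
    command: str
    args: List[str] = field(default_factory=list)
    env: Dict[str, str] = field(default_factory=dict)


@dataclass
class SSEServer:
    url: str  # an SSE server only needs an endpoint, not a command/args pair


ServerSpec = Union[StdioServer, SSEServer]


def describe(server: ServerSpec) -> str:
    # Dispatch on the spec type; a real client would open the matching
    # transport here instead of returning a string.
    if isinstance(server, StdioServer):
        return f"stdio: {server.command} {' '.join(server.args)}"
    return f"sse: {server.url}"


print(describe(StdioServer("npx", ["some-mcp-server"])))
print(describe(SSEServer("http://localhost:8000/sse")))
```

The point is only that the signature stops baking in stdio assumptions; the actual transport wiring would live behind the same `add_mcp_server` entry point.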
```python
"function": {
    "name": tool.name,
    "description": tool.description,
    "parameters": tool.inputSchema,
```
Just a note that I have seen some MCP servers with jsonref in their schemas, which sometimes confuses the model. In mcpadapt I had to resolve the jsonref before passing it to the model; might be minor for now.
Confusing, or sometimes plain unsupported by the model SDK, like Google GenAI...
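To illustrate the ref-resolution step being described, here is a stdlib-only sketch that inlines local `#/...` JSON-pointer `"$ref"`s in a tool schema before it reaches the model. Real code would more likely use the `jsonref` package (as mcpadapt does), and this toy version does not handle cyclic or external references:

```python
# Minimal sketch: inline local "$ref" pointers in a JSON schema so the model
# never sees a "$ref". Stdlib-only; cycles and external refs are not handled.
from typing import Any


def resolve_refs(node: Any, root: dict) -> Any:
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            # Walk the pointer path ("#/definitions/Point" -> root["definitions"]["Point"])
            target: Any = root
            for part in ref[2:].split("/"):
                target = target[part]
            return resolve_refs(target, root)
        return {k: resolve_refs(v, root) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, root) for item in node]
    return node


schema = {
    "definitions": {"Point": {"type": "object", "properties": {"x": {"type": "number"}}}},
    "properties": {"origin": {"$ref": "#/definitions/Point"}},
}
flat = resolve_refs(schema, schema)
print(flat["properties"]["origin"]["type"])  # the $ref is inlined to "object"
```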
```python
ToolName: TypeAlias = str


class MCPClient:
```
Could potentially also support 2 syntaxes, like in the MCPClient for smolagents:
- like you have, with try and finally
- one as a context manager, where you directly pass the MCP client settings + the MCP servers with the tools you want. This would allow something like `with MCPClient(...) as client:` and then just run `client.chat.completion` with the tools.
Required reading
https://modelcontextprotocol.io/quickstart/client
TL;DR: MCP is a standard API to expose sets of Tools that can be hooked to LLMs
Summary of how to use this
Open questions

- Implement this on (Async)InferenceClient directly? Or on a distinct class, like here?

Where to find the MCP Server used here as an example
Note that you can replace it with any MCP Server, from this doc for instance: https://modelcontextprotocol.io/examples
https://gist.github.com/julien-c/0500ba922e1b38f2dc30447fb81f7dc6
Script output
Generation from LLM with tools
3D Model Generation from Text
Here are some of the best apps that can generate 3D models from text:
Shap-E
Hunyuan3D-1
LGM
3D-Adapter
Fictiverse-Voxel_XL_Lora
Best Paper on Transformers
One of the most influential and highly cited papers on transformers is:
If you are looking for more recent advancements, here are a few other notable papers:
Title: "RoFormer: Enhanced Transformer with Rotary Position Embedding"
Link: huggingface.co/papers/2104.09864
Description: This paper introduces RoFormer, which improves the Transformer by using rotary position embeddings.
Title: "Performer: Generalized Attention with Gaussian Kernels for Sequence Modeling"
Link: huggingface.co/papers/2009.14794
Description: This paper presents Performer, a more efficient and scalable version of the Transformer.
Title: "Longformer: The Long-Document Transformer"
Link: huggingface.co/papers/2004.05150
Description: This paper introduces Longformer, which extends the Transformer to handle very long documents.
These resources should provide you with a solid foundation for both generating 3D models from text and understanding the latest advancements in transformer models.