DRAFT for Feedback - Support for token streaming for more dynamic UX #4443
base: main
Conversation
@microsoft-github-policy-service agree
Yes, that would work. I considered that approach, but as it would be a potentially breaking change I avoided it for my testing. Happy to take a stab at it. The callers pulling from the iterator would need to be able to differentiate tokens from the other types of messages coming back (tool calls, etc.); it could be as simple as a type check. Also, since callers of on_messages_stream may want the full results (tool calls, etc.) streamed back but not the individual tokens (matching how the underlying client calls model_client.create() vs. model_client.create_stream() work today), there would need to be a way to signal to on_messages_stream that token streaming is desired, e.g.:
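A hypothetical sketch of what that signal and type check could look like; `TokenChunk` and the `stream_tokens` flag are stand-ins invented here for illustration, not existing autogen types:

```python
from dataclasses import dataclass


@dataclass
class TokenChunk:
    """Hypothetical wrapper for a single streamed completion token."""
    content: str


async def consume(agent, messages, cancellation_token):
    # Callers opt in via the hypothetical stream_tokens flag, then branch
    # on item type to separate tokens from tool calls and other messages.
    async for item in agent.on_messages_stream(
        messages, cancellation_token, stream_tokens=True
    ):
        if isinstance(item, TokenChunk):
            print(item.content, end="", flush=True)  # render tokens as they arrive
        else:
            print(f"\n[non-token message] {item}")  # tool calls, final result, etc.
```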
Thinking more on this: an advantage of the callback model vs. the async iterator is that it also works when invoking group chats. What I think would make sense is to accept a list of callbacks on Agent instantiation (LangChain does this), or alternatively to create methods for registering and removing callbacks on the agent. If any callbacks are registered, the agent will use model_client.create_stream() and invoke them with each token. Effectively this cleanly separates streamed tokens into their own path to the UI for those who want them (I really think this is only a UI need), and leaves all the 'normal' paths for group chats and inter-agent communication untouched. Thoughts?
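A hypothetical shape for that registration API; none of these names exist in autogen, they just illustrate the instantiation-time list plus register/remove methods:

```python
from typing import Awaitable, Callable

# Assumed callback signature: an async function receiving one token string.
TokenCallback = Callable[[str], Awaitable[None]]


class StreamingAgentMixin:
    """Sketch: manage a list of token callbacks on an agent."""

    def __init__(self) -> None:
        self._token_callbacks: list[TokenCallback] = []

    def register_token_callback(self, callback: TokenCallback) -> None:
        self._token_callbacks.append(callback)

    def remove_token_callback(self, callback: TokenCallback) -> None:
        self._token_callbacks.remove(callback)

    async def _emit_token(self, token: str) -> None:
        # If any callbacks are registered, the agent would use
        # model_client.create_stream() and fan each token out here.
        for callback in self._token_callbacks:
            await callback(token)
```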
Copilot reviewed 1 out of 1 changed files in this pull request and generated no suggestions.
Comments skipped due to low confidence (4)
python/packages/autogen-agentchat/src/autogen_agentchat/agents/_assistant_agent.py:260
- Ensure that the token_callback is an async function before using await. Add a check to verify whether token_callback is a coroutine function (see the sketch after this list).
if self._token_callback is not None:
python/packages/autogen-agentchat/src/autogen_agentchat/agents/_assistant_agent.py:188
- [nitpick] The error message should be more informative. Suggestion: 'The model does not support function calling, which is required for the provided tools.'
raise ValueError("The model does not support function calling.")
python/packages/autogen-agentchat/src/autogen_agentchat/agents/_assistant_agent.py:271
- [nitpick] The error message should be more informative. Suggestion: 'Unsupported tool type provided. Expected Tool or callable, but got {type(tool)}.'
raise ValueError(f"Unsupported tool type: {type(tool)}")
python/packages/autogen-agentchat/src/autogen_agentchat/agents/_assistant_agent.py:330
- [nitpick] The error message should be more informative. Suggestion: 'Unsupported handoff type provided. Expected HandoffBase or str, but got {type(handoff)}.'
raise ValueError(f"Unsupported handoff type: {type(handoff)}")
@jspv Since this feature is targeting 0.4.1, do you want to join our Discord channel so we can discuss? https://aka.ms/autogen-discord
@jspv Agent output should go via the runtime (message publishing). The reason this is important is so that cross-process communication works as expected; the callback approach will only work in a single process. While agentchat is currently single-process only, we are expanding it to work with the same distributed expectations as core in an upcoming release, so we will likely get to tackling partial message streaming then. However, in saying all this, if callbacks work well for you and the constraints I mentioned above don't apply to you, then I would encourage you to use them! Given the modular architecture of the framework, nothing prevents you from layering that approach on top yourself.
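For contrast, a minimal, framework-free sketch of the publishing model described above: tokens go onto a queue standing in for the runtime's message bus, so any subscriber can consume them. The queue and message type are illustrative only, not autogen APIs:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class PartialTokenMessage:
    """Illustrative message type carrying one streamed token."""
    source: str
    content: str


async def produce(bus: asyncio.Queue) -> None:
    # The agent publishes each token to the "runtime" instead of
    # invoking a UI callback directly.
    for token in ["Hello", ",", " world", "!"]:
        await bus.put(PartialTokenMessage(source="assistant", content=token))
    await bus.put(None)  # sentinel: stream finished


async def consume(bus: asyncio.Queue) -> None:
    # Any subscriber (a UI, a logger, a cross-process bridge) drains the bus.
    while (msg := await bus.get()) is not None:
        print(msg.content, end="", flush=True)


async def main() -> None:
    bus: asyncio.Queue = asyncio.Queue()
    await asyncio.gather(produce(bus), consume(bus))


asyncio.run(main())
```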
Understood. Thanks for the feedback; happy to assist where I can. My thinking on high-level requirements so far is: [...]
Does this seem reasonable? Is there a natural approach to modifying the message structure to support this? I'm happy to prototype the change.
Why are these changes needed?
ChatCompletionClient nicely supports token-level streaming via create_stream, but this method is currently not accessible in the AssistantAgent. This proposed change adds an option to pass a token_callback when instantiating AssistantAgent. If provided, create_stream will be leveraged instead of create when calling on_messages_stream, giving the calling application access to the returned tokens in real time. Nothing else is changed; the normal returns to on_messages_stream are not affected.

Example:
If folks feel this is a good idea, I will make the appropriate updates to documentation and tests.
Related issue number
Checks