
Conversation

@alcholiclg
Collaborator

Change Summary

  1. Improve context handling in fin_research
  2. Update the README for the code tool
  3. Add spec_loader support for fin_research
  4. Add support for the agent tool
  5. Support timeout interruption for docling
  6. Add support for jina_reader

Related issue number

Checklist

  • The pull request title is a good summary of the changes - it will be used in the changelog
  • Unit tests for the changes exist
  • Run pre-commit install and pre-commit run --all-files before git commit, and pass the lint check.
  • Documentation reflects the changes where applicable

@gemini-code-assist

Summary of Changes

Hello @alcholiclg, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the fin_research application by improving its user interface, introducing a new AgentTool for better modularity, and upgrading document loading capabilities with timeout handling and JinaReader integration. A major refactoring replaces the principle_skill with a versatile SpecLoader to manage various reporting specifications, complemented by updated agent prompts that leverage these new tools and provide clearer operational guidelines.

Highlights

  • Enhanced Financial Research UI: The fin_research application's user interface has been updated with a new introductory section providing a detailed description of the multi-agent analysis engine, improved search API key placeholder text, and a new 'Examples' link.
  • Introduced Agent Tool: A new AgentTool class has been added, allowing existing ms-agent agents to be exposed as callable tools, significantly enhancing the modularity and reusability of agent functionalities.
  • Improved Docling Timeout Handling: The docling document loader now includes timeout interruptions for network requests and utilizes a ThreadPoolExecutor with a configurable number of workers for better performance and stability when processing URLs.
  • Added Jina Reader Support: A new JinaReader utility has been integrated to asynchronously fetch and preprocess text content from URLs, including retry mechanisms and backoff strategies for robust web content retrieval.
  • Refactored Principle/Spec Loading: The principle_skill tool has been replaced by a more generic SpecLoader tool, which now supports loading both 'principle specs' and new 'writing specs' for financial reports, along with dedicated Markdown files for these specifications.
  • Updated Agent Prompts: System prompts across aggregator, analyst, collector, and orchestrator agents have been refined to improve context, workflow instructions, tool calling protocols, and safety constraints, including explicit guidance on using the new spec_loader.
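
The timeout-plus-thread-pool pattern mentioned for the docling loader can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `check_urls`, `fake_probe`, `timeout_s`, and `max_workers` are hypothetical names, and the probe is injected so the example runs without network access. A real probe might wrap something like `requests.head(url, timeout=...)`.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict, Iterable


def check_urls(urls: Iterable[str],
               probe: Callable[[str], bool],
               timeout_s: float = 5.0,
               max_workers: int = 4) -> Dict[str, bool]:
    """Probe URLs concurrently; a probe that is too slow or raises counts as failed."""
    results: Dict[str, bool] = {}
    pool = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = {url: pool.submit(probe, url) for url in urls}
        for url, future in futures.items():
            try:
                # Wait at most timeout_s for this particular result.
                results[url] = bool(future.result(timeout=timeout_s))
            except FutureTimeout:
                results[url] = False  # stop waiting on a slow probe
            except Exception:
                results[url] = False  # the probe itself raised
    finally:
        # Return without blocking on stragglers. Threads cannot be killed,
        # so the probe should also carry its own network timeout.
        pool.shutdown(wait=False)
    return results


def fake_probe(url: str) -> bool:
    """Hypothetical stand-in for a real network check."""
    if "slow" in url:
        time.sleep(0.5)  # simulate a hung connection
    return True
```

Because the futures are awaited one after another, the worst-case total wait in this sketch is roughly the number of URLs times `timeout_s`; a shared deadline would tighten that bound.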
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces significant enhancements to the ms-agent framework, primarily by adding a new AgentTool that allows agents to call other agents as sub-tools, improving modularity and reusability. It also integrates a JinaReader for robust asynchronous fetching and processing of web content with retry logic and timeouts. The document loading (doc_loader.py) is made more resilient and concurrent by adjusting requests timeouts and utilizing ThreadPoolExecutor for URL validation. The fin_research application's UI is updated with a new introductory section, an 'Examples' link, and refined placeholder/status messages. Crucially, the aggregator.yaml agent's prompt is updated to enforce a multi-phase workflow, requiring continuous tool usage between phases, and introduces a new spec_loader tool to replace the principle_skill for retrieving both principle and writing style specifications. Review comments highlighted areas for improvement in AgentTool's message role handling, JinaReader's broad exception catching, and the aggregator agent's tool calling protocol to ensure clarity and robustness, along with a suggestion to return structured sections from spec_loader for better LLM processing.

Comment on lines +266 to +274
    Message(
        role=msg.get('role', 'user'),
        content=msg.get('content', ''),
        tool_calls=msg.get('tool_calls', []),
        tool_call_id=msg.get('tool_call_id'),
        name=msg.get('name'),
        reasoning_content=msg.get('reasoning_content', ''),
    ) for msg in raw_messages  # TODO: Change role to user or not
]


medium

The TODO comment on line 273 (# TODO: Change role to user or not) suggests some uncertainty about how message roles should be handled when passing them to a sub-agent. While the current implementation defaults to the 'user' role if it's missing, this might not be appropriate for all sub-agents, which could expect a more structured conversation history with alternating user and assistant roles.

To improve robustness and clarity, it would be beneficial to resolve this TODO. If the sub-agent is designed to process a sequence of messages, preserving the original roles is crucial. If a role is missing from the input, logging a warning could help diagnose issues during integration.
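
One way to act on this suggestion is sketched below: keep the original role when it is present and recognized, and log a warning before falling back to a default. The `Message` dataclass here is only a stand-in whose fields mirror the snippet under review, and `to_messages`, `VALID_ROLES`, and `default_role` are hypothetical names, not part of ms-agent.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

logger = logging.getLogger(__name__)

# Assumed set of roles; adjust to whatever the sub-agent actually accepts.
VALID_ROLES = {"system", "user", "assistant", "tool"}


@dataclass
class Message:
    """Stand-in for the framework's Message class (fields mirror the diff)."""
    role: str
    content: str = ""
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)
    tool_call_id: Optional[str] = None
    name: Optional[str] = None
    reasoning_content: str = ""


def to_messages(raw_messages: List[Dict[str, Any]],
                default_role: str = "user") -> List[Message]:
    """Convert raw dicts to Message objects, preserving roles where possible."""
    messages: List[Message] = []
    for index, msg in enumerate(raw_messages):
        role = msg.get("role")
        if role not in VALID_ROLES:
            # Surface the problem instead of silently rewriting the history.
            logger.warning("Message %d has missing or unknown role %r; using %r",
                           index, role, default_role)
            role = default_role
        messages.append(Message(
            role=role,
            content=msg.get("content", ""),
            tool_calls=msg.get("tool_calls") or [],
            tool_call_id=msg.get("tool_call_id"),
            name=msg.get("name"),
            reasoning_content=msg.get("reasoning_content", ""),
        ))
    return messages
```

Whether the fallback should be 'user' at all is the open question behind the TODO; the sketch only makes the fallback visible in the logs.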

Comment on lines +86 to +94
        except Exception:
            # Unknown error; do not loop excessively
            if attempt <= config.retries:
                sleep_s = min(config.backoff_max,
                              config.backoff_base * (2**(attempt - 1)))
                sleep_s *= random.uniform(0.7, 1.4)
                time.sleep(sleep_s)
                continue
            return ''


medium

The except Exception: block catches all exceptions and retries. This is generally too broad and can mask underlying bugs in the code (like TypeError, ValueError, etc.) by treating them as transient network errors. It's better to catch specific, expected exceptions that are known to be retryable.

I recommend refining the exception handling to catch more specific, retryable exceptions, such as socket.timeout or other connection-related errors from urllib. Unexpected exceptions should be allowed to propagate or at least be logged as errors before returning, so they can be debugged and fixed.

        except Exception as e:
            # Unknown error; fail fast and log it for debugging.
            # from ms_agent.utils import get_logger
            # logger = get_logger()
            # logger.warning("Unexpected error fetching URL %s: %s", url, e)
            return ''
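
The narrower handling recommended above could look roughly like the following sketch, which retries only on an explicit tuple of transient errors and lets everything else propagate. It is not the JinaReader implementation: `fetch_with_retry`, `RETRYABLE`, and the injected `fetch` and `sleep` callables are hypothetical, and the backoff parameters simply echo the names in the snippet under review.

```python
import random
import socket
import time
import urllib.error
from typing import Callable

# Errors treated as transient. Note that urllib.error.HTTPError subclasses
# URLError, so HTTP status errors are retried too unless handled separately.
RETRYABLE = (urllib.error.URLError, socket.timeout, TimeoutError, ConnectionError)


def fetch_with_retry(fetch: Callable[[str], str],
                     url: str,
                     retries: int = 3,
                     backoff_base: float = 0.5,
                     backoff_max: float = 8.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch(url); retry transient errors with jittered exponential backoff."""
    for attempt in range(1, retries + 2):  # one initial try plus `retries` retries
        try:
            return fetch(url)
        except RETRYABLE:
            if attempt > retries:
                return ""  # retries exhausted; give up quietly
            delay = min(backoff_max, backoff_base * (2 ** (attempt - 1)))
            sleep(delay * random.uniform(0.7, 1.4))  # jitter avoids lockstep retries
        # Any other exception (TypeError, ValueError, ...) propagates to the caller.
    return ""
```

Injecting `sleep` keeps the function testable without real delays; in production the default `time.sleep` applies.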

Comment on lines 82 to 89
<Tool_Calling_Protocol>
- Use standard OpenAI function calling to invoke tools. \
Do NOT output code in assistant's natural language output.
- Every turn MUST include at least one tool call, unless you're providing the FINAL summary.
- Use standard OpenAI function calling to invoke tools. Do NOT output code in assistant's natural language output.
- If you use [ACT=code], [ACT=collect], or [ACT=fix], you MUST include at least one tool call in that turn.
- If you use [ACT=report], you MUST output the comprehensive summary in markdown format and MUST NOT call any tools in that turn.
- After each tool call, carefully review the output.
- State explicitly what you learned and what comes next.
- Continue calling tools until you have sufficient evidence to conclude.
- When analysis is complete and you need to provide a comprehensive summary, \
you can use [ACT=report] without tools and stop.
</Tool_Calling_Protocol>


medium

The new tool calling protocol is more specific, which is great. However, by removing the old rule (Every turn MUST include at least one tool call, unless you're providing the FINAL summary.), a potential ambiguity is introduced. The new rules only cover turns with [ACT=code], [ACT=collect], [ACT=fix], or [ACT=report].

If the agent produces a response without one of these tags, the protocol is now ambiguous. To make the instructions more robust against unexpected agent behavior, consider adding a catch-all rule.

For example, you could add:
Unless you are using [ACT=report] to provide the final summary, every turn MUST include at least one tool call.
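
To make the catch-all rule concrete, here is a small sketch of how a turn could be checked against the protocol on the calling side. It assumes the `[ACT=...]` tag appears literally in the assistant text and that tool calls arrive as a list; `check_turn` and `ACT_TAG` are hypothetical helpers, not part of the PR.

```python
import re
from typing import List, Optional

# Matches tags such as [ACT=code], [ACT=collect], [ACT=fix], [ACT=report].
ACT_TAG = re.compile(r"\[ACT=(\w+)\]")


def check_turn(content: str, tool_calls: List[dict]) -> Optional[str]:
    """Return None if the turn follows the protocol, else a short violation message."""
    match = ACT_TAG.search(content or "")
    action = match.group(1) if match else None

    if action == "report":
        # The final summary must not call tools.
        return "report turn must not call tools" if tool_calls else None

    # Catch-all: any other turn, tagged or untagged, needs at least one tool call.
    if not tool_calls:
        return f"turn with action {action!r} must include at least one tool call"
    return None
```

With the catch-all in place, an untagged turn without tool calls is flagged rather than left undefined.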

Comment on lines +311 to +317
return json.dumps(
    {
        'success': True,
        'sections': join_with.join(sections)
    },
    ensure_ascii=False,
    indent=2)


medium

In load_specs, when format is 'markdown', the sections list is joined into a single string. This differs from the previous implementation in principle_skill.py, which returned a list of strings.

While this might be intentional, returning a single block of text can make it harder for the LLM to distinguish between different specs. Providing a list of sections offers more structure, which can help the agent process each spec individually. If a single string is needed, the agent can easily join the list itself.

Suggested change
- return json.dumps(
-     {
-         'success': True,
-         'sections': join_with.join(sections)
-     },
-     ensure_ascii=False,
-     indent=2)
+ return json.dumps(
+     {
+         'success': True,
+         'sections': sections
+     },
+     ensure_ascii=False,
+     indent=2)
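
The trade-off can be illustrated with a short sketch in which the tool returns a list and a consumer joins it only when a single string is needed. `build_response` and `render_sections` are hypothetical helpers for illustration; only the `success` and `sections` keys come from the code under review.

```python
import json
from typing import List


def build_response(sections: List[str]) -> str:
    """Return the structured form: one list entry per spec section."""
    return json.dumps({"success": True, "sections": sections},
                      ensure_ascii=False, indent=2)


def render_sections(payload: str, join_with: str = "\n\n") -> str:
    """Join sections on the consumer side when one block of text is wanted."""
    data = json.loads(payload)
    sections = data.get("sections", [])
    if isinstance(sections, str):
        # Tolerate the already-joined form for backward compatibility.
        return sections
    return join_with.join(sections)
```

Keeping the list in the payload preserves section boundaries, while callers that want the joined text lose nothing.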



1 participant