Skip to content

Conversation

@qbey
Copy link
Member

@qbey qbey commented Nov 3, 2025

Purpose

When the tool call fails, we don't want the user to be blocked without any clue.

This adds:

  • auto retry when possible
  • nice message preventing the LLM to generate an answer without updated information

Proposal

  • use ModelRetry exception to allow Pydantic AI to manage exception properly
  • add decorator to manage "no more retry" while preventing the LLM to build an answer without fresh information
  • add tests

Summary by CodeRabbit

  • New Features

    • Web search and RAG document workflows converted to async for non-blocking fetch, summarization, and store operations.
    • Tools gain soft-retry behavior (limited automatic retries) and clearer non-retryable error types; some tools now accept execution context.
  • Bug Fixes

    • Improved handling of rate limits, timeouts, HTTP and document-fetch errors with better retry semantics and fallbacks.
  • Tests

    • Expanded async and retry-focused tests, including document-backend and formatting scenarios.
  • Documentation

    • Changelog updated about resilient web-search behavior.

@qbey qbey self-assigned this Nov 3, 2025
@qbey qbey added enhancement New feature or request backend labels Nov 3, 2025
@coderabbitai
Copy link

coderabbitai bot commented Nov 3, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Adds async variants for RAG backends and Brave web-search flows, introduces retry-related exceptions and a soft-fail retry wrapper, enables retries for several web-search tools, and expands tests to cover async behavior and retry semantics.

Changes

Cohort / File(s) Summary
RAG Backend Async Support
src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py, src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py
Added async methods acreate_collection, adelete_collection, astore_document, asearch; BaseRagBackend provides async wrappers (via sync_to_async) and temporary_collection_async; AlbertRagBackend implements httpx-based async HTTP calls and uses AsyncClient with per-request timeouts.
Web Search Async Refactor
src/backend/chat/tools/web_search_brave.py
Converted web-search flow to async: _query_brave_api_async, _fetch_url_async, _fetch_and_extract_async, _extract_and_summarize_snippets_async, _fetch_and_store_async, llm_summarize_async; made web_search_brave and web_search_brave_with_document_backend async; added async caching, concurrent fetch/store, and RAG integration; introduced WebSearchError, BraveAPIError, DocumentFetchError and new error-to-retry mappings.
Retry Exceptions & Soft-Fail Wrapper
src/backend/chat/tools/exceptions.py, src/backend/chat/tools/utils.py
Added ModelRetryLast and ModelCannotRetry; added last_model_retry_soft_fail decorator to handle ModelRetry/ModelCannotRetry, increment retry counts in RunContext, enforce max_retries, re-raise when retries remain, and return user-facing messages when exhausted.
Tool Configurations
src/backend/chat/tools/__init__.py
Set max_retries=2 for multiple web-search tools and changed web_search_brave to take context (takes_ctx=True).
Tests — Utils & Brave Web Search
src/backend/chat/tests/tools/test_utils.py, src/backend/chat/tests/tools/test_web_search_brave.py
Added tests for last_model_retry_soft_fail (metadata, normal flow, retry behaviors). Migrated Brave web-search tests to async (AsyncMock/respx/httpx/pytest-asyncio), expanded scenarios (success, failures, retries, caching, parallel processing, document-backend integration).
Changelog
CHANGELOG.md
Added Unreleased note about web-search resilience when a tool call fails.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Wrapper as last_model_retry_soft_fail
    participant Tool
    participant Ctx as RunContext

    Caller->>Wrapper: invoke tool(ctx, args...)
    Wrapper->>Tool: await tool(ctx, args...)
    alt Success
        Tool-->>Wrapper: result
        Wrapper-->>Caller: return result
    else ModelCannotRetry
        Tool-->>Wrapper: raises ModelCannotRetry(msg)
        Wrapper-->>Caller: return msg
    else ModelRetry
        Tool-->>Wrapper: raises ModelRetry(msg)
        Wrapper->>Ctx: increment retry count
        Wrapper->>Ctx: compare with max_retries
        alt Retries remain
            Wrapper-->>Caller: re-raise ModelRetry
        else Exhausted
            Wrapper-->>Caller: return composed user-facing message
        end
    end
Loading
sequenceDiagram
    participant User
    participant WS as web_search_brave
    participant API as _query_brave_api_async
    participant Fetch as _fetch_and_extract_async
    participant LL as llm_summarize_async
    participant RAG as document_store.asearch
    participant Cache as async cache

    User->>WS: query
    WS->>API: _query_brave_api_async(query)
    API->>API: httpx POST Brave API
    alt API 200
        API-->>WS: search results
    else 429/5xx
        API-->>WS: raise ModelRetry
    else 4xx
        API-->>WS: raise ModelCannotRetry
    end

    WS->>Cache: check cached extractions
    WS->>Fetch: parallel _fetch_and_extract_async(urls)
    par per-URL
        Fetch->>Fetch: fetch URL
        Fetch->>LL: summarize snippets (optional)
        LL-->>Fetch: snippet summaries
    end
    Fetch-->>WS: extracted results

    alt document-backend enabled
        WS->>RAG: temporary_collection_async + astore_document calls
        WS->>RAG: asearch(query)
        RAG-->>WS: RAG matches
        WS->>WS: merge results
    end

    WS-->>User: ToolReturn(formatted results)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

  • Focus review on:
    • src/backend/chat/tools/web_search_brave.py — async control flow, error-to-retry mapping, caching, concurrency, and RAG integration.
    • src/backend/chat/tests/tools/test_web_search_brave.py — correctness of async mocks and coverage of edge cases.
    • src/backend/chat/tools/utils.py and src/backend/chat/tools/exceptions.py — retry counting, re-raise behavior, and composed user messages.
    • src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py — httpx AsyncClient usage, timeouts, multipart upload handling, and response parsing.

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title '⚡️(web-search) keep running when tool call fails' directly aligns with the main objective of the changeset. The title clearly indicates that the primary change involves preventing the application from blocking when a web-search tool call fails by implementing retry logic. This matches the substantial changes made to the web search tools and supporting infrastructure across multiple files.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f7f5f66 and e51620a.

📒 Files selected for processing (9)
  • CHANGELOG.md (1 hunks)
  • src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (5 hunks)
  • src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (5 hunks)
  • src/backend/chat/tests/tools/test_utils.py (1 hunks)
  • src/backend/chat/tests/tools/test_web_search_brave.py (7 hunks)
  • src/backend/chat/tools/__init__.py (1 hunks)
  • src/backend/chat/tools/exceptions.py (1 hunks)
  • src/backend/chat/tools/utils.py (1 hunks)
  • src/backend/chat/tools/web_search_brave.py (5 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qbey qbey linked an issue Nov 3, 2025 that may be closed by this pull request
@qbey qbey force-pushed the qbey/web-search-tools-retry branch from 282517c to b575f5c Compare November 3, 2025 08:59
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
src/backend/chat/tools/utils.py (2)

14-28: Update docstring to match actual behavior.

The docstring states "a ModelCannotRetry exception is raised" when max retries are reached, but the actual implementation returns a string message instead (lines 44-47). Update the docstring to accurately describe the behavior.

Apply this diff:

     """
     Wrap a tool function to handle ModelRetry exceptions.
 
     If the tool function raises ModelRetry and the maximum number of retries
-    has been reached, a ModelCannotRetry exception is raised instead.
+    has been reached, a composed error message is returned to the LLM instead.
+    If the tool function raises ModelCannotRetry, the exception message is
+    returned as a string.
 
     Args:
         tool_func: The original tool function to wrap.
 
     Returns:
         A wrapped tool function with retry handling.
     """

37-37: Improve logging clarity.

Line 37 logs ctx (the entire RunContext object) instead of ctx.tool_name, which would be more informative for debugging.

Apply this diff:

-            logger.error("Tool '%s' raised ModelRetry: %s", ctx, exc.message)
+            logger.error("Tool '%s' raised ModelRetry: %s", ctx.tool_name, exc.message)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (1)

74-84: Remove redundant timeout specification.

The httpx.AsyncClient is created with timeout=settings.ALBERT_API_TIMEOUT on line 74, and then the same timeout is passed again to client.post() on line 83. The timeout specified in the AsyncClient constructor already applies to all requests made by that client, making the parameter on line 83 redundant.

Apply this diff:

         async with httpx.AsyncClient(timeout=settings.ALBERT_API_TIMEOUT) as client:
             response = await client.post(
                 self._collections_endpoint,
                 headers=self._headers,
                 json={
                     "name": name,
                     "description": description or self._default_collection_description,
                     "visibility": "private",
                 },
-                timeout=settings.ALBERT_API_TIMEOUT,
             )
             response.raise_for_status()

Apply similar changes to lines 109, 207, and 274 where the same pattern occurs.

src/backend/chat/tools/exceptions.py (1)

6-14: Remove unused ModelRetryLast exception class or document intent for future use.

Verification confirms that ModelRetryLast is never imported or used anywhere in the codebase. The last_model_retry_soft_fail decorator uses manual message construction instead. If this class is not planned for future use, it should be removed to keep the codebase clean.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1901c4d and b575f5c.

📒 Files selected for processing (9)
  • CHANGELOG.md (1 hunks)
  • src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (5 hunks)
  • src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (5 hunks)
  • src/backend/chat/tests/tools/test_utils.py (1 hunks)
  • src/backend/chat/tests/tools/test_web_search_brave.py (7 hunks)
  • src/backend/chat/tools/__init__.py (1 hunks)
  • src/backend/chat/tools/exceptions.py (1 hunks)
  • src/backend/chat/tools/utils.py (1 hunks)
  • src/backend/chat/tools/web_search_brave.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (4)
src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (4)
  • acreate_collection (30-35)
  • adelete_collection (98-103)
  • astore_document (64-73)
  • asearch (111-115)
src/backend/chat/agent_rag/document_search/albert_api.py (1)
  • collection_id (56-62)
src/backend/chat/agent_rag/constants.py (3)
  • RAGWebResults (27-41)
  • RAGWebResult (17-24)
  • RAGWebUsage (8-14)
src/backend/chat/agent_rag/albert_api_constants.py (1)
  • Searches (145-150)
src/backend/chat/tools/utils.py (1)
src/backend/chat/tools/exceptions.py (1)
  • ModelCannotRetry (17-23)
src/backend/chat/tests/tools/test_web_search_brave.py (4)
src/backend/chat/tools/exceptions.py (1)
  • ModelCannotRetry (17-23)
src/backend/chat/tools/web_search_brave.py (8)
  • web_search_brave (240-289)
  • DocumentFetchError (35-36)
  • _extract_and_summarize_snippets_async (110-133)
  • _fetch_and_extract_async (77-107)
  • _fetch_and_store_async (136-148)
  • _query_brave_api_async (151-215)
  • format_tool_return (218-236)
  • web_search_brave_with_document_backend (293-367)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (3)
  • asearch (253-296)
  • astore_document (187-210)
  • store_document (165-185)
src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (4)
  • asearch (111-115)
  • temporary_collection_async (131-139)
  • astore_document (64-73)
  • store_document (53-62)
src/backend/chat/tools/web_search_brave.py (4)
src/backend/chat/tools/exceptions.py (1)
  • ModelCannotRetry (17-23)
src/backend/chat/tools/utils.py (1)
  • last_model_retry_soft_fail (14-50)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (2)
  • astore_document (187-210)
  • asearch (253-296)
src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (3)
  • astore_document (64-73)
  • temporary_collection_async (131-139)
  • asearch (111-115)
src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (2)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (8)
  • acreate_collection (69-88)
  • create_collection (50-67)
  • astore_document (187-210)
  • store_document (165-185)
  • adelete_collection (101-111)
  • delete_collection (90-99)
  • search (212-251)
  • asearch (253-296)
src/backend/chat/agent_rag/constants.py (1)
  • RAGWebResults (27-41)
src/backend/chat/tools/__init__.py (3)
src/backend/chat/tools/web_search_brave.py (2)
  • web_search_brave (240-289)
  • web_search_brave_with_document_backend (293-367)
src/backend/chat/tools/web_search_tavily.py (1)
  • web_search_tavily (8-38)
src/backend/chat/tools/web_seach_albert_rag.py (1)
  • web_search_albert_rag (11-31)
src/backend/chat/tests/tools/test_utils.py (2)
src/backend/chat/tools/exceptions.py (1)
  • ModelCannotRetry (17-23)
src/backend/chat/tools/utils.py (1)
  • last_model_retry_soft_fail (14-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test-e2e-chromium
🔇 Additional comments (8)
CHANGELOG.md (1)

14-14: LGTM!

The changelog entry accurately describes the enhancement and follows the project's format conventions.

src/backend/chat/tests/tools/test_utils.py (1)

13-154: Excellent test coverage!

The tests comprehensively cover all aspects of the last_model_retry_soft_fail decorator including metadata preservation, normal execution, retry behavior, max retry handling, and non-retryable errors. The test cases are well-structured and clearly document the expected behavior.

src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (2)

30-35: LGTM!

The async wrapper methods using sync_to_async provide a clean fallback for subclasses that haven't implemented native async versions. This allows gradual migration to async while maintaining backward compatibility.


129-139: LGTM!

The async context manager correctly mirrors the synchronous version, properly handling collection creation and cleanup with async/await semantics.

src/backend/chat/tests/tools/test_web_search_brave.py (2)

57-1096: Excellent async test coverage!

The test suite comprehensively covers the async Brave search implementation including:

  • Success cases with/without extra snippets
  • Summarization flows
  • Empty results and error handling
  • Rate limiting and timeout scenarios
  • RAG backend integration
  • Edge cases like empty documents and extraction failures
  • Concurrent processing behavior
  • Cache mechanisms

The tests are well-structured, clearly documented, and thoroughly validate the new async behavior and retry logic.


837-850: Review comment is incorrect.

The test expectation is valid. web_search_brave is wrapped by the @last_model_retry_soft_fail decorator at line 239 of web_search_brave.py, which converts ModelCannotRetry exceptions into string return values. The test on lines 837-850 correctly expects a string message when the 403 status triggers this exception, consistent with the decorator's behavior.

Likely an incorrect or invalid review comment.

src/backend/chat/tools/__init__.py (1)

32-37: No issues found—configuration is correct.

The function signature confirms that web_search_tavily(query: str) does not accept a context parameter, making the takes_ctx=False configuration accurate and consistent with the implementation.

src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (1)

196-210: Code implementations are correct for their respective libraries.

The observed difference is intentional and required: the sync version uses the requests library (which allows non-file form fields within the files parameter as tuples), while the async version uses httpx (which requires separate data= and files= parameters). The async httpx approach of passing both data and files parameters is the correct pattern for httpx's multipart/form-data handling. Both implementations correctly follow their respective library's API specifications.

@qbey qbey force-pushed the qbey/web-search-tools-retry branch from b575f5c to f7f5f66 Compare November 3, 2025 12:36
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/backend/chat/tools/utils.py (1)

44-47: Fix typo in user-facing message.

Line 46 contains "based on you knowledge" which should be "based on your knowledge".

Apply this diff:

                 return (
                     f"{exc.message} You must explain this to the user and "
-                    "not try to answer based on you knowledge."
+                    "not try to answer based on your knowledge."
                 )
🧹 Nitpick comments (3)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (3)

69-88: Simplify timeout configuration and move response parsing inside context manager.

Two issues:

  1. Redundant timeout: Line 83 specifies timeout again, but it's already configured in the AsyncClient constructor at line 74. Remove the redundant parameter from the request call.

  2. Response parsing outside context: Line 87 calls response.json() after exiting the context manager. While this works with httpx (response is buffered), it's cleaner to parse inside the context manager before the client closes.

Apply this diff:

     async def acreate_collection(self, name: str, description: Optional[str] = None) -> str:
         """
         Create a temporary collection for the search operation.
         This method should handle the logic to create or retrieve an existing collection.
         """
         async with httpx.AsyncClient(timeout=settings.ALBERT_API_TIMEOUT) as client:
             response = await client.post(
                 self._collections_endpoint,
                 headers=self._headers,
                 json={
                     "name": name,
                     "description": description or self._default_collection_description,
                     "visibility": "private",
                 },
-                timeout=settings.ALBERT_API_TIMEOUT,
             )
             response.raise_for_status()
-
-        self.collection_id = str(response.json()["id"])
+            self.collection_id = str(response.json()["id"])
+            
         return self.collection_id

101-111: Remove redundant timeout parameter.

Line 109 specifies timeout parameter, but it's already configured in the AsyncClient constructor at line 105.

Apply this diff:

     async def adelete_collection(self) -> None:
         """
         Asynchronously delete the current collection
         """
         async with httpx.AsyncClient(timeout=settings.ALBERT_API_TIMEOUT) as client:
             response = await client.delete(
                 urljoin(f"{self._collections_endpoint}/", self.collection_id),
                 headers=self._headers,
-                timeout=settings.ALBERT_API_TIMEOUT,
             )
             response.raise_for_status()

253-296: Remove redundant timeout and consider consistency improvements.

Three issues:

  1. Redundant timeout: Line 274 specifies timeout parameter, already configured in the AsyncClient constructor at line 264.

  2. Response parsing outside context: Line 281 calls response.json() after exiting the context manager. While this works with httpx, it's cleaner to parse inside the context manager.

  3. Inconsistent logging: Line 277 adds debug logging not present in the synchronous search method (lines 212-251). Consider adding the same logging to the sync version for consistency.

Apply this diff:

     async def asearch(self, query, results_count: int = 4) -> RAGWebResults:
         """
         Perform an asynchronous search using the Albert API based on the provided query.
 
         Args:
             query (str): The search query.
             results_count (int): The number of results to return.
 
         Returns:
             RAGWebResults: The search results.
         """
         async with httpx.AsyncClient(timeout=settings.ALBERT_API_TIMEOUT) as client:
             response = await client.post(
                 urljoin(self._base_url, self._search_endpoint),
                 headers=self._headers,
                 json={
                     "collections": [int(self.collection_id)],
                     "prompt": query,
                     "score_threshold": 0.6,
                     "k": results_count,  # Number of chunks to return from the search
                 },
-                timeout=settings.ALBERT_API_TIMEOUT,
             )
 
             logger.debug("Search response: %s %s", response.text, response.status_code)
 
             response.raise_for_status()
-
-        searches = Searches(**response.json())
+            searches = Searches(**response.json())
 
-        return RAGWebResults(
-            data=[
-                RAGWebResult(
-                    url=result.chunk.metadata["document_name"],
-                    content=result.chunk.content,
-                    score=result.score,
-                )
-                for result in searches.data
-            ],
-            usage=RAGWebUsage(
-                prompt_tokens=searches.usage.prompt_tokens,
-                completion_tokens=searches.usage.completion_tokens,
-            ),
-        )
+            return RAGWebResults(
+                data=[
+                    RAGWebResult(
+                        url=result.chunk.metadata["document_name"],
+                        content=result.chunk.content,
+                        score=result.score,
+                    )
+                    for result in searches.data
+                ],
+                usage=RAGWebUsage(
+                    prompt_tokens=searches.usage.prompt_tokens,
+                    completion_tokens=searches.usage.completion_tokens,
+                ),
+            )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b575f5c and f7f5f66.

📒 Files selected for processing (9)
  • CHANGELOG.md (1 hunks)
  • src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (5 hunks)
  • src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (5 hunks)
  • src/backend/chat/tests/tools/test_utils.py (1 hunks)
  • src/backend/chat/tests/tools/test_web_search_brave.py (7 hunks)
  • src/backend/chat/tools/__init__.py (1 hunks)
  • src/backend/chat/tools/exceptions.py (1 hunks)
  • src/backend/chat/tools/utils.py (1 hunks)
  • src/backend/chat/tools/web_search_brave.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/backend/chat/tests/tools/test_utils.py
  • CHANGELOG.md
  • src/backend/chat/tools/exceptions.py
  • src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py
  • src/backend/chat/tools/init.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/backend/chat/tools/web_search_brave.py (4)
src/backend/chat/tools/exceptions.py (1)
  • ModelCannotRetry (17-23)
src/backend/chat/tools/utils.py (1)
  • last_model_retry_soft_fail (14-50)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (2)
  • astore_document (187-210)
  • asearch (253-296)
src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (3)
  • astore_document (64-73)
  • temporary_collection_async (131-139)
  • asearch (111-115)
src/backend/chat/tools/utils.py (1)
src/backend/chat/tools/exceptions.py (1)
  • ModelCannotRetry (17-23)
src/backend/chat/tests/tools/test_web_search_brave.py (3)
src/backend/chat/tools/exceptions.py (1)
  • ModelCannotRetry (17-23)
src/backend/chat/tools/web_search_brave.py (8)
  • web_search_brave (234-283)
  • DocumentFetchError (35-36)
  • _extract_and_summarize_snippets_async (104-127)
  • _fetch_and_extract_async (77-101)
  • _fetch_and_store_async (130-142)
  • _query_brave_api_async (145-209)
  • format_tool_return (212-230)
  • web_search_brave_with_document_backend (287-361)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (3)
  • asearch (253-296)
  • astore_document (187-210)
  • store_document (165-185)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (4)
src/backend/chat/agent_rag/document_rag_backends/base_rag_backend.py (3)
  • acreate_collection (30-35)
  • adelete_collection (98-103)
  • astore_document (64-73)
src/backend/chat/agent_rag/document_search/albert_api.py (1)
  • collection_id (56-62)
src/backend/chat/agent_rag/constants.py (3)
  • RAGWebResults (27-41)
  • RAGWebResult (17-24)
  • RAGWebUsage (8-14)
src/backend/chat/agent_rag/albert_api_constants.py (1)
  • Searches (145-150)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: test-e2e-chromium
  • GitHub Check: test-back
  • GitHub Check: build-and-push-frontend
  • GitHub Check: build-and-push-backend
  • GitHub Check: lint-back
🔇 Additional comments (7)
src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py (1)

11-11: LGTM!

The httpx import is necessary for the async HTTP client functionality added in this file.

src/backend/chat/tools/utils.py (1)

14-50: LGTM: Clean retry wrapper implementation.

The decorator elegantly handles retry exhaustion by converting exceptions to user-facing messages, preventing the LLM from generating outdated answers. The use of functools.wraps preserves metadata, and the retry counting logic correctly checks whether the next attempt would exceed the limit.

src/backend/chat/tests/tools/test_web_search_brave.py (1)

1-1096: LGTM: Excellent async test coverage.

The test suite has been thoroughly migrated to async with proper mocking (AsyncMock, respx) and covers all critical paths: success cases with/without snippets, various error conditions (rate limiting, timeouts, client errors, server errors), retry scenarios, RAG integration, caching behavior, and edge cases like empty results and extraction failures. The fixture structure and assertions are well-designed.

src/backend/chat/tools/web_search_brave.py (4)

27-37: LGTM: Clean exception hierarchy.

The custom exception classes (WebSearchError, BraveAPIError, DocumentFetchError) provide clear semantics for different failure modes, making error handling more precise throughout the module.


164-209: LGTM: Excellent retry classification.

The error handling correctly distinguishes transient failures (429 rate limits, 5xx server errors, timeouts, connection errors) that should trigger ModelRetry from permanent failures (4xx client errors, unexpected exceptions) that should trigger ModelCannotRetry. This granular approach prevents futile retries while maximizing resilience.


233-283: LGTM: Robust async implementation with parallel processing.

The function correctly uses asyncio.gather to parallelize snippet extraction for URLs lacking extra_snippets, then validates that at least one result was successfully extracted. The decorator handles retry exhaustion gracefully by converting exceptions to user-facing messages.


286-361: LGTM: Solid RAG integration with proper resource management.

The async context manager ensures temporary collections are cleaned up, and the function correctly aggregates usage tokens from RAG operations. Using return_exceptions=True on line 317 provides extra resilience when storing multiple documents concurrently.

Comment on lines +187 to +210
async def astore_document(self, name: str, content: str) -> None:
"""
Store the document content in the Albert collection.
This method should handle the logic to send the document content to the Albert API.
Args:
name (str): The name of the document.
content (str): The content of the document in Markdown format.
"""
async with httpx.AsyncClient(timeout=settings.ALBERT_API_TIMEOUT) as client:
response = await client.post(
urljoin(self._base_url, self._documents_endpoint),
headers=self._headers,
files={
"file": (f"{name}.md", BytesIO(content.encode("utf-8")), "text/markdown"),
},
data={
"collection": int(self.collection_id),
"metadata": json.dumps({"document_name": name}), # undocumented API
},
timeout=settings.ALBERT_API_TIMEOUT,
)
logger.debug(response.json())
response.raise_for_status()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix error handling order and remove redundant timeout.

Two issues:

  1. Critical - Wrong error handling order: Lines 209-210 call response.json() before raise_for_status(). If the API returns an error status, attempting to parse JSON first may succeed with error details, but the subsequent status check will raise anyway. Check the status before attempting to parse the response.

  2. Redundant timeout: Line 207 specifies timeout again, already configured in the AsyncClient constructor at line 196.

Apply this diff:

     async def astore_document(self, name: str, content: str) -> None:
         """
         Store the document content in the Albert collection.
         This method should handle the logic to send the document content to the Albert API.
 
         Args:
             name (str): The name of the document.
             content (str): The content of the document in Markdown format.
         """
         async with httpx.AsyncClient(timeout=settings.ALBERT_API_TIMEOUT) as client:
             response = await client.post(
                 urljoin(self._base_url, self._documents_endpoint),
                 headers=self._headers,
                 files={
                     "file": (f"{name}.md", BytesIO(content.encode("utf-8")), "text/markdown"),
                 },
                 data={
                     "collection": int(self.collection_id),
                     "metadata": json.dumps({"document_name": name}),  # undocumented API
                 },
-                timeout=settings.ALBERT_API_TIMEOUT,
             )
-            logger.debug(response.json())
             response.raise_for_status()
+            logger.debug(response.json())
🤖 Prompt for AI Agents
In src/backend/chat/agent_rag/document_rag_backends/albert_rag_backend.py around
lines 187 to 210, fix two issues: move response.raise_for_status() to run
immediately after the POST (before any response.json() calls) so HTTP errors are
detected before parsing, and remove the redundant timeout parameter from the
client.post() call (the AsyncClient already has timeout configured); keep
logging the parsed JSON only after raise_for_status() succeeds.

Comment on lines +17 to +21
"""
Wrap a tool function to handle ModelRetry exceptions.
If the tool function raises ModelRetry and the maximum number of retries
has been reached, a ModelCannotRetry exception is raised instead.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Docstring inaccuracy: returns string, not exception.

The docstring states "a ModelCannotRetry exception is raised instead" but the implementation on line 35 returns a string, not an exception. Please update the docstring to accurately reflect the behavior.

Apply this diff:

     """
     Wrap a tool function to handle ModelRetry exceptions.
 
     If the tool function raises ModelRetry and the maximum number of retries
-    has been reached, a ModelCannotRetry exception is raised instead.
+    has been reached, a user-facing message is returned instead.
 
     Args:
         tool_func: The original tool function to wrap.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"""
Wrap a tool function to handle ModelRetry exceptions.
If the tool function raises ModelRetry and the maximum number of retries
has been reached, a ModelCannotRetry exception is raised instead.
"""
Wrap a tool function to handle ModelRetry exceptions.
If the tool function raises ModelRetry and the maximum number of retries
has been reached, a user-facing message is returned instead.
Args:
tool_func: The original tool function to wrap.
🤖 Prompt for AI Agents
In src/backend/chat/tools/utils.py around lines 17 to 21, the docstring
incorrectly states that a ModelCannotRetry exception is raised when max retries
are reached, but the implementation returns a string instead; update the
docstring to accurately describe the behavior by stating that the wrapper
returns an error string (or a descriptive message) when the maximum number of
retries is exceeded rather than raising an exception, and briefly mention the
returned string format/context so callers know to handle a string result.

When the tool call fails, we don't want the user to be blocked
without any clue.

This adds:
 - auto retry when possible
 - nice message preventing the LLM to generate an answer without
   updated information
@qbey qbey force-pushed the qbey/web-search-tools-retry branch from f7f5f66 to e51620a Compare November 5, 2025 13:43
@qbey qbey merged commit e51620a into main Nov 5, 2025
14 of 15 checks passed
@qbey qbey deleted the qbey/web-search-tools-retry branch November 5, 2025 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Empty Message when tool fail

2 participants