update: Optimize DocAgent Architecture #2097

priyansh4320 · 2025-09-12T21:57:19Z

Why are these changes needed?

This PR simplifies the DocAgent architecture.

Added a ThreadPoolExecutor to the tool.
Optimized DocAgent for:
Faster query responses.
Concurrent document ingestion.
Added a pseudo supervisor.
Improved scalability.
Added citation support.
Added flexible system prompts for the inner agents.
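
As a rough illustration of the concurrent ingestion idea listed above, the sketch below fans documents out to a `ThreadPoolExecutor` and collects a per-document status. It is not the code from this PR: `ingest_document` and `ingest_concurrently` are hypothetical names, and the sleep stands in for the real parsing and indexing work.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def ingest_document(path: str) -> str:
    """Hypothetical stand-in for parsing and indexing a single document."""
    time.sleep(0.01)  # simulate I/O-bound work such as parsing or embedding
    if not path.endswith(".pdf"):
        raise ValueError(f"unsupported file type: {path}")
    return f"ingested {path}"


def ingest_concurrently(paths: list[str], max_workers: int = 4) -> dict[str, str]:
    """Ingest documents in parallel and record a status string per document."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its path so failures can be attributed.
        futures = {pool.submit(ingest_document, p): p for p in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # one failure should not abort the batch
                results[path] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    statuses = ingest_concurrently(["a.pdf", "b.pdf", "notes.txt"])
    for path, status in sorted(statuses.items()):
        print(f"{path}: {status}")
```

Catching exceptions per future keeps one bad file from hiding the results of the others, which matches the error scenarios exercised in the sample runs.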

This simplification prepares the DocAgent architecture for an upcoming PR.

Old Architecture

[Diagram: old DocAgent architecture (Mermaid chart, 2025-09-13)]

New Architecture

[Diagram: new DocAgent architecture (Mermaid chart, 2025-09-12)]

Related issue number

#2098

Sample Runs

import os

from dotenv import load_dotenv

from autogen.agents.experimental.document_agent import DocAgent
from autogen.llm_config import LLMConfig

load_dotenv()

llm_config = LLMConfig({
    "api_type": "openai",
    "model": "gpt-5-nano",
    "api_key": os.getenv("OPENAI_API_KEY"),
})

# ============================================================================
# 1. BASIC DOCUMENT INGESTION TESTS
# ============================================================================


def test_basic_pdf_ingestion():
    """Test: Basic PDF document ingestion
    Expected Flow: TriageAgent → TaskManager → DoclingDocIngestAgent
    Expected Result: Document parsed with Docling to markdown, stored in vector database
    Summary shows: "Ingestions: 1. document.pdf"
    """
    print("\n" + "=" * 80)
    print("TEST 1: Basic PDF Document Ingestion")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_basic_ingestion")

    res = doc_agent.run(
        message="Please ingest this PDF file: test/agentchat/contrib/graph_rag/Toast_financial_report.pdf",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# 2. BASIC QUERY-ONLY REQUESTS TESTS
# ============================================================================


def test_query_without_documents():
    """Test: Query without any ingested documents
    Expected Flow: TriageAgent identifies no ingestions needed → TaskManager → QueryAgent
    Expected Result: Empty response: "Sorry, please ingest some documents/URLs before querying"
    """
    print("\n" + "=" * 80)
    print("TEST 3: Query Without Documents")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_query_only")

    res = doc_agent.run(
        message="what is the fiscal year 2024 financial summary?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


def test_query_after_ingestion():
    """Test: Query after documents have been ingested
    Expected Flow: TaskManager → QueryAgent → execute_rag_query
    Expected Result: Vector database queried, answer with citations returned
    Summary shows query and detailed answer
    """
    print("\n" + "=" * 80)
    print("TEST 4: Query After Document Ingestion")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_query_after_ingestion")

    # First ingest a document
    ingest_res = doc_agent.run(
        message="Please ingest this PDF file: test/agentchat/contrib/graph_rag/Toast_financial_report.pdf",
        max_turns=1,
        llm_config=llm_config,
    )
    ingest_res.process()

    # Then query it
    query_res = doc_agent.run(
        message="What are the key findings?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = query_res.process()
    return result


# ============================================================================
# 3. COMBINED INGESTION + QUERY REQUESTS TESTS
# ============================================================================


def test_combined_ingestion_and_query():
    """Test: Combined ingestion and query in single request
    Expected Flow: TriageAgent creates DocumentTask with ingestion + query
    TaskManager → DoclingDocIngestAgent (PDF processing) → QueryAgent (conclusion answers)
    Expected Result: Summary shows both ingestion and query results with citations
    """
    print("\n" + "=" * 80)
    print("TEST 5: Combined Ingestion and Query")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_combined")

    res = doc_agent.run(
        message="Please analyze this report test/agentchat/contrib/graph_rag/Toast_financial_report.pdf and tell me the main conclusions",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# 4. URL INGESTION QUERIES TESTS
# ============================================================================


def test_pdf_url_ingestion_and_summary():
    """Test: Ingest PDF from URL and summarize
    Expected Flow: PDF downloaded directly, parsed with Docling, query answered from ingested content
    Expected Result: Summary of abstract returned
    """
    print("\n" + "=" * 80)
    print("TEST 8: PDF URL Ingestion and Summary")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_pdf_url")

    res = doc_agent.run(
        message="Ingest https://arxiv.org/pdf/1706.03762 and summarize the abstract",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result



# ============================================================================
# 5. CITATION-ENABLED QUERIES TESTS
# ============================================================================


def test_citation_enabled_query():
    """Test: Query with citations enabled
    Expected Flow: Query processed with citations, answer includes source chunks and file paths
    Expected Result: Summary formats citations as: "Source [X] (chunk file_path):\n\nChunk X:\n(text_chunk)"
    """
    print("\n" + "=" * 80)
    print("TEST 10: Citation-Enabled Query")
    print("=" * 80)

    # Create DocAgent with citation support enabled
    doc_agent = DocAgent(
        llm_config=llm_config,
        collection_name="test_citations",
        enable_citations=True,  # Enable citations
        citation_chunk_size=512,  # Optional: customize citation chunk size
    )

    # First ingest a document
    ingest_res = doc_agent.run(
        message="Please ingest this PDF file: test/agentchat/contrib/graph_rag/research_paper.pdf",
        max_turns=1,
        llm_config=llm_config,
    )
    ingest_res.process()

    # Then query with citations
    query_res = doc_agent.run(
        message="What are the key statistics mentioned?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = query_res.process()
    return result


# ============================================================================
# 6. ERROR SCENARIOS TESTS
# ============================================================================


def test_nonexistent_file_error():
    """Test: Error handling for nonexistent files
    Expected Flow: DoclingDocIngestAgent fails, ErrorManagerAgent activated
    Expected Result: Error message: "Data Ingestion Task Failed, Error [specific error]: '/nonexistent/file.pdf'"
    """
    print("\n" + "=" * 80)
    print("TEST 11: Nonexistent File Error")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_error_handling")

    res = doc_agent.run(
        message="Ingest /nonexistent/file.pdf",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


def test_corrupt_file_error():
    """Test: Error handling for corrupt files
    Expected Flow: Parsing fails during Docling processing, graceful error handling
    Expected Result: User informed of parsing failure
    """
    print("\n" + "=" * 80)
    print("TEST 12: Corrupt File Error")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_corrupt_file")

    res = doc_agent.run(
        message="Analyze corrupt_file.pdf",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# 7. COMPLEX ANALYSIS QUERIES TESTS
# ============================================================================


def test_complex_analysis_query():
    """Test: Complex analysis across multiple documents
    Expected flow: Batch ingestion of multiple PDFs, complex RAG query processing
    Synthesis across multiple documents, comprehensive summary with citations from multiple sources
    """
    print("\n" + "=" * 80)
    print("TEST 14: Complex Analysis Query")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_citations")

    res = doc_agent.run(
        message="Who are the authors of research_paper.pdf? What is the abstract? What techniques are used? What are the results? What problems are solved?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# 8. ORIGINAL TEST (for reference)
# ============================================================================


def test_original_example():
    """Original test example for reference"""
    print("\n" + "=" * 80)
    print("ORIGINAL TEST: Combined Ingestion and Query")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="toast_report123")

    res = doc_agent.run(
        message="ingest test/agentchat/contrib/graph_rag/Toast_financial_report.pdf and tell me the fiscal year 2024 financial summary?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# MAIN EXECUTION
# ============================================================================


if __name__ == "__main__":
    print("Starting DocAgent Comprehensive Tests")
    print("=" * 80)

    # Run all tests
    tests = [
        test_basic_pdf_ingestion,
        test_query_without_documents,
        test_query_after_ingestion,
        test_combined_ingestion_and_query,
        test_pdf_url_ingestion_and_summary,
        test_citation_enabled_query,
        test_nonexistent_file_error,
        test_corrupt_file_error,
        test_complex_analysis_query,
        test_original_example,
    ]

    results = {}
    for test_func in tests:
        try:
            print(f"\nRunning {test_func.__name__}...")
            result = test_func()
            results[test_func.__name__] = "PASSED"
        except Exception as e:
            print(f"ERROR in {test_func.__name__}: {e}")
            results[test_func.__name__] = f"FAILED: {e}"

    # Print summary
    print("\n" + "=" * 80)
    print("TEST SUMMARY")
    print("=" * 80)
    for test_name, status in results.items():
        print(f"{test_name}: {status}")

    print(f"\nTotal tests: {len(tests)}")
    print(f"Passed: {sum(1 for status in results.values() if status == 'PASSED')}")
    print(f"Failed: {sum(1 for status in results.values() if status != 'PASSED')}")
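
The citation layout quoted in the `test_citation_enabled_query` docstring ("Source [X] (chunk file_path): ... Chunk X: ...") can be sketched as a small formatter. This is only an illustration of that string layout: the `Citation` dataclass and `format_citations` function are hypothetical and are not taken from the PR's implementation.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Hypothetical container for one retrieved chunk (not the PR's type)."""

    file_path: str
    text_chunk: str


def format_citations(citations: list[Citation]) -> str:
    """Render citations as 'Source [X] (file_path):' followed by 'Chunk X:'."""
    blocks = []
    for index, citation in enumerate(citations, start=1):
        blocks.append(
            f"Source [{index}] ({citation.file_path}):\n\n"
            f"Chunk {index}:\n{citation.text_chunk}"
        )
    # Separate consecutive sources with a blank line.
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(format_citations([Citation("report.pdf", "Revenue grew year over year.")]))
```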

Checks

I've included any doc changes needed for https://docs.ag2.ai/. See https://docs.ag2.ai/latest/docs/contributor-guide/documentation/ to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

priyansh4320 · 2025-09-20T17:11:12Z

@marklysze I have added sample test runs to the description, as requested.
I have also added the flexible system prompt capability that I had suggested.

marklysze · 2025-10-15T19:07:42Z

@claude can you review this PR please

claude · 2025-10-15T19:07:58Z

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

claude · 2025-10-16T07:10:45Z

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

marklysze · 2025-10-21T20:00:30Z

@claude can you review this PR please

claude · 2025-10-21T20:00:58Z

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

codecov · 2025-10-28T12:27:18Z

Codecov Report

❌ Patch coverage is 12.64368% with 228 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...agents/experimental/document_agent/task_manager.py	12.75%	171 Missing ⚠️
...ents/experimental/document_agent/document_agent.py	12.30%	57 Missing ⚠️

Files with missing lines	Coverage Δ
...ents/experimental/document_agent/document_agent.py	`26.95% <12.30%> (+5.87%)`	⬆️
...agents/experimental/document_agent/task_manager.py	`12.75% <12.75%> (ø)`

... and 20 files with indirect coverage changes
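
The figures in the report are mutually consistent, which a few lines of arithmetic can confirm. The per-file totals below (196 and 65 changed lines) are not stated by Codecov; they are inferred here from the reported missing counts and percentages, so treat them as a derived assumption.

```python
from math import isclose

# Missing-line counts and patch percentages as reported by Codecov.
reported = {
    "task_manager.py": {"missing": 171, "patch_pct": 12.75},
    "document_agent.py": {"missing": 57, "patch_pct": 12.30},
}


def infer_total(missing: int, patch_pct: float) -> int:
    """Infer the changed-line total from missing lines and patch coverage."""
    return round(missing / (1 - patch_pct / 100))


# Inferred totals per file (an assumption derived from the report, not quoted from it).
totals = {name: infer_total(**row) for name, row in reported.items()}

total_lines = sum(totals.values())
total_missing = sum(row["missing"] for row in reported.values())
overall_pct = 100 * (total_lines - total_missing) / total_lines

print(f"{total_lines} changed lines, {total_missing} missing, {overall_pct:.5f}% patch coverage")
# prints: 261 changed lines, 228 missing, 12.64368% patch coverage
```

In other words, roughly 33 of 261 changed lines are exercised by tests, with most of the gap in `task_manager.py`.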


priyansh4320 self-assigned this Sep 13, 2025

priyansh4320 changed the title ~~update: Simplify DocAgent Architechture~~ update: Optimise DocAgent Architechture Sep 13, 2025

priyansh4320 changed the title ~~update: Optimise DocAgent Architechture~~ update: Optimize DocAgent Architechture Sep 13, 2025

priyansh4320 changed the title ~~update: Optimize DocAgent Architechture~~ update: Optimize DocAgent Architecture Sep 13, 2025

priyansh4320 marked this pull request as ready for review September 13, 2025 08:57

priyansh4320 marked this pull request as draft September 13, 2025 11:35

priyansh4320 marked this pull request as ready for review September 13, 2025 18:11

priyansh4320 requested a review from qingyun-wu September 13, 2025 18:17

priyansh4320 force-pushed the simplify-docagent branch from e6a02b4 to d646be4 Compare September 13, 2025 18:31

priyansh4320 added the enhancement New feature or request label Sep 13, 2025

priyansh4320 marked this pull request as draft September 15, 2025 17:26

priyansh4320 force-pushed the simplify-docagent branch from 12f6b06 to d646be4 Compare September 17, 2025 22:12

priyansh4320 mentioned this pull request Sep 17, 2025

feat: [DocAgent] Add Dynamic RAG #2105

Draft

3 tasks

priyansh4320 marked this pull request as ready for review September 18, 2025 17:50

priyansh4320 requested a review from marklysze September 18, 2025 20:06

priyansh4320 force-pushed the simplify-docagent branch 2 times, most recently from 26362dd to cab7dd4 Compare September 20, 2025 02:44

priyansh4320 added the review required label Sep 20, 2025

priyansh4320 force-pushed the simplify-docagent branch 2 times, most recently from 34cb54a to 3f90d1c Compare September 25, 2025 23:47

priyansh4320 and others added 9 commits October 4, 2025 04:52

update: Simplify DocAgent Architechture

f74e0d6

feat: achieve expected behavior with async tools

c0ddc1e

feat: add concurrent Threadpool

348b613

Test: DocAgent and TaskManager

9f9cc76

feat: add citation support

bb555ef

feat:[inner agents] Flexible System Prompts

ca49f9d

fix: mypy

68ac0e6

Feat: Update DocAgent Documentation

48a9c66

fix: deps tree [rag,dev]

ab45f11

pre-commit

c463090

priyansh4320 force-pushed the simplify-docagent branch from 3f90d1c to c463090 Compare October 3, 2025 23:23

fix: conftest mypy

9a7cd4d

priyansh4320 linked an issue Oct 16, 2025 that may be closed by this pull request

[Issue]: optimize existing DocAgent Architecture #2098

Open

This was referenced Oct 16, 2025

[Bug]: Claude review wont't work on forked branches #2148

Closed

fix: claude code review for forked branches #2149

Merged

Merge branch 'main' into simplify-docagent

eebd7be

Merge branch 'main' into simplify-docagent

c0a9512

priyansh4320 marked this pull request as draft October 22, 2025 17:17

priyansh4320 and others added 3 commits October 23, 2025 00:30

fix: memory cleanups

b23e83f

fixed: update path validation, improve fallback logic

def396b

Merge branch 'main' into simplify-docagent

d04457f

priyansh4320 linked an issue Oct 24, 2025 that may be closed by this pull request

[Feature Request]: DocAgent: Improve Detail Retention Capabilities #2021

Open

Merge branch 'main' into simplify-docagent

2f23b13

This was linked to issues Oct 28, 2025

[Bug]: Docagent PDF ingestion fails - character issue #1731

Open

[Issue]: unable to get DocAgent working with azure openai endpoints #2174

Open
