@priyansh4320
Collaborator

@priyansh4320 priyansh4320 commented Sep 12, 2025

Why are these changes needed?

This PR simplifies the DocAgent architecture.

  • Added a ThreadPoolExecutor to the tool.
  • Optimized DocAgent.
  • Faster query responses.
  • Concurrent document ingestion.
  • Added a pseudo-supervisor.
  • Improved scalability.
  • Added citation support.
  • Added flexible prompt capability for the inner agent.

This simplifies the DocAgent architecture in preparation for the upcoming PR.
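
For readers unfamiliar with the pattern, the concurrent ingestion mentioned in the list above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: `ingest_document` and `ingest_concurrently` are hypothetical names, and the real ingestion step (Docling parsing plus vector store insertion) is replaced by a stub.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def ingest_document(path: str) -> str:
    """Hypothetical stand-in for one ingestion (parse the file, store the chunks)."""
    if not path.endswith(".pdf"):
        raise ValueError(f"unsupported file type: {path}")
    return f"ingested {path}"


def ingest_concurrently(paths: list[str], max_workers: int = 4) -> dict[str, str]:
    """Submit one ingestion per document to a thread pool and collect each outcome."""
    outcomes: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its path so failures can be reported per document.
        futures = {pool.submit(ingest_document, p): p for p in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                outcomes[path] = future.result()
            except Exception as exc:  # one bad document should not abort the batch
                outcomes[path] = f"failed: {exc}"
    return outcomes


if __name__ == "__main__":
    for path, status in ingest_concurrently(["a.pdf", "b.pdf", "notes.txt"]).items():
        print(f"{path}: {status}")
```

Threads suit this kind of work when ingestion is dominated by I/O (downloads, disk reads, database writes); whether that holds for the actual Docling pipeline is an assumption here.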

Old Architecture

[Diagram: old DocAgent architecture (Mermaid chart, image not shown)]

New Architecture

[Diagram: new DocAgent architecture (Mermaid chart, image not shown)]

Related issue number

#2098

Sample Runs

import os

from dotenv import load_dotenv

from autogen.agents.experimental.document_agent import DocAgent
from autogen.llm_config import LLMConfig

load_dotenv()

llm_config = LLMConfig({
    "api_type": "openai",
    "model": "gpt-5-nano",
    "api_key": os.getenv("OPENAI_API_KEY"),
})

# ============================================================================
# 1. BASIC DOCUMENT INGESTION TESTS
# ============================================================================


def test_basic_pdf_ingestion():
    """Test: Basic PDF document ingestion
    Expected Flow: TriageAgent → TaskManager → DoclingDocIngestAgent
    Expected Result: Document parsed with Docling to markdown, stored in vector database
    Summary shows: "Ingestions: 1. document.pdf"
    """
    print("\n" + "=" * 80)
    print("TEST 1: Basic PDF Document Ingestion")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_basic_ingestion")

    res = doc_agent.run(
        message="Please ingest this PDF file: test/agentchat/contrib/graph_rag/Toast_financial_report.pdf",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# 2. BASIC QUERY-ONLY REQUESTS TESTS
# ============================================================================


def test_query_without_documents():
    """Test: Query without any ingested documents
    Expected Flow: TriageAgent identifies no ingestions needed → TaskManager → QueryAgent
    Expected Result: Empty response: "Sorry, please ingest some documents/URLs before querying"
    """
    print("\n" + "=" * 80)
    print("TEST 3: Query Without Documents")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_query_only")

    res = doc_agent.run(
        message="what is the fiscal year 2024 financial summary?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


def test_query_after_ingestion():
    """Test: Query after documents have been ingested
    Expected Flow: TaskManager → QueryAgent → execute_rag_query
    Expected Result: Vector database queried, answer with citations returned
    Summary shows query and detailed answer
    """
    print("\n" + "=" * 80)
    print("TEST 4: Query After Document Ingestion")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_query_after_ingestion")

    # First ingest a document
    ingest_res = doc_agent.run(
        message="Please ingest this PDF file: test/agentchat/contrib/graph_rag/Toast_financial_report.pdf",
        max_turns=1,
        llm_config=llm_config,
    )
    ingest_res.process()

    # Then query it
    query_res = doc_agent.run(
        message="What are the key findings?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = query_res.process()
    return result


# ============================================================================
# 3. COMBINED INGESTION + QUERY REQUESTS TESTS
# ============================================================================


def test_combined_ingestion_and_query():
    """Test: Combined ingestion and query in single request
    Expected Flow: TriageAgent creates DocumentTask with ingestion + query
    TaskManager → DoclingDocIngestAgent (PDF processing) → QueryAgent (conclusion answers)
    Expected Result: Summary shows both ingestion and query results with citations
    """
    print("\n" + "=" * 80)
    print("TEST 5: Combined Ingestion and Query")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_combined")

    res = doc_agent.run(
        message="Please analyze this report test/agentchat/contrib/graph_rag/Toast_financial_report.pdf and tell me the main conclusions",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# 4. URL INGESTION QUERIES TESTS
# ============================================================================


def test_pdf_url_ingestion_and_summary():
    """Test: Ingest PDF from URL and summarize
    Expected Flow: PDF downloaded directly, parsed with Docling, query answered from ingested content
    Expected Result: Summary of abstract returned
    """
    print("\n" + "=" * 80)
    print("TEST 8: PDF URL Ingestion and Summary")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_pdf_url")

    res = doc_agent.run(
        message="Ingest https://arxiv.org/pdf/1706.03762 and summarize the abstract",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result



# ============================================================================
# 5. CITATION-ENABLED QUERIES TESTS
# ============================================================================


def test_citation_enabled_query():
    """Test: Query with citations enabled
    Expected Flow: Query processed with citations, answer includes source chunks and file paths
    Expected Result: Summary formats citations as: "Source [X] (chunk file_path):\n\nChunk X:\n(text_chunk)"
    """
    print("\n" + "=" * 80)
    print("TEST 10: Citation-Enabled Query")
    print("=" * 80)

    # Create DocAgent with citation support enabled
    doc_agent = DocAgent(
        llm_config=llm_config,
        collection_name="test_citations",
        enable_citations=True,  # Enable citations
        citation_chunk_size=512,  # Optional: customize citation chunk size
    )

    # First ingest a document
    ingest_res = doc_agent.run(
        message="Please ingest this PDF file: test/agentchat/contrib/graph_rag/research_paper.pdf",
        max_turns=1,
        llm_config=llm_config,
    )
    ingest_res.process()

    # Then query with citations
    query_res = doc_agent.run(
        message="What are the key statistics mentioned?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = query_res.process()
    return result


# ============================================================================
# 6. ERROR SCENARIOS TESTS
# ============================================================================


def test_nonexistent_file_error():
    """Test: Error handling for nonexistent files
    Expected Flow: DoclingDocIngestAgent fails, ErrorManagerAgent activated
    Expected Result: Error message: "Data Ingestion Task Failed, Error [specific error]: '/nonexistent/file.pdf'"
    """
    print("\n" + "=" * 80)
    print("TEST 11: Nonexistent File Error")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_error_handling")

    res = doc_agent.run(
        message="Ingest /nonexistent/file.pdf",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


def test_corrupt_file_error():
    """Test: Error handling for corrupt files
    Expected Flow: Parsing fails during Docling processing, graceful error handling
    Expected Result: User informed of parsing failure
    """
    print("\n" + "=" * 80)
    print("TEST 12: Corrupt File Error")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_corrupt_file")

    res = doc_agent.run(
        message="Analyze corrupt_file.pdf",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# 7. COMPLEX ANALYSIS QUERIES TESTS
# ============================================================================


def test_complex_analysis_query():
    """Test: Complex analysis across multiple documents
    Expected flow: Batch ingestion of multiple PDFs, complex RAG query processing
    Synthesis across multiple documents, comprehensive summary with citations from multiple sources
    """
    print("\n" + "=" * 80)
    print("TEST 14: Complex Analysis Query")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="test_citations")

    res = doc_agent.run(
        message="Who are the authors of research_paper.pdf? What is the abstract? What techniques are used? What are the results? What problems are solved?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# 8. ORIGINAL TEST (for reference)
# ============================================================================


def test_original_example():
    """Original test example for reference"""
    print("\n" + "=" * 80)
    print("ORIGINAL TEST: Combined Ingestion and Query")
    print("=" * 80)

    doc_agent = DocAgent(llm_config=llm_config, collection_name="toast_report123")

    res = doc_agent.run(
        message="ingest test/agentchat/contrib/graph_rag/Toast_financial_report.pdf and tell me the fiscal year 2024 financial summary?",
        max_turns=1,
        llm_config=llm_config,
    )

    result = res.process()
    return result


# ============================================================================
# MAIN EXECUTION
# ============================================================================


if __name__ == "__main__":
    print("Starting DocAgent Comprehensive Tests")
    print("=" * 80)

    # Run all tests
    tests = [
        test_basic_pdf_ingestion,
        test_query_without_documents,
        test_query_after_ingestion,
        test_combined_ingestion_and_query,
        test_pdf_url_ingestion_and_summary,
        test_citation_enabled_query,
        test_nonexistent_file_error,
        test_corrupt_file_error,
        test_complex_analysis_query,
        test_original_example,
    ]

    results = {}
    for test_func in tests:
        # PASSED only means the call returned without raising; the output is not validated.
        try:
            print(f"\nRunning {test_func.__name__}...")
            test_func()
            results[test_func.__name__] = "PASSED"
        except Exception as e:
            print(f"ERROR in {test_func.__name__}: {e}")
            results[test_func.__name__] = f"FAILED: {e}"

    # Print summary
    print("\n" + "=" * 80)
    print("TEST SUMMARY")
    print("=" * 80)
    for test_name, status in results.items():
        print(f"{test_name}: {status}")

    print(f"\nTotal tests: {len(tests)}")
    print(f"Passed: {sum(1 for status in results.values() if status == 'PASSED')}")
    print(f"Failed: {sum(1 for status in results.values() if status != 'PASSED')}")
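
The citation layout quoted in the `test_citation_enabled_query` docstring can be illustrated with a small formatter. This is only a sketch of that output shape: `format_citations` is a hypothetical helper, and the `file_path` and `text_chunk` dictionary keys are assumed names, not the data structure DocAgent actually uses.

```python
def format_citations(citations: list[dict[str, str]]) -> str:
    """Render citations as 'Source [X] (file_path):' followed by 'Chunk X:' and the text."""
    blocks = []
    for index, citation in enumerate(citations, start=1):
        blocks.append(
            f"Source [{index}] ({citation['file_path']}):\n\n"
            f"Chunk {index}:\n{citation['text_chunk']}"
        )
    # Separate consecutive sources with a blank line.
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(format_citations([
        {"file_path": "report.pdf", "text_chunk": "First retrieved passage."},
        {"file_path": "report.pdf", "text_chunk": "Second retrieved passage."},
    ]))
```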

Checks

@priyansh4320 priyansh4320 self-assigned this Sep 13, 2025
@priyansh4320 priyansh4320 changed the title update: Simplify DocAgent Architechture update: Optimise DocAgent Architechture Sep 13, 2025
@priyansh4320 priyansh4320 changed the title update: Optimise DocAgent Architechture update: Optimize DocAgent Architechture Sep 13, 2025
@priyansh4320 priyansh4320 changed the title update: Optimize DocAgent Architechture update: Optimize DocAgent Architecture Sep 13, 2025
@priyansh4320 priyansh4320 marked this pull request as ready for review September 13, 2025 08:57
@priyansh4320 priyansh4320 marked this pull request as draft September 13, 2025 11:35
@priyansh4320 priyansh4320 marked this pull request as ready for review September 13, 2025 18:11
@priyansh4320 priyansh4320 added the enhancement New feature or request label Sep 13, 2025
@priyansh4320 priyansh4320 marked this pull request as draft September 15, 2025 17:26
@priyansh4320 priyansh4320 marked this pull request as ready for review September 18, 2025 17:50
@priyansh4320 priyansh4320 force-pushed the simplify-docagent branch 2 times, most recently from 26362dd to cab7dd4 Compare September 20, 2025 02:44
@priyansh4320
Collaborator Author

@marklysze I have added sample test runs to the description, as requested.
I have also added the flexible system prompt capability I suggested.

@priyansh4320 priyansh4320 force-pushed the simplify-docagent branch 2 times, most recently from 34cb54a to 3f90d1c Compare September 25, 2025 23:47
@marklysze
Collaborator

@claude can you review this PR please

@claude

claude bot commented Oct 15, 2025

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

1 similar comment
@claude

claude bot commented Oct 16, 2025

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

@marklysze
Collaborator

@claude can you review this PR please

@claude

claude bot commented Oct 21, 2025

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

@priyansh4320 priyansh4320 marked this pull request as draft October 22, 2025 17:17
@codecov

codecov bot commented Oct 28, 2025

Codecov Report

❌ Patch coverage is 12.64368% with 228 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...agents/experimental/document_agent/task_manager.py | 12.75% | 171 Missing ⚠️ |
| ...ents/experimental/document_agent/document_agent.py | 12.30% | 57 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| ...ents/experimental/document_agent/document_agent.py | 26.95% <12.30%> (+5.87%) ⬆️ |
| ...agents/experimental/document_agent/task_manager.py | 12.75% <12.75%> (ø) |

... and 20 files with indirect coverage changes

Labels

enhancement New feature or request review required

2 participants