refactor(rag): remove unused legacy RAG code after refactoring #830

kissghosts · 2026-01-05T03:17:09Z

Remove deprecated RAG code that is no longer used after the chat RAG refactoring:

- Delete `process_rag_if_needed` and `extract_knowledge_base_ids` from processor.py
- Delete the entire `rag_integration.py` file (`retrieve_and_assemble_rag_prompt`)
- Update module exports to remove the deleted functions

RAG retrieval is now handled dynamically by KnowledgeBaseTool in preprocessing/contexts.py, making these legacy functions obsolete.

Summary by CodeRabbit

Refactor
- Reorganized RAG (Retrieval Augmented Generation) functionality to be handled dynamically by the KnowledgeBaseTool component, improving system modularity and separation of concerns.



coderabbitai · 2026-01-05T03:17:19Z

Walkthrough

The pull request consolidates RAG processing by removing direct retrieval functions from the chat shell tools and restructuring the RAG service to delegate actual retrieval to KnowledgeBaseTool. Functions handling knowledge base retrieval and ID extraction are deleted or replaced with metadata-only extraction.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Chat Shell Tools Cleanup: `backend/app/chat_shell/tools/__init__.py`, `backend/app/chat_shell/tools/rag_integration.py` | Removed `retrieve_and_assemble_rag_prompt` from public exports in `__init__.py` and deleted the entire `rag_integration.py` module, eliminating direct RAG retrieval logic that included KB validation, node retrieval, deduplication, context assembly, token limiting, and prompt generation. |
| RAG Service Restructuring: `backend/app/services/chat/rag/__init__.py`, `backend/app/services/chat/rag/processor.py` | Removed `process_rag_if_needed` and `extract_knowledge_base_ids` functions; updated `process_context_and_rag` to return only metadata for storage when tool-based RAG is enabled, delegating actual retrieval to KnowledgeBaseTool. Updated module docstring to reflect dynamic RAG handling. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Chat knowledge integration #649: Adds a similarly named retrieve_and_assemble_rag_prompt implementation in services/chat/rag_integration.py, directly related to the removal of the same function from the chat shell tools in this PR.

Poem

🐰 Behold the RAG refactor dance,
Where tools release their RAG advance,
To KnowledgeBaseTool's eager care—
Metadata floats; retrieval's there!
Less scattered, more intent, hooray! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: removing legacy RAG code after refactoring to use KnowledgeBaseTool instead. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

backend/app/services/chat/rag/processor.py (1)

29-47: Update the docstring to match the actual implementation.

The docstring mentions an enable_deep_thinking parameter that doesn't exist in the function signature and describes behavior ("Performs full RAG retrieval and prompt assembly") that is no longer implemented. The current implementation only extracts context metadata when contexts and should_trigger_ai are both truthy, and always returns None for the rag_prompt.

🔎 Proposed fix for the docstring

     """
-    Process context metadata and RAG based on chat version.
+    Extract context metadata for tool-based RAG.
 
-    This function handles RAG processing differently based on enable_deep_thinking:
-    - enable_deep_thinking=True: Only extracts context metadata for tool-based RAG
-    - enable_deep_thinking=False: Performs full RAG retrieval and prompt assembly
-
-    For tool-enabled mode, KnowledgeBaseTool will handle retrieval dynamically.
+    When contexts are provided and AI should be triggered, this function extracts
+    context metadata for storage. KnowledgeBaseTool handles actual RAG retrieval
+    dynamically during tool execution.
 
     Args:
         message: Original user message
         contexts: List of context objects
-        should_trigger_ai: Whether AI should be triggered
+        should_trigger_ai: Whether AI should be triggered (tool-enabled mode)
         user_id: User ID
         db: Database session
 
     Returns:
-        Tuple of (context_metadata dict, rag_prompt string or None)
+        Tuple of (context_metadata dict or None, None). The rag_prompt is always None
+        as RAG retrieval is delegated to KnowledgeBaseTool.
     """
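The behavior described in the proposed docstring can be sketched as follows. This is a minimal illustration, not the repository's implementation: `extract_context_metadata`, the `ContextMetadata` alias, the parameter types, and the shape of the returned dict are all assumptions made for the example. Only the function name, the parameter names, and the "metadata or None, rag_prompt always None" contract come from the review comment.

```python
from typing import Any, Optional

# Assumed shape for stored metadata; the real structure is not shown in the PR.
ContextMetadata = dict[str, Any]


def extract_context_metadata(contexts: list[dict[str, Any]]) -> ContextMetadata:
    """Hypothetical helper: keep only the fields worth storing."""
    return {
        "contexts": [
            {"id": ctx.get("id"), "type": ctx.get("type")} for ctx in contexts
        ]
    }


def process_context_and_rag(
    message: str,
    contexts: Optional[list[dict[str, Any]]],
    should_trigger_ai: bool,
    user_id: int,  # kept to mirror the documented signature; unused here
    db: Any,  # kept to mirror the documented signature; unused here
) -> tuple[Optional[ContextMetadata], None]:
    """Return (context_metadata or None, None); no retrieval happens here."""
    if contexts and should_trigger_ai:
        # Metadata is extracted for storage only. Retrieval is left to the
        # tool layer, so the second element (rag_prompt) is always None.
        return extract_context_metadata(contexts), None
    return None, None
```

With this shape, a caller can unpack the tuple as before but should no longer expect a prompt string in the second position.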

🧹 Nitpick comments (1)

backend/app/services/chat/rag/processor.py (1)

22-28: Remove unused parameters user_id and db from the function signature and call site.

The function body never references these parameters. Update both the function definition in backend/app/services/chat/rag/processor.py and the call site in backend/app/api/ws/chat_namespace.py:448 to remove them.
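One way to confirm that parameters such as `user_id` and `db` are never referenced is a small check built on the standard-library `ast` module. The sketch below is an illustration under assumptions: `unused_parameters` is a hypothetical helper, and `SAMPLE` is a stand-in function body written for this example, not the code in processor.py.

```python
import ast


def unused_parameters(source: str, func_name: str) -> list[str]:
    """List parameters of func_name that its body never reads by name."""
    for node in ast.walk(ast.parse(source)):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == func_name:
            args = node.args
            params = [a.arg for a in args.posonlyargs + args.args + args.kwonlyargs]
            # Collect every bare name that appears anywhere in the body.
            used = {
                n.id
                for stmt in node.body
                for n in ast.walk(stmt)
                if isinstance(n, ast.Name)
            }
            return [p for p in params if p not in used]
    raise ValueError(f"function {func_name!r} not found")


# Stand-in source for illustration only.
SAMPLE = '''
def process_context_and_rag(message, contexts, should_trigger_ai, user_id, db):
    if contexts and should_trigger_ai:
        return {"message_length": len(message), "count": len(contexts)}, None
    return None, None
'''

print(unused_parameters(SAMPLE, "process_context_and_rag"))
```

A check like this only sees direct name references, so a parameter consumed through `locals()` or similar indirection would be reported as unused.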

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bb52e6 and 3444a82.

📒 Files selected for processing (4)

backend/app/chat_shell/tools/__init__.py
backend/app/chat_shell/tools/rag_integration.py
backend/app/services/chat/rag/__init__.py
backend/app/services/chat/rag/processor.py

💤 Files with no reviewable changes (2)

backend/app/chat_shell/tools/__init__.py
backend/app/chat_shell/tools/rag_integration.py

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{py,ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{py,ts,tsx,js,jsx}: All code comments MUST be written in English
File size MUST NOT exceed 1000 lines - split into multiple sub-modules if exceeded
Function length SHOULD NOT exceed 50 lines (preferred)

Files:

backend/app/services/chat/rag/__init__.py
backend/app/services/chat/rag/processor.py

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code MUST follow PEP 8, use Black formatter with line length 88, and isort for imports
Python code MUST include type hints
Python functions and classes MUST have descriptive names and docstrings for public functions/classes
Python MUST extract magic numbers to named constants

Files:

backend/app/services/chat/rag/__init__.py
backend/app/services/chat/rag/processor.py

backend/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

backend/**/*.py: Python backend module imports MUST use uv run prefix when executing commands
Task and Workspace resources MUST use TaskResource model from app.models.task, not the Kind model
Ghost, Model, Shell, Bot, Team, and Skill CRDs MUST use Kind model from app.models.kind

Files:

backend/app/services/chat/rag/__init__.py
backend/app/services/chat/rag/processor.py

🧬 Code graph analysis (1)

backend/app/services/chat/rag/__init__.py (1)

backend/app/services/chat/rag/processor.py (1)

process_context_and_rag (22-65)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: E2E Tests (Shard 1/3)
GitHub Check: E2E Tests (Shard 2/3)
GitHub Check: E2E Tests (Shard 3/3)
GitHub Check: Test Frontend
GitHub Check: Test wegent CLI Integration

🔇 Additional comments (1)

backend/app/services/chat/rag/__init__.py (1)

1-18: LGTM! Public API correctly narrowed.

The module now correctly exposes only process_context_and_rag, and the docstring accurately reflects that RAG retrieval is delegated to KnowledgeBaseTool while this module handles metadata extraction.
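A narrowed public API like this can be guarded with a small regression check. The sketch below is an assumption-laden example: `leaked_names` is a hypothetical helper, and `narrowed` is a stand-in module built with `types.ModuleType` so the snippet runs on its own. In a real test the package would be imported instead.

```python
import types

# Names this PR removes from the RAG service package.
REMOVED_NAMES = frozenset({"process_rag_if_needed", "extract_knowledge_base_ids"})


def leaked_names(module: types.ModuleType, removed=REMOVED_NAMES) -> set[str]:
    """Return removed names that the module still exports or still defines."""
    exported = set(getattr(module, "__all__", ()))
    return {name for name in removed if name in exported or hasattr(module, name)}


# Stand-in for the narrowed package; not the real module.
narrowed = types.ModuleType("rag_stand_in")
narrowed.process_context_and_rag = lambda *args, **kwargs: (None, None)
narrowed.__all__ = ["process_context_and_rag"]

print(leaked_names(narrowed))
```

An empty result means none of the removed functions are reachable through the module.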

coderabbitai · 2026-01-05T03:20:19Z

backend/app/services/chat/rag/processor.py

+
+Note: RAG retrieval is now handled dynamically by KnowledgeBaseTool.
+This module only extracts context metadata for storage.


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

    #!/bin/bash
    # Description: Search for any remaining references to deleted RAG functions

    # Search for process_rag_if_needed references
    echo "=== Checking for process_rag_if_needed references ==="
    rg -n 'process_rag_if_needed' --type py

    echo -e "\n=== Checking for extract_knowledge_base_ids references ==="
    rg -n 'extract_knowledge_base_ids' --type py

    echo -e "\n=== Checking for retrieve_and_assemble_rag_prompt references ==="
    rg -n 'retrieve_and_assemble_rag_prompt' --type py

Repository: wecode-ai/Wegent

Length of output: 503

🏁 Script executed:

    #!/bin/bash
    # Check the function definition in contexts.py
    echo "=== extract_knowledge_base_ids definition ==="
    sed -n '185,210p' backend/app/services/chat/preprocessing/contexts.py

    # Check if extract_knowledge_base_ids is used anywhere (imported/called)
    echo -e "\n=== All usages of extract_knowledge_base_ids (excluding definition) ==="
    rg 'extract_knowledge_base_ids' --type py -B 2 -A 2

Repository: wecode-ai/Wegent

Length of output: 1943

🏁 Script executed:

    #!/bin/bash
    # Search for actual function calls to extract_knowledge_base_ids (excluding imports and definition)
    echo "=== Checking for calls to extract_knowledge_base_ids ==="
    rg 'extract_knowledge_base_ids\(' --type py

    # Also check if process_rag_if_needed and retrieve_and_assemble_rag_prompt exist as definitions
    echo -e "\n=== Checking for process_rag_if_needed definition ==="
    rg 'def process_rag_if_needed' --type py

    echo -e "\n=== Checking for retrieve_and_assemble_rag_prompt definition ==="
    rg 'def retrieve_and_assemble_rag_prompt' --type py

Repository: wecode-ai/Wegent

Length of output: 321

Remove unused extract_knowledge_base_ids function.

The functions process_rag_if_needed and retrieve_and_assemble_rag_prompt have been successfully deleted with no remaining references. However, extract_knowledge_base_ids still exists in backend/app/services/chat/preprocessing/contexts.py and is exported from the module's __init__.py, but it is not called anywhere in the codebase. Remove this unused function to complete the cleanup.
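The distinction the scripts above draw between a definition or import and an actual call can also be made structurally. The following sketch uses the standard-library `ast` module. `call_lines` and the two sample strings are hypothetical and exist only for illustration; they are not taken from the repository.

```python
import ast


def call_lines(source: str, name: str) -> list[int]:
    """Return line numbers where `name` is called, as name(...) or obj.name(...)."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                called = func.id
            else:
                # Attribute calls expose `.attr`; anything else yields None.
                called = getattr(func, "attr", None)
            if called == name:
                lines.append(node.lineno)
    return sorted(lines)


# Illustrative inputs: one file that only defines the function, one that calls it.
DEFINED_ONLY = "def extract_knowledge_base_ids(contexts):\n    return []\n"
CALLED = (
    "from contexts import extract_knowledge_base_ids\n"
    "ids = extract_knowledge_base_ids([])\n"
)

print(call_lines(DEFINED_ONLY, "extract_knowledge_base_ids"))
print(call_lines(CALLED, "extract_knowledge_base_ids"))
```

Unlike a text search for `name(`, this ignores the `def` line and import statements, so a function that is defined and exported but never called shows no hits.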

🤖 Prompt for AI Agents

This comment is anchored at backend/app/services/chat/rag/processor.py around lines 9 to 11, but the now-unused extract_knowledge_base_ids function lives in backend/app/services/chat/preprocessing/contexts.py. Delete its entire definition from that file, remove its export from that module's __init__.py (so it is no longer re-exported), and ensure there are no remaining imports or references elsewhere (run a quick grep/IDE search and update or remove any that appear). After removal, run linters and tests to confirm no unresolved references remain.
