feat(backend): implement task-level context aggregation for RAG #786
Refactored subtask_attachments table to subtask_contexts table for unified context management. This enables storing multiple context types (attachments, knowledge bases, etc.) with a flexible schema.

Key changes:
- New SubtaskContext model replacing SubtaskAttachment
- New ContextService for unified context operations
- Updated attachments API to use ContextService internally
- Added contexts field to WebSocket events and frontend types
- New ContextBadgeList component for displaying all context types
- Database migration with data migration from old table
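To make the "flexible schema" idea concrete, here is a minimal sketch of a context record with a `context_type` discriminator and a free-form `type_data` dictionary. The class name `ContextRecord` is hypothetical and the knowledge-base keys inside `type_data` are assumptions; the real `SubtaskContext` is an ORM model with more columns. The `original_filename` fallback mirrors the behavior described later in the review.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ContextRecord:
    """Hypothetical stand-in for a subtask_contexts row (not the real ORM model)."""

    name: str
    context_type: str  # "attachment" or "knowledge_base"
    type_data: Dict[str, Any] = field(default_factory=dict)

    @property
    def original_filename(self) -> str:
        # Fall back to the generic name when type_data carries no filename.
        return self.type_data.get("original_filename") or self.name


attachment = ContextRecord(
    name="report.pdf",
    context_type="attachment",
    type_data={"original_filename": "Q3 report.pdf", "mime_type": "application/pdf"},
)
knowledge_base = ContextRecord(
    name="Product docs",
    context_type="knowledge_base",
    type_data={"knowledge_id": 12, "document_count": 40},  # assumed keys
)
```

Both context types share one shape, so type-specific fields can be added to `type_data` without a schema change.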
- Add execute_rag_retrieval_for_contexts() to execute RAG retrieval when creating knowledge_base contexts and store results in SubtaskContext.extracted_text with sources in type_data
- Add context_service methods:
  - update_knowledge_base_retrieval_result(): Store RAG results
  - mark_knowledge_base_context_failed(): Handle retrieval failures
  - build_knowledge_base_text_prefix(): Format KB content for messages
  - get_knowledge_base_contexts_by_subtask(): Get KB contexts
  - get_knowledge_base_meta_for_task(): Collect unique KBs from task
- Modify _build_history_message() to load both attachment and knowledge_base contexts, with attachments having priority for token allocation (MAX_EXTRACTED_TEXT_LENGTH shared limit)
- Add get_knowledge_base_meta_prompt() to generate KB meta info prompt for system prompt injection
- Update prepare_knowledge_base_tools() to accept task_id and inject historical KB meta info into system prompt

This enables:
1. First message executes RAG retrieval and persists results
2. Follow-up messages load RAG results from history
3. Agent receives KB meta info to use KnowledgeBaseTool for additional retrieval if needed
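The attachment-first allocation against a shared limit can be sketched roughly as follows. This is an illustration only: the limit value, the `ContextText` type, the `allocate_shared_budget` name, and the choice to truncate (rather than drop) the text that overflows are all assumptions, not the behavior of `_build_history_message()`.

```python
from dataclasses import dataclass
from typing import List

# Assumed value for illustration; the real limit is defined in the backend.
MAX_EXTRACTED_TEXT_LENGTH = 100


@dataclass
class ContextText:
    context_type: str  # "attachment" or "knowledge_base"
    text: str


def allocate_shared_budget(
    contexts: List[ContextText], limit: int = MAX_EXTRACTED_TEXT_LENGTH
) -> List[str]:
    """Fit context texts into one shared limit, attachments first."""
    # Stable sort: attachments come first, original order is kept otherwise.
    ordered = sorted(contexts, key=lambda c: c.context_type != "attachment")
    remaining = limit
    kept: List[str] = []
    for ctx in ordered:
        if remaining <= 0:
            break
        piece = ctx.text[:remaining]  # truncate whatever no longer fits
        kept.append(piece)
        remaining -= len(piece)
    return kept


result = allocate_shared_budget(
    [ContextText("knowledge_base", "k" * 80), ContextText("attachment", "a" * 60)]
)
```

With a limit of 100, the 60-character attachment is kept whole and the knowledge-base text gets the remaining 40 characters.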
📝 Walkthrough
The PR unifies handling of subtask attachments and knowledge bases by introducing a new SubtaskContext model and ContextService that replace the attachment-only SubtaskAttachment approach. It includes database migration, updated ORM models, refactored services across chat preprocessing/triggering/storage, new unified schemas, and frontend components that display and manage contexts alongside messages.
Sequence Diagram(s)
sequenceDiagram
participant User as User (Frontend)
participant Chat as Chat API<br/>(Endpoint)
participant Context as ContextService
participant Store as Storage<br/>(MySQL)
participant DB as Database
User->>Chat: upload_attachment(file)
Chat->>Context: upload_attachment(db, user_id, filename, binary_data, subtask_id)
Context->>Context: validate file extension/size
Context->>Store: save(context_id, binary_data)
Store->>DB: update context storage metadata
Store-->>Context: storage_key, storage_backend
Context->>DB: create SubtaskContext (status=UPLOADING)
Context->>Context: parse document (text/image extraction)
Context->>DB: update extracted_text, image_base64, status=PARSING
Context->>DB: update status=READY (or FAILED on error)
Context-->>Chat: (SubtaskContext, TruncationInfo)
Chat-->>User: AttachmentResponse (via from_context)
sequenceDiagram
participant User as User (Frontend)
participant WS as WebSocket<br/>(chat_namespace)
participant Preprocess as Chat Preprocessing
participant Context as ContextService
participant Trigger as Chat Trigger
participant AI as AI Service
User->>WS: chat:send (message, attachment_ids, context_ids)
WS->>Preprocess: prepare_contexts_for_chat(user_subtask_id, message)
Preprocess->>Context: get_by_subtask(subtask_id)
Context->>Context: filter by context_type
Context-->>Preprocess: [SubtaskContext, ...]
Preprocess->>Preprocess: separate attachment vs KB contexts
Preprocess->>Preprocess: build vision/text blocks for attachments
Preprocess->>Preprocess: prepare KB tool instances (with user_subtask_id)
Preprocess-->>WS: (final_message, enhanced_system_prompt, extra_tools)
WS->>Trigger: trigger_ai_response(..., user_subtask_id)
Trigger->>Trigger: create ChatAgent with extra_tools
Trigger->>AI: stream chat (with KB tool, vision blocks)
AI->>AI: invoke KB tool for RAG retrieval
AI->>Context: (implicit via KB tool) persist RAG results
Context->>DB: update KB context (extracted_text, sources)
AI-->>Trigger: response_stream
Trigger-->>WS: chat:message, chat:done
WS-->>User: render message + contexts badges
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
The changes introduce a significant architectural shift from attachment-centric to unified context handling across multiple layers (database, ORM, services, endpoints, frontend). The review requires understanding the new SubtaskContext model, comprehensive ContextService, context linking/processing pipelines, and widespread integration points across chat preprocessing, triggering, storage, and UI components. Heterogeneous changes across the codebase (not simple repetitive patterns) and dense logic in several files (context_service.py, contexts.py, chat trigger logic) increase complexity.
bd891dc to b3b4b31
This change implements task-level context aggregation, enabling global knowledge base visibility across all subtasks in a task. This is particularly useful for group chat scenarios where new members can access task-level knowledge bases without individual permissions.

Key changes:
1. Task-level context storage:
   - Add contexts.subtask_contexts field to Task JSON structure
   - Store context IDs with type (knowledge_base/attachment) for filtering
2. Incremental sync mechanism:
   - Create task_contexts.py service module for context aggregation
   - Sync contexts to Task when linking to subtasks
   - Implement get_kb_contexts_from_task for efficient retrieval
3. Priority-based context resolution:
   - Current subtask contexts > Task-level historical contexts
   - Fallback to Task contexts when subtask has no knowledge bases
4. Enhanced KnowledgeBaseTool:
   - Add dynamic description with available knowledge bases list
   - Include KB metadata (ID + Name) in tool description and prompt
5. WebSocket integration:
   - Pass task_id to link_contexts_to_subtask for sync

Benefits:
- Task-level knowledge base sharing in group chats
- No duplicate knowledge base references
- Efficient JSON-based filtering (no extra DB queries)
- Backward compatible with existing subtask contexts
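The priority rule in item 3 can be pictured with a small sketch. It assumes the Task JSON stores entries under `contexts.subtask_contexts` as dictionaries with `id` and `type` keys; the `type` key name and the `resolve_kb_context_ids` helper are hypothetical and are not taken from the PR's code.

```python
from typing import Any, Dict, List


def resolve_kb_context_ids(
    subtask_kb_ids: List[int], task_json: Dict[str, Any]
) -> List[int]:
    """Prefer the current subtask's knowledge bases, else fall back to the task."""
    if subtask_kb_ids:
        # Current subtask contexts take priority over task-level history.
        return list(subtask_kb_ids)
    entries = task_json.get("contexts", {}).get("subtask_contexts", [])
    # Filter purely on the JSON payload, so no extra query is needed here.
    return [e["id"] for e in entries if e.get("type") == "knowledge_base"]


task_json = {
    "contexts": {
        "subtask_contexts": [
            {"id": 1, "type": "attachment"},
            {"id": 2, "type": "knowledge_base"},
            {"id": 5, "type": "knowledge_base"},
        ]
    }
}
```

A subtask that selects its own knowledge bases keeps them; a subtask with none inherits the task-level ones.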
1107304 to 609816c
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/api/endpoints/adapter/attachments.py (1)
304-355: Critical: `executor_download_attachment` references undefined `attachment_service`.
This endpoint still uses `attachment_service` (lines 321, 331-333, 339-340, 347-348, 351), but the import was removed during the migration to `context_service`. This will cause a `NameError` at runtime when this endpoint is called.
The endpoint needs to be migrated to use `context_service` like the other endpoints, or `attachment_service` needs to be imported.
🔎 Proposed fix: Migrate to context_service
 @router.get("/{attachment_id}/executor-download")
 async def executor_download_attachment(
     attachment_id: int,
     db: Session = Depends(get_db),
     current_user: User = Depends(security.get_current_user),
 ):
     """
     Download attachment for executor.
     ...
     """
     # Get attachment and verify ownership
-    attachment = attachment_service.get_attachment(
+    context = context_service.get_context_optional(
         db=db,
-        attachment_id=attachment_id,
+        context_id=attachment_id,
         user_id=current_user.id,
     )

-    if attachment is None:
+    if context is None:
+        raise HTTPException(status_code=404, detail="Attachment not found")
+
+    # Verify it's an attachment type
+    if context.context_type != ContextType.ATTACHMENT.value:
         raise HTTPException(status_code=404, detail="Attachment not found")

     # Get binary data from the appropriate storage backend
-    binary_data = attachment_service.get_attachment_binary_data(
+    binary_data = context_service.get_attachment_binary_data(
         db=db,
-        attachment=attachment,
+        context=context,
     )

     if binary_data is None:
         logger.error(
             f"Failed to retrieve binary data for attachment {attachment_id}, "
-            f"storage_backend={attachment.storage_backend}, "
-            f"storage_key={attachment.storage_key}"
+            f"storage_backend={context.storage_backend}, "
+            f"storage_key={context.storage_key}"
         )
         raise HTTPException(
             status_code=500, detail="Failed to retrieve attachment data"
         )

     # Encode filename for Content-Disposition header
-    encoded_filename = quote(attachment.original_filename)
+    encoded_filename = quote(context.original_filename)

     return Response(
         content=binary_data,
-        media_type=attachment.mime_type,
+        media_type=context.mime_type,
         headers={
             "Content-Disposition": f"attachment; filename*=UTF-8''{encoded_filename}"
         },
     )
🧹 Nitpick comments (20)
backend/alembic/versions/o5p6q7r8s9t0_add_subtask_contexts_table.py (1)
82-120: Refactor SQL construction to use parameters instead of f-strings.
While the current code is safe (the f-string interpolates a code-controlled variable, not user input), using f-strings with `sa.text()` is a code smell that static analyzers rightfully flag. Consider refactoring to bind parameters or use SQLAlchemy's expression language for better maintainability.
🔎 Alternative approach using conditional execution
-    # Build the updated_at expression based on column existence
-    updated_at_expr = (
-        "COALESCE(updated_at, created_at)" if has_updated_at else "created_at"
-    )
-
-    # Migrate data from subtask_attachments to subtask_contexts
-    connection.execute(
-        sa.text(
-            f"""
+    # Migrate data with conditional updated_at handling
+    if has_updated_at:
+        migration_sql = """
             INSERT INTO subtask_contexts (
                 subtask_id, user_id, context_type, name, status, error_message,
                 binary_data, image_base64, extracted_text, text_length,
                 type_data, created_at, updated_at
             )
             SELECT
                 subtask_id,
                 user_id,
                 'attachment' AS context_type,
                 original_filename AS name,
                 status,
                 error_message,
                 binary_data,
                 image_base64,
                 extracted_text,
                 COALESCE(text_length, 0) AS text_length,
                 JSON_OBJECT(
                     'original_filename', original_filename,
                     'file_extension', file_extension,
                     'file_size', file_size,
                     'mime_type', mime_type,
                     'storage_key', COALESCE(storage_key, ''),
                     'storage_backend', COALESCE(storage_backend, 'mysql')
                 ) AS type_data,
                 created_at,
-                {updated_at_expr} AS updated_at
+                COALESCE(updated_at, created_at) AS updated_at
             FROM subtask_attachments
-        """
+        """
+    else:
+        migration_sql = """
+            INSERT INTO subtask_contexts (
+                subtask_id, user_id, context_type, name, status, error_message,
+                binary_data, image_base64, extracted_text, text_length,
+                type_data, created_at, updated_at
+            )
+            SELECT
+                subtask_id,
+                user_id,
+                'attachment' AS context_type,
+                original_filename AS name,
+                status,
+                error_message,
+                binary_data,
+                image_base64,
+                extracted_text,
+                COALESCE(text_length, 0) AS text_length,
+                JSON_OBJECT(
+                    'original_filename', original_filename,
+                    'file_extension', file_extension,
+                    'file_size', file_size,
+                    'mime_type', mime_type,
+                    'storage_key', COALESCE(storage_key, ''),
+                    'storage_backend', COALESCE(storage_backend, 'mysql')
+                ) AS type_data,
+                created_at,
+                created_at AS updated_at
+            FROM subtask_attachments
+        """
+
+    connection.execute(sa.text(migration_sql))
-        )
-    )
frontend/src/features/tasks/components/message/MessagesArea.tsx (1)
86-102: Streaming bubble now passes RAG sources correctly
Forwarding `message.sources` into `msgForBubble` keeps the streaming path aligned with the non‑streaming one for RAG citations; this looks correct. If you also want context badges to render while streaming, consider additionally passing `contexts: message.contexts` here for consistency with `convertToMessage`.
backend/app/services/rag/document_service.py (1)
42-104: Attachment binary fetch refactor to `SubtaskContext` looks correct, minor naming nit
The switch to querying `SubtaskContext` with `ContextType.ATTACHMENT` and `ContextStatus.READY` plus delegating binary retrieval to `context_service.get_attachment_binary_data` is sound, and the error handling/logging cover the important failure modes. The only minor nit is that the parameter and surrounding docstrings still talk about `attachment_id` while it’s now effectively a context ID for an attachment‑type context; consider renaming to avoid confusion in future refactors.
backend/app/services/adapters/task_kinds.py (1)
1075-1145: Subtask `contexts` assembly matches unified context brief shape
Building `contexts_list` with base fields plus attachment- and knowledge‑base–specific fields (extension/size/mime_type vs `document_count`) aligns with the `SubtaskContextBrief` schema and keeps `attachments` as an empty legacy field for backward compatibility. You might consider reusing the existing `SubtaskContextBrief.from_model` helper instead of duplicating the mapping logic here to avoid drift if the brief schema evolves.
backend/app/services/attachment/mysql_storage.py (1)
48-66: MySQL storage now correctly targets `SubtaskContext`, but key format docs are inconsistent
Using `SubtaskContext` and updating `binary_data`, `type_data.storage_backend`, and `type_data.storage_key` is consistent with the new context model, and `_extract_attachment_id` correctly derives the numeric ID from the final underscore‑separated segment. However, the docstrings for `save`/`get`/`delete`/`exists` say the key format is `attachments/{context_id}`, while `_extract_attachment_id` (and its docstring) expect `attachments/{uuid}_{timestamp}_{user_id}_{context_id}`. It would be good to unify these comments to reflect the actual, supported key format and avoid confusion for future maintainers.
Also applies to: 97-103, 124-133, 159-165, 202-231
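To show why the two documented key formats are not interchangeable, here is a minimal sketch of ID extraction from the final underscore-separated segment. The function name `extract_context_id` and its error handling are assumptions for illustration; this is not the body of `_extract_attachment_id`.

```python
def extract_context_id(storage_key: str) -> int:
    """Return the numeric ID from the last underscore-separated segment."""
    last_segment = storage_key.rsplit("_", 1)[-1]
    if not last_segment.isdecimal():
        raise ValueError(f"Unsupported storage key format: {storage_key!r}")
    return int(last_segment)


# Key in the attachments/{uuid}_{timestamp}_{user_id}_{context_id} format.
context_id = extract_context_id("attachments/abc123_1700000000_7_42")
```

Under this parsing rule a key written as `attachments/{context_id}` has no underscore, so the whole string becomes the "last segment" and the call raises `ValueError`.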
backend/app/chat_shell/history/loader.py (2)
86-103: Unified context loading into history is logically sound, with small refactor opportunity
Loading all `SubtaskContext` rows for a user subtask and then prioritizing attachment vision/text prefixes before fitting knowledge‑base prefixes into `MAX_EXTRACTED_TEXT_LENGTH` is a reasonable strategy and keeps behavior explicit. One minor clean‑up: `_build_history_message` takes a `context_service` parameter but immediately re‑imports `context_service` inside the function, so the argument is effectively unused; you can either drop the parameter and rely on the import, or remove the inner import and use the injected service to simplify the call site.
Also applies to: 121-217
244-285: `get_knowledge_base_meta_prompt` helper is clear; consider trimming formatting
The meta‑prompt builder correctly reuses `context_service.get_knowledge_base_meta_for_task` and formats a concise KB list. If this string is inserted directly into system prompts, you might want to `strip()` or avoid the leading/trailing blank lines introduced by the triple‑quoted literal to keep prompt formatting tight, but that’s cosmetic.
frontend/src/features/tasks/components/chat/useChatStreamHandlers.tsx (1)
115-136: Context payload and pending context wiring look correct
Mapping `selectedContexts` into WebSocket `contexts` (with `knowledge_id`, `name`, and optional `document_count` for `knowledge_base`) and building `pendingContexts` in a `SubtaskContextBrief`‑compatible shape for immediate display are both aligned with the new unified context model. Invoking `resetContexts?.()` alongside `resetAttachment()` on send is also a good integration point. As a small polish, you could reuse the exported `SubtaskContextBrief` type for `pendingContexts` instead of an inline literal to keep the shapes in lockstep.
Also applies to: 432-453, 455-495, 495-521
backend/app/services/chat/storage/task_contexts.py (2)
23-62: Task-level context sync logic is fine; drop unused `db` parameter
Incrementally merging `new_context_entries` into `task.json["contexts"]["subtask_contexts"]` with deduplication on `id` and marking the JSON as modified is straightforward and should work well for task‑level aggregation. The `db: Session` argument to `sync_task_contexts` is currently unused though; if you don’t expect to need it here, consider removing it to satisfy Ruff’s ARG001 and clarify that the caller is responsible for committing.
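The incremental merge with deduplication on `id` might look roughly like the sketch below. It works on a plain dictionary and takes no database session; the `merge_context_entries` name and the `type` key in each entry are assumptions for illustration, not the signature of `sync_task_contexts`.

```python
from typing import Any, Dict, List


def merge_context_entries(
    task_json: Dict[str, Any], new_entries: List[Dict[str, Any]]
) -> int:
    """Append entries whose id is not yet present; return how many were added."""
    contexts = task_json.setdefault("contexts", {})
    existing = contexts.setdefault("subtask_contexts", [])
    seen = {entry["id"] for entry in existing}
    added = 0
    for entry in new_entries:
        if entry["id"] not in seen:
            existing.append(entry)
            seen.add(entry["id"])
            added += 1
    return added


task_json: Dict[str, Any] = {}
first = merge_context_entries(task_json, [{"id": 1, "type": "knowledge_base"}])
second = merge_context_entries(
    task_json,
    [{"id": 1, "type": "knowledge_base"}, {"id": 2, "type": "attachment"}],
)
```

When the dictionary backs a SQLAlchemy JSON column, an in-place change like this is typically not detected automatically, which is why the caller still has to mark the attribute as modified (for example with `sqlalchemy.orm.attributes.flag_modified`) and commit.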
65-101: KB context retrieval helpers are efficient and well-scoped
Filtering KB context IDs directly from `TaskResource.json["contexts"]["subtask_contexts"]` and then loading the corresponding `SubtaskContext` rows with `context_type == KNOWLEDGE_BASE` avoids extra joins and keeps behavior clearly bounded. The truthy `TaskResource.is_active` filter is acceptable, though you may prefer the more explicit `TaskResource.is_active == True` style for readability.
Also applies to: 104-130
frontend/src/features/tasks/components/message/ContextBadgeList.tsx (1)
62-96: Attachment context → `Attachment` mapping is reasonable
The status mapping from context statuses to `Attachment` statuses and the conversion into the minimal `Attachment` shape expected by `AttachmentPreview` (id, filename, size, mime type, extension, status, created_at) looks correct and should render existing UI consistently. If `AttachmentPreview` ever surfaces `created_at`, you may want to feed a real timestamp from the context (once available on `SubtaskContextBrief`) instead of an empty string.
backend/app/chat_shell/tools/knowledge_factory.py (1)
44-50: Consider extracting duplicated KB meta prompt logic.
The pattern of conditionally appending `kb_meta_prompt` to `enhanced_system_prompt` appears twice (lines 47-49 and 96-99). While minor, this could be simplified to reduce duplication.
🔎 Optional: Extract helper for prompt concatenation
+def _append_kb_meta_if_available(base_prompt: str, db: Any, task_id: Optional[int]) -> str:
+    """Append historical KB meta prompt if task_id is provided and meta exists."""
+    if task_id:
+        kb_meta_prompt = _build_historical_kb_meta_prompt(db, task_id)
+        if kb_meta_prompt:
+            return f"{base_prompt}{kb_meta_prompt}"
+    return base_prompt
backend/app/schemas/subtask.py (1)
87-103: Consider using `ContextType` enum for type safety.
The `context_type` field is declared as `str` (line 91), but the comparison in `from_model` (line 125) uses string literals `"attachment"` and `"knowledge_base"`. Using the `ContextType` enum from `app.models.subtask_context` would provide type safety and prevent typos.
🔎 Proposed improvement
+from app.models.subtask_context import ContextType
+
 class SubtaskContextBrief(BaseModel):
     """Brief context info for message list display"""

     id: int
-    context_type: str
+    context_type: ContextType
     name: str
     status: str
     # ...
Then update `from_model`:
-        if context.context_type == "attachment":
+        if context.context_type == ContextType.ATTACHMENT.value:
backend/app/services/export/docx_generator.py (1)
326-332: Consider using `original_filename` property for caption consistency.
Line 329 uses `attachment.name` for the caption, but the `SubtaskContext` model has an `original_filename` property that provides the intended filename (falling back to `name` if not in `type_data`). Using the property would be more semantically consistent with other file-related code.
🔎 Minor suggestion
-    run = caption.add_run(attachment.name)
+    run = caption.add_run(attachment.original_filename)
backend/app/models/subtask_context.py (1)
29-43: Enum duplication with `backend/app/schemas/subtask_context.py`.
These enums (`ContextType`, `ContextStatus`) are defined identically in both the model and schema files. While this provides layer separation, it introduces risk of drift if one is updated without the other.
Consider importing from a shared location or consolidating to prevent inconsistencies.
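One way to consolidate is a single module that both the ORM layer and the schema layer import from. The sketch below assumes such a shared module exists; the status string values are guesses based on the status names used elsewhere in this PR, and only `"attachment"` and `"knowledge_base"` are confirmed by the review text.

```python
from enum import Enum


class ContextType(str, Enum):
    """Single definition intended to be imported by models and schemas alike."""

    ATTACHMENT = "attachment"
    KNOWLEDGE_BASE = "knowledge_base"


class ContextStatus(str, Enum):
    # String values below are assumed, not copied from the repository.
    UPLOADING = "uploading"
    PARSING = "parsing"
    READY = "ready"
    FAILED = "failed"


# Stored strings round-trip to members, and members still compare equal to
# plain strings because of the str mixin.
parsed = ContextType("knowledge_base")
is_ready = ContextStatus.READY == "ready"
```

Because the enums subclass `str`, existing comparisons against raw column values keep working while new code gets the typed members.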
frontend/src/features/tasks/contexts/chatStreamContext.tsx (1)
1-1808: File exceeds recommended 1000-line limit.
Per coding guidelines, file size SHOULD NOT exceed 1000 lines. This file is 1808 lines. Consider splitting into sub-modules:
- Message state management
- WebSocket event handlers
- Skill handling
- Stream control utilities
This would improve maintainability and testability.
backend/app/services/context/context_service.py (2)
36-39: Consider adding more context to NotFoundException.
The exception class is minimal. Consider adding context like `context_id` as an attribute for better error handling upstream.
🔎 Proposed enhancement
 class NotFoundException(Exception):
     """Exception raised when a context is not found."""
-
-    pass
+
+    def __init__(self, message: str, context_id: int | None = None):
+        super().__init__(message)
+        self.context_id = context_id
154-157: Use `logger.exception` for exception logging.
Per static analysis, when logging in exception handlers, `logger.exception` automatically includes the stack trace, which aids debugging.
🔎 Proposed fix
 except StorageError as e:
-    logger.error(f"Failed to save context {context.id} to storage: {e}")
+    logger.exception(f"Failed to save context {context.id} to storage")
     db.rollback()
     raise

 except DocumentParseError as e:
-    logger.error(f"Document parsing failed for context {context.id}: {e}")
+    logger.exception(f"Document parsing failed for context {context.id}")
     context.status = ContextStatus.FAILED.value
     context.error_message = str(e)
     db.commit()
     raise
Also applies to: 185-190
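The difference is easy to observe with the standard library alone. The sketch below captures log output in memory and logs a simulated failure with `logger.exception`; the logger name, the `capture_exception_log` helper, and the `RuntimeError` are placeholders rather than code from this repository.

```python
import io
import logging


def capture_exception_log() -> str:
    """Log a simulated storage failure and return the captured output."""
    stream = io.StringIO()
    logger = logging.getLogger("context_service_demo")
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        raise RuntimeError("storage unavailable")
    except RuntimeError:
        # Logs at ERROR level and appends the active traceback automatically.
        logger.exception("Failed to save context %s to storage", 42)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


output = capture_exception_log()
```

The captured text contains the message followed by the full traceback, so the exception no longer needs to be interpolated into the message by hand.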
backend/app/services/chat/preprocessing/contexts.py (2)
29-34: Unused `user_id` parameter in `process_contexts`.
The `user_id` parameter is passed but never used. Either remove it or add a comment explaining it's reserved for future access control checks.
🔎 Proposed fix (if removing)
 async def process_contexts(
     db: Session,
     context_ids: List[int],
-    user_id: int,
     message: str,
 ) -> str | dict[str, Any]:
Or if keeping for future use:
 async def process_contexts(
     db: Session,
     context_ids: List[int],
-    user_id: int,
+    user_id: int,  # Reserved for future access control checks
     message: str,
 ) -> str | dict[str, Any]:
79-81: Consider using `logger.exception` for full stack traces.
Per static analysis, replacing `logger.error` with `logger.exception` in exception handlers automatically captures the stack trace, which aids debugging.
🔎 Proposed fix
 except Exception as e:
-    logger.error(f"Error processing context {context_id}: {e}")
+    logger.exception(f"Error processing context {context_id}")
     continue
Also applies to: 435-437
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (40)
- backend/alembic/versions/o5p6q7r8s9t0_add_subtask_contexts_table.py
- backend/app/api/endpoints/adapter/attachments.py
- backend/app/api/ws/chat_namespace.py
- backend/app/api/ws/events.py
- backend/app/chat_shell/history/loader.py
- backend/app/chat_shell/tools/builtin/knowledge_base.py
- backend/app/chat_shell/tools/knowledge_factory.py
- backend/app/core/config.py
- backend/app/models/__init__.py
- backend/app/models/knowledge.py
- backend/app/models/subtask.py
- backend/app/models/subtask_context.py
- backend/app/schemas/subtask.py
- backend/app/schemas/subtask_context.py
- backend/app/schemas/task.py
- backend/app/services/adapters/task_kinds.py
- backend/app/services/attachment/mysql_storage.py
- backend/app/services/chat/operations/retry.py
- backend/app/services/chat/preprocessing/__init__.py
- backend/app/services/chat/preprocessing/attachments.py
- backend/app/services/chat/preprocessing/contexts.py
- backend/app/services/chat/storage/task_contexts.py
- backend/app/services/chat/trigger/core.py
- backend/app/services/context/__init__.py
- backend/app/services/context/context_service.py
- backend/app/services/export/docx_generator.py
- backend/app/services/knowledge_service.py
- backend/app/services/rag/document_service.py
- backend/app/services/shared_task.py
- backend/app/services/subtask.py
- frontend/src/features/tasks/components/chat/ChatArea.tsx
- frontend/src/features/tasks/components/chat/useChatAreaState.ts
- frontend/src/features/tasks/components/chat/useChatStreamHandlers.tsx
- frontend/src/features/tasks/components/message/ContextBadgeList.tsx
- frontend/src/features/tasks/components/message/MessageBubble.tsx
- frontend/src/features/tasks/components/message/MessagesArea.tsx
- frontend/src/features/tasks/contexts/chatStreamContext.tsx
- frontend/src/features/tasks/hooks/useUnifiedMessages.ts
- frontend/src/types/api.ts
- frontend/src/types/socket.ts
💤 Files with no reviewable changes (1)
- backend/app/services/chat/preprocessing/attachments.py
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{py,ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{py,ts,tsx,js,jsx}: All code comments MUST be written in English
File size MUST NOT exceed 1000 lines - split into multiple sub-modules if exceeded
Function length SHOULD NOT exceed 50 lines (preferred)
Files:
- frontend/src/types/socket.ts
- backend/app/services/chat/operations/retry.py
- frontend/src/features/tasks/components/message/ContextBadgeList.tsx
- frontend/src/features/tasks/components/chat/useChatAreaState.ts
- frontend/src/features/tasks/components/chat/useChatStreamHandlers.tsx
- backend/app/core/config.py
- frontend/src/features/tasks/components/chat/ChatArea.tsx
- frontend/src/features/tasks/components/message/MessagesArea.tsx
- backend/app/services/context/__init__.py
- backend/app/services/adapters/task_kinds.py
- backend/app/services/rag/document_service.py
- backend/app/services/attachment/mysql_storage.py
- backend/app/services/knowledge_service.py
- backend/app/chat_shell/history/loader.py
- backend/alembic/versions/o5p6q7r8s9t0_add_subtask_contexts_table.py
- frontend/src/features/tasks/hooks/useUnifiedMessages.ts
- backend/app/services/subtask.py
- backend/app/models/knowledge.py
- backend/app/schemas/task.py
- backend/app/models/subtask.py
- backend/app/services/chat/storage/task_contexts.py
- backend/app/schemas/subtask.py
- backend/app/services/shared_task.py
- backend/app/chat_shell/tools/knowledge_factory.py
- backend/app/services/chat/trigger/core.py
- backend/app/models/__init__.py
- frontend/src/types/api.ts
- backend/app/api/ws/events.py
- backend/app/services/chat/preprocessing/__init__.py
- backend/app/services/export/docx_generator.py
- backend/app/schemas/subtask_context.py
- backend/app/api/ws/chat_namespace.py
- backend/app/models/subtask_context.py
- frontend/src/features/tasks/components/message/MessageBubble.tsx
- backend/app/chat_shell/tools/builtin/knowledge_base.py
- backend/app/api/endpoints/adapter/attachments.py
- backend/app/services/context/context_service.py
- frontend/src/features/tasks/contexts/chatStreamContext.tsx
- backend/app/services/chat/preprocessing/contexts.py
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
TypeScript/React MUST use strict mode, functional components, Prettier for formatting, ESLint for linting, single quotes, and no semicolons
Files:
- frontend/src/types/socket.ts
- frontend/src/features/tasks/components/message/ContextBadgeList.tsx
- frontend/src/features/tasks/components/chat/useChatAreaState.ts
- frontend/src/features/tasks/components/chat/useChatStreamHandlers.tsx
- frontend/src/features/tasks/components/chat/ChatArea.tsx
- frontend/src/features/tasks/components/message/MessagesArea.tsx
- frontend/src/features/tasks/hooks/useUnifiedMessages.ts
- frontend/src/types/api.ts
- frontend/src/features/tasks/components/message/MessageBubble.tsx
- frontend/src/features/tasks/contexts/chatStreamContext.tsx
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
TypeScript MUST use `const` over `let`, never use `var`
Files:
- frontend/src/types/socket.ts
- frontend/src/features/tasks/components/message/ContextBadgeList.tsx
- frontend/src/features/tasks/components/chat/useChatAreaState.ts
- frontend/src/features/tasks/components/chat/useChatStreamHandlers.tsx
- frontend/src/features/tasks/components/chat/ChatArea.tsx
- frontend/src/features/tasks/components/message/MessagesArea.tsx
- frontend/src/features/tasks/hooks/useUnifiedMessages.ts
- frontend/src/types/api.ts
- frontend/src/features/tasks/components/message/MessageBubble.tsx
- frontend/src/features/tasks/contexts/chatStreamContext.tsx
frontend/src/types/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
TypeScript types MUST be organized in `src/types/` directory
Files:
- frontend/src/types/socket.ts
- frontend/src/types/api.ts
frontend/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
frontend/src/**/*.{ts,tsx}: MUST use `useTranslation` hook imported from `@/hooks/useTranslation`, not from `react-i18next`
MUST use single namespace with `useTranslation()` - never use array format like `useTranslation(['common', 'groups'])`
Frontend message data MUST always use `messages` from `useUnifiedMessages()` hook as the single source of truth for displaying messages - never use `selectedTaskDetail.subtasks`
Frontend i18n translation keys MUST use current namespace format `t('key.subkey')` for keys within namespace and `t('namespace:key.subkey')` for cross-namespace keys
Files:
- frontend/src/types/socket.ts
- frontend/src/features/tasks/components/message/ContextBadgeList.tsx
- frontend/src/features/tasks/components/chat/useChatAreaState.ts
- frontend/src/features/tasks/components/chat/useChatStreamHandlers.tsx
- frontend/src/features/tasks/components/chat/ChatArea.tsx
- frontend/src/features/tasks/components/message/MessagesArea.tsx
- frontend/src/features/tasks/hooks/useUnifiedMessages.ts
- frontend/src/types/api.ts
- frontend/src/features/tasks/components/message/MessageBubble.tsx
- frontend/src/features/tasks/contexts/chatStreamContext.tsx
frontend/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Frontend MUST only use `NEXT_PUBLIC_*` environment variables for client-safe values
Files:
- frontend/src/types/socket.ts
- frontend/src/features/tasks/components/message/ContextBadgeList.tsx
- frontend/src/features/tasks/components/chat/useChatAreaState.ts
- frontend/src/features/tasks/components/chat/useChatStreamHandlers.tsx
- frontend/src/features/tasks/components/chat/ChatArea.tsx
- frontend/src/features/tasks/components/message/MessagesArea.tsx
- frontend/src/features/tasks/hooks/useUnifiedMessages.ts
- frontend/src/types/api.ts
- frontend/src/features/tasks/components/message/MessageBubble.tsx
- frontend/src/features/tasks/contexts/chatStreamContext.tsx
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Python code MUST follow PEP 8, use Black formatter with line length 88, and isort for imports
Python code MUST include type hints
Python functions and classes MUST have descriptive names and docstrings for public functions/classes
Python MUST extract magic numbers to named constants
Files:
- backend/app/services/chat/operations/retry.py
- backend/app/core/config.py
- backend/app/services/context/__init__.py
- backend/app/services/adapters/task_kinds.py
- backend/app/services/rag/document_service.py
- backend/app/services/attachment/mysql_storage.py
- backend/app/services/knowledge_service.py
- backend/app/chat_shell/history/loader.py
- backend/alembic/versions/o5p6q7r8s9t0_add_subtask_contexts_table.py
- backend/app/services/subtask.py
- backend/app/models/knowledge.py
- backend/app/schemas/task.py
- backend/app/models/subtask.py
- backend/app/services/chat/storage/task_contexts.py
- backend/app/schemas/subtask.py
- backend/app/services/shared_task.py
- backend/app/chat_shell/tools/knowledge_factory.py
- backend/app/services/chat/trigger/core.py
- backend/app/models/__init__.py
- backend/app/api/ws/events.py
- backend/app/services/chat/preprocessing/__init__.py
- backend/app/services/export/docx_generator.py
- backend/app/schemas/subtask_context.py
- backend/app/api/ws/chat_namespace.py
- backend/app/models/subtask_context.py
- backend/app/chat_shell/tools/builtin/knowledge_base.py
- backend/app/api/endpoints/adapter/attachments.py
- backend/app/services/context/context_service.py
- backend/app/services/chat/preprocessing/contexts.py
backend/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
backend/**/*.py: Python backend module imports MUST use `uv run` prefix when executing commands
Task and Workspace resources MUST use TaskResource model from `app.models.task`, not the Kind model
Ghost, Model, Shell, Bot, Team, and Skill CRDs MUST use Kind model from `app.models.kind`
Files:
- backend/app/services/chat/operations/retry.py
- backend/app/core/config.py
- backend/app/services/context/__init__.py
- backend/app/services/adapters/task_kinds.py
- backend/app/services/rag/document_service.py
- backend/app/services/attachment/mysql_storage.py
- backend/app/services/knowledge_service.py
- backend/app/chat_shell/history/loader.py
- backend/alembic/versions/o5p6q7r8s9t0_add_subtask_contexts_table.py
- backend/app/services/subtask.py
- backend/app/models/knowledge.py
- backend/app/schemas/task.py
- backend/app/models/subtask.py
- backend/app/services/chat/storage/task_contexts.py
- backend/app/schemas/subtask.py
- backend/app/services/shared_task.py
- backend/app/chat_shell/tools/knowledge_factory.py
- backend/app/services/chat/trigger/core.py
- backend/app/models/__init__.py
- backend/app/api/ws/events.py
- backend/app/services/chat/preprocessing/__init__.py
- backend/app/services/export/docx_generator.py
- backend/app/schemas/subtask_context.py
- backend/app/api/ws/chat_namespace.py
- backend/app/models/subtask_context.py
- backend/app/chat_shell/tools/builtin/knowledge_base.py
- backend/app/api/endpoints/adapter/attachments.py
- backend/app/services/context/context_service.py
- backend/app/services/chat/preprocessing/contexts.py
backend/alembic/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Database migrations MUST use Alembic with `alembic revision --autogenerate` to create new migrations
Files:
backend/alembic/versions/o5p6q7r8s9t0_add_subtask_contexts_table.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
API routes MUST use CRD names (Team, Bot) in path names and database models
Files:
backend/app/api/ws/events.py
backend/app/api/ws/chat_namespace.py
backend/app/api/endpoints/adapter/attachments.py
🧠 Learnings (5)
📚 Learning: 2025-12-31T03:47:12.160Z
Learnt from: CR
Repo: wecode-ai/Wegent PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-31T03:47:12.160Z
Learning: Applies to frontend/src/**/*.{ts,tsx} : Frontend message data MUST always use `messages` from `useUnifiedMessages()` hook as the single source of truth for displaying messages - never use `selectedTaskDetail.subtasks`
Applied to files:
frontend/src/features/tasks/components/message/ContextBadgeList.tsx
frontend/src/features/tasks/components/message/MessagesArea.tsx
frontend/src/features/tasks/hooks/useUnifiedMessages.ts
frontend/src/types/api.ts
frontend/src/features/tasks/components/message/MessageBubble.tsx
frontend/src/features/tasks/contexts/chatStreamContext.tsx
📚 Learning: 2025-12-18T02:09:09.776Z
Learnt from: CR
Repo: wecode-ai/Wegent PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-18T02:09:09.776Z
Learning: Applies to frontend/src/**/*.tsx : Frontend component files named with CRD terms: team-list.tsx, bot-form.tsx, model-selector.tsx
Applied to files:
frontend/src/features/tasks/components/message/ContextBadgeList.tsx
frontend/src/features/tasks/components/message/MessageBubble.tsx
📚 Learning: 2025-12-18T02:09:09.776Z
Learnt from: CR
Repo: wecode-ai/Wegent PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-18T02:09:09.776Z
Learning: Applies to backend/app/services/attachment/**/*.py : Backend MUST support pluggable attachment storage backends: mysql, s3, minio via `ATTACHMENT_STORAGE_BACKEND` environment variable
Applied to files:
backend/app/services/rag/document_service.py
backend/app/services/attachment/mysql_storage.py
backend/app/api/endpoints/adapter/attachments.py
📚 Learning: 2025-12-18T02:09:09.776Z
Learnt from: CR
Repo: wecode-ai/Wegent PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-18T02:09:09.776Z
Learning: Backend database migrations: run `alembic revision --autogenerate -m 'description'` to create migrations, `alembic upgrade head` to apply, always test migrations before committing
Applied to files:
backend/alembic/versions/o5p6q7r8s9t0_add_subtask_contexts_table.py
📚 Learning: 2025-12-31T03:47:12.160Z
Learnt from: CR
Repo: wecode-ai/Wegent PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-31T03:47:12.160Z
Learning: Applies to backend/**/*.py : Task and Workspace resources MUST use TaskResource model from `app.models.task`, not the Kind model
Applied to files:
backend/app/schemas/task.py
backend/app/services/shared_task.py
backend/app/models/__init__.py
backend/app/services/export/docx_generator.py
🧬 Code graph analysis (20)
backend/app/services/chat/operations/retry.py (1)
backend/app/models/subtask.py (1)
Subtask(38-88)
frontend/src/features/tasks/components/message/ContextBadgeList.tsx (4)
backend/app/schemas/subtask.py (1)
SubtaskContextBrief(87-140)
backend/app/schemas/subtask_context.py (1)
SubtaskContextBrief(79-109)
frontend/src/types/api.ts (2)
SubtaskContextBrief(506-517)
Attachment(473-486)
frontend/src/features/tasks/components/input/AttachmentPreview.tsx (1)
AttachmentPreview(246-416)
frontend/src/features/tasks/components/chat/useChatStreamHandlers.tsx (1)
backend/app/api/ws/events.py (1)
ContextItem(83-87)
frontend/src/features/tasks/components/message/MessagesArea.tsx (2)
frontend/src/features/tasks/components/message/MessageBubble.tsx (1)
Message(47-96)
frontend/src/features/tasks/components/message/index.ts (1)
Message(2-2)
backend/app/services/context/__init__.py (1)
backend/app/services/context/context_service.py (1)
ContextService(42-925)
backend/app/services/adapters/task_kinds.py (2)
backend/app/services/base.py (1)
update(69-90)
backend/app/models/subtask_context.py (4)
file_extension(117-121)
file_size(124-128)
mime_type(131-135)
document_count(161-165)
backend/app/services/rag/document_service.py (2)
backend/app/models/subtask_context.py (7)
ContextStatus(36-43)
ContextType(29-33)
SubtaskContext(46-179)
storage_backend(145-149)
storage_key(138-142)
original_filename(110-114)
file_extension(117-121)
backend/app/services/context/context_service.py (1)
get_attachment_binary_data(204-235)
backend/app/services/attachment/mysql_storage.py (2)
backend/app/models/subtask_context.py (1)
SubtaskContext(46-179)
backend/app/services/attachment/storage_backend.py (1)
StorageError(106-112)
backend/app/services/knowledge_service.py (1)
backend/app/services/context/context_service.py (1)
delete_context(852-900)
backend/app/chat_shell/history/loader.py (4)
backend/app/models/subtask_context.py (4)
ContextStatus(36-43)
ContextType(29-33)
SubtaskContext(46-179)
knowledge_id(154-158)
backend/app/services/chat/storage/db.py (1)
_db_session(59-71)
backend/app/models/subtask.py (1)
SubtaskRole(26-28)
backend/app/services/context/context_service.py (4)
build_vision_content_block(281-302)
build_document_text_prefix(304-335)
build_knowledge_base_text_prefix(498-534)
get_knowledge_base_meta_for_task(598-648)
backend/app/services/subtask.py (1)
backend/app/models/subtask.py (1)
Subtask(38-88)
backend/app/services/chat/storage/task_contexts.py (2)
backend/app/models/subtask_context.py (1)
SubtaskContext(46-179)
backend/app/models/task.py (1)
TaskResource(26-77)
backend/app/schemas/subtask.py (2)
backend/app/schemas/subtask_context.py (5)
SubtaskContextBrief(79-109)
Config(70-71)
Config(93-94)
Config(132-133)
from_model(97-109)
backend/app/models/subtask_context.py (4)
file_extension(117-121)
file_size(124-128)
mime_type(131-135)
document_count(161-165)
backend/app/services/shared_task.py (1)
backend/app/models/subtask_context.py (6)
ContextType(29-33)
SubtaskContext(46-179)
original_filename(110-114)
file_extension(117-121)
file_size(124-128)
mime_type(131-135)
backend/app/chat_shell/tools/knowledge_factory.py (4)
backend/app/services/chat/preprocessing/contexts.py (1)
_build_historical_kb_meta_prompt(602-622)
backend/app/chat_shell/tools/builtin/knowledge_base.py (1)
KnowledgeBaseTool(32-295)
backend/app/chat_shell/history/loader.py (1)
get_knowledge_base_meta_prompt(244-285)
backend/app/services/context/context_service.py (1)
get_knowledge_base_meta_for_task(598-648)
backend/app/models/__init__.py (1)
backend/app/models/subtask_context.py (1)
SubtaskContext(46-179)
backend/app/api/ws/events.py (1)
backend/app/chat_shell/tools/builtin/knowledge_base.py (1)
description(60-77)
backend/app/services/chat/preprocessing/__init__.py (1)
backend/app/services/chat/preprocessing/contexts.py (6)
extract_knowledge_base_ids(158-184)
get_attachment_context_ids_from_subtask(646-661)
get_knowledge_base_ids_from_subtask(625-643)
link_contexts_to_subtask(206-327)
process_attachments(135-155)
process_contexts(29-101)
backend/app/services/export/docx_generator.py (1)
backend/app/models/subtask_context.py (6)
SubtaskContext(46-179)
ContextType(29-33)
file_extension(117-121)
storage_backend(145-149)
storage_key(138-142)
file_size(124-128)
backend/app/models/subtask_context.py (3)
backend/app/schemas/subtask_context.py (2)
ContextType(19-23)
ContextStatus(26-33)
frontend/src/types/api.ts (2)
ContextType(503-503)
ContextStatus(504-504)
backend/app/services/attachment/mysql_storage.py (1)
get(97-122)
🪛 Ruff (0.14.10)
backend/app/services/rag/document_service.py
72-72: Avoid specifying long messages outside the exception class
(TRY003)
76-78: Avoid specifying long messages outside the exception class
(TRY003)
92-94: Avoid specifying long messages outside the exception class
(TRY003)
backend/app/services/attachment/mysql_storage.py
76-76: Abstract raise to an inner function
(TRY301)
76-76: Avoid specifying long messages outside the exception class
(TRY003)
118-118: Consider moving this statement to an else block
(TRY300)
219-219: Abstract raise to an inner function
(TRY301)
219-219: Avoid specifying long messages outside the exception class
(TRY003)
225-227: Abstract raise to an inner function
(TRY301)
225-227: Avoid specifying long messages outside the exception class
(TRY003)
backend/app/services/knowledge_service.py
806-806: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backend/alembic/versions/o5p6q7r8s9t0_add_subtask_contexts_table.py
90-118: Possible SQL injection vector through string-based query construction
(S608)
backend/app/services/chat/storage/task_contexts.py
24-24: Unused function argument: db
(ARG001)
backend/app/chat_shell/tools/knowledge_factory.py
126-126: Do not catch blind exception: Exception
(BLE001)
149-149: Do not catch blind exception: Exception
(BLE001)
backend/app/services/context/context_service.py
93-96: Avoid specifying long messages outside the exception class
(TRY003)
102-102: Avoid specifying long messages outside the exception class
(TRY003)
155-155: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
186-186: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
679-679: Avoid specifying long messages outside the exception class
(TRY003)
backend/app/services/chat/preprocessing/contexts.py
32-32: Unused function argument: user_id
(ARG001)
79-79: Do not catch blind exception: Exception
(BLE001)
80-80: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
270-270: Do not catch blind exception: Exception
(BLE001)
324-324: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
435-435: Do not catch blind exception: Exception
(BLE001)
436-436: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
620-620: Do not catch blind exception: Exception
(BLE001)
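Several of the Ruff findings above (BLE001, TRY400, TRY300, TRY003) tend to be resolved together. The following is a minimal sketch of that pattern, not code from this PR: `RetrievalError`, `run_retrieval`, and `retrieve_safely` are hypothetical names chosen for illustration, and the real retrieval call would replace the stand-in.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class RetrievalError(Exception):
    """Hypothetical domain error; the message is built inside the class (TRY003)."""

    def __init__(self, knowledge_id: int) -> None:
        self.knowledge_id = knowledge_id
        super().__init__(f"retrieval failed for knowledge base {knowledge_id}")


def run_retrieval(knowledge_id: int) -> str:
    """Stand-in for a real retrieval call; fails for negative ids."""
    if knowledge_id < 0:
        raise RetrievalError(knowledge_id)
    return f"chunks for {knowledge_id}"


def retrieve_safely(knowledge_id: int) -> Optional[str]:
    """Catch only the expected error and log it with its traceback."""
    try:
        result = run_retrieval(knowledge_id)
    except RetrievalError:  # narrow catch instead of `except Exception` (BLE001)
        # logger.exception logs at ERROR level and appends the traceback (TRY400).
        logger.exception("RAG retrieval failed for knowledge_id=%s", knowledge_id)
        return None
    else:
        # Success path lives in `else`, outside the guarded block (TRY300).
        return result
```

Whether a narrow exception type is available depends on what the retrieval layer actually raises; where it is not, a broad catch with `logger.exception` still preserves the traceback.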
    @classmethod
    def from_model(cls, context) -> "SubtaskContextBrief":
        """Create brief from SubtaskContext model."""
        type_data = context.type_data or {}
        return cls(
            id=context.id,
            context_type=context.context_type,
            name=context.name,
            status=context.status,
            file_extension=type_data.get("file_extension"),
            file_size=type_data.get("file_size"),
            mime_type=type_data.get("mime_type"),
            document_count=type_data.get("document_count"),
        )
🛠️ Refactor suggestion | 🟠 Major
Add type hint for context parameter in from_model.
The context parameter lacks a type hint, which reduces IDE support and type safety. Per coding guidelines, Python code MUST include type hints.
🔎 Proposed fix
+ from typing import TYPE_CHECKING
+
+ if TYPE_CHECKING:
+     from app.models.subtask_context import SubtaskContext
+
  @classmethod
- def from_model(cls, context) -> "SubtaskContextBrief":
+ def from_model(cls, context: "SubtaskContext") -> "SubtaskContextBrief":
      """Create brief from SubtaskContext model."""
Or import directly if circular imports are not a concern:
+ from app.models.subtask_context import SubtaskContext
+
  @classmethod
- def from_model(cls, context) -> "SubtaskContextBrief":
+ def from_model(cls, context: SubtaskContext) -> "SubtaskContextBrief":
      """Create brief from SubtaskContext model."""
🤖 Prompt for AI Agents
In backend/app/schemas/subtask_context.py around lines 96 to 109, the from_model
method parameter context is missing a type hint; add a proper type annotation
(e.g. SubtaskContext) to improve type safety and IDE support. If importing the
model causes a circular import, wrap the import in a TYPE_CHECKING block (from
typing import TYPE_CHECKING; if TYPE_CHECKING: from
app.models.subtask_context import SubtaskContext) and annotate the
parameter with a forward reference string "SubtaskContext"; otherwise import the
model directly. Update the method signature to use that type.
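The `TYPE_CHECKING` approach described above can be sketched in a self-contained form. This is an illustration under stated assumptions, not the PR's code: a plain dataclass stands in for the Pydantic schema, and a `SimpleNamespace` with made-up values stands in for the ORM row. The guarded import is never executed at runtime, which is why the sketch runs even where `app.models.subtask_context` is not importable.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Seen only by type checkers; skipped at runtime, so it cannot create
    # an import cycle between the schema and model modules.
    from app.models.subtask_context import SubtaskContext


@dataclass
class SubtaskContextBrief:
    """Simplified stand-in for the schema class discussed above."""

    id: int
    context_type: str
    name: str
    status: str
    file_extension: Optional[str] = None
    file_size: Optional[int] = None
    mime_type: Optional[str] = None
    document_count: Optional[int] = None

    @classmethod
    def from_model(cls, context: "SubtaskContext") -> "SubtaskContextBrief":
        """Create a brief from a SubtaskContext-like object."""
        type_data = context.type_data or {}
        return cls(
            id=context.id,
            context_type=context.context_type,
            name=context.name,
            status=context.status,
            file_extension=type_data.get("file_extension"),
            file_size=type_data.get("file_size"),
            mime_type=type_data.get("mime_type"),
            document_count=type_data.get("document_count"),
        )


# Hypothetical row; the field values are assumptions for illustration only.
row = SimpleNamespace(
    id=1,
    context_type="attachment",
    name="spec.pdf",
    status="ready",
    type_data={"file_extension": ".pdf", "file_size": 2048},
)
brief = SubtaskContextBrief.from_model(row)
```

Because the annotation is a string, nothing resolves `SubtaskContext` at runtime; only a static checker such as mypy or an IDE follows the guarded import.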
    @classmethod
    def from_context(
        cls,
        context,
        truncation_info: Optional[TruncationInfo] = None,
    ) -> "AttachmentResponse":
        """Create from SubtaskContext model."""
        type_data = context.type_data or {}
        return cls(
            id=context.id,
            filename=type_data.get("original_filename", context.name),
            file_size=type_data.get("file_size", 0),
            mime_type=type_data.get("mime_type", ""),
            status=(
                context.status
                if isinstance(context.status, str)
                else context.status.value
            ),
            file_extension=type_data.get("file_extension", ""),
            text_length=context.text_length,
            error_message=context.error_message,
            truncation_info=truncation_info,
            created_at=context.created_at,
        )
🛠️ Refactor suggestion | 🟠 Major
Add type hints to from_context methods as well.
Both AttachmentResponse.from_context and AttachmentDetailResponse.from_context have untyped context parameters. Apply the same type hint pattern for consistency.
Also applies to: 166-190
🤖 Prompt for AI Agents
In backend/app/schemas/subtask_context.py around lines 135-158 (and similarly
for 166-190), the from_context methods accept an untyped context parameter; add
a concrete type hint for the context parameter (e.g., SubtaskContext or the
actual ORM/Pydantic model type used elsewhere in the codebase) and import that
type at the top of the file, updating the signature to context: SubtaskContext
(and keep the existing return type), and make the identical change to
AttachmentDetailResponse.from_context in the 166-190 range so both methods use
the same typed context pattern.
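Once `context` is typed, the status normalization inside `from_context` becomes easier to reason about. The sketch below is an assumption-laden illustration, not the PR's code: the `ContextStatus` stand-in and its member names are invented here, and `status_value` is a hypothetical helper. It shows one caveat worth checking: if the real enum also subclasses `str`, an `isinstance(status, str)` test is true for enum members too, so testing for `Enum` first is the more robust order.

```python
from enum import Enum
from typing import Union


class ContextStatus(str, Enum):
    """Stand-in enum; the member names and the str mixin are assumptions."""

    READY = "ready"
    FAILED = "failed"


def status_value(status: Union[str, Enum]) -> str:
    """Return the plain string for either an enum member or a raw string."""
    # Check Enum first: a str-mixin enum member also passes isinstance(..., str).
    return status.value if isinstance(status, Enum) else status
```

A helper like this could be shared by `AttachmentResponse.from_context` and `AttachmentDetailResponse.from_context` so both normalize status the same way.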
export function ContextBadgeList({ contexts }: ContextBadgeListProps) {
  // DEBUG: Log contexts to help diagnose display issues
  console.log('[ContextBadgeList] Rendering with contexts:', contexts);

  if (!contexts || contexts.length === 0) {
    console.log('[ContextBadgeList] No contexts to display');
    return null;
  }

  return (
    <div className="flex flex-wrap gap-2 mb-3">
      {contexts.map(context => (
        <ContextBadgeItem key={`${context.context_type}-${context.id}`} context={context} />
      ))}
    </div>
  );
}
Avoid persistent debug logging in ContextBadgeList
The console.log calls on every render (including the “no contexts” branch) will be noisy in production and can impact performance when many messages are present. Recommend removing them or gating behind a debug flag.
🤖 Prompt for AI Agents
In frontend/src/features/tasks/components/message/ContextBadgeList.tsx around
lines 25 to 41, remove the persistent console.log debug statements (both the
render log and the "no contexts" log) or gate them behind a runtime debug flag
so they do not execute in production; implement the fix by either deleting the
two console.log lines or wrapping them in a conditional like checking a DEBUG
env/prop (e.g., process.env.NODE_ENV !== 'production' or a passed debug prop)
before logging so normal renders remain silent in production.
This change implements task-level context aggregation, enabling global knowledge base visibility across all subtasks in a task. This is particularly useful for group chat scenarios where new members can access task-level knowledge bases without individual permissions.
Key changes:
Task-level context storage:
Incremental sync mechanism:
Priority-based context resolution:
Enhanced KnowledgeBaseTool:
WebSocket integration:
Benefits:
Summary by CodeRabbit
New Features
Bug Fixes & Performance
Database