[Refactor] refactor chat: refactored the chat portion of the current algorithm #22

Open
chenhao0205 wants to merge 15 commits into LazyAGI:main from chenhao0205:ch/refractor

chenhao0205 commented Apr 7, 2026

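Change Overview

This PR is a full architectural refactor of LazyRAG's chat module, replacing the original flat layout with a clearer, layered, component-based design.

Main Changes

1. Directory restructuring

  • New app/ directory: contains the FastAPI application entry point, route definitions, and core service logic

    • api/: routing layer (chat_routes.py, health_routes.py)
    • core/: service core (chat_server.py, chat_service.py)
    • chat.py: reduced to the service startup entry point
  • New components/ directory: a component layer organized by functional domain

    • generate/: generation components (result aggregation, prompt formatting, output parsing)
    • process/: processing components (adaptive top-k selection, sensitive-word filtering, multi-turn query rewriting, context expansion)
    • tools/: tool components (SQL tool, tool registry)
    • tmp/: temporary/local model components
  • New pipelines/ directory: core pipeline definitions

    • builders/: pipeline builders (model loading, search pipeline, generation pipeline)
    • agentic.py / naive.py: implementations of the two RAG modes
  • New utils/ directory: a unified utility layer

    • schema.py: data model definitions (merges the former message.py)
    • config.py: configuration management
    • helpers.py: helper functions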

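2. Key improvements

  • Unified model-loading entry point get_automodel(), with support for configuration-driven loading and streaming wrappers
  • New context expansion component ContextExpansionComponent, which augments retrieval results with surrounding context
  • Removed a large number of redundant pydantic objects
  • Removed redundant files: chat.py (old), the modules/ directory, and the chat_pipelines/ directory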
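
The PR description does not show how ContextExpansionComponent works internally. As a rough illustration of what "augmenting retrieval results with surrounding context" can mean, here is a minimal sketch of one common approach, neighbor-window expansion. The names Chunk, expand_context, and store are hypothetical and are not taken from the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical retrieval unit: one slice of a document."""
    doc_id: str
    index: int  # position of the chunk within its document
    text: str


def expand_context(hits, store, window=1):
    """Return the hits plus up to `window` neighbors on each side.

    `store` maps (doc_id, index) -> Chunk. The result is de-duplicated
    and ordered by document, then by position within the document.
    """
    selected = {}
    for hit in hits:
        for idx in range(hit.index - window, hit.index + window + 1):
            neighbor = store.get((hit.doc_id, idx))
            if neighbor is not None:  # skip positions past either end
                selected[(neighbor.doc_id, neighbor.index)] = neighbor
    return [selected[key] for key in sorted(selected)]


# Toy corpus (an assumption for the demo): one document in seven chunks.
store = {("doc", i): Chunk("doc", i, f"chunk {i}") for i in range(7)}
expanded = expand_context([store[("doc", 0)], store[("doc", 4)]], store)
print([c.index for c in expanded])
```

With hits at positions 0 and 4 and a window of 1, the expanded set covers positions 0, 1, 3, 4, and 5. The actual component may use a different strategy, such as parent-document lookup.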

Directory Structure: Before and After

Before refactoring:

chat/
├── chat.py
├── chat_pipelines/
│   ├── agentic.py
│   ├── naive.py
│   └── configs/
│       └── auto_model.yaml
├── data/
│   └── sensitive_words.txt
├── modules/
│   ├── algo/
│   │   ├── adaptive_topk.py
│   │   ├── multiturn_query_rewriter.py
│   │   └── prompt_formatter.py
│   └── engineering/
│       ├── aggregate.py
│       ├── load_model.py
│       ├── output_parser.py
│       ├── sensitive_filter.py
│       ├── simple_llm.py
│       ├── tool_registry.py
│       └── workflow_utils.py
├── prompts/
│   └── agentic.py
├── tools/
│   └── sql.py
└── utils/
    ├── message.py
    ├── stream_scanner.py
    └── url.py

After refactoring:

chat/
├── app/                          # new: service entry layer
│   ├── __init__.py
│   ├── chat.py                   # reduced to the startup entry point
│   ├── api/                      # routing layer
│   │   ├── __init__.py
│   │   ├── chat_routes.py
│   │   └── health_routes.py
│   └── core/                     # service core
│       ├── chat_server.py
│       └── chat_service.py
├── components/                   # new: component layer
│   ├── __init__.py
│   ├── generate/                 # generation components
│   │   ├── __init__.py
│   │   ├── aggregate.py      ← modules/engineering/
│   │   ├── prompt_formatter.py   ← modules/algo/
│   │   └── output_parser.py  ← modules/engineering/
│   ├── process/                  # processing components
│   │   ├── __init__.py
│   │   ├── sensitive_filter.py   ← modules/engineering/
│   │   ├── multiturn_query_rewriter.py  ← modules/algo/
│   │   ├── adaptive_topk.py  ← modules/algo/
│   │   └── context_expansion.py  # new
│   └── tmp/                      # local models
│       ├── __init__.py
│       └── local_models.py   ← modules/engineering/load_model.py
├── pipelines/                    # reorganized: core pipelines
│   ├── __init__.py
│   ├── agentic.py            ← chat_pipelines/
│   ├── naive.py              ← chat_pipelines/
│   └── builders/                 # new: builders
│       ├── get_models.py
│       ├── get_retriever.py
│       ├── get_ppl_search.py
│       └── get_ppl_generate.py
├── prompts/
│   ├── agentic.py
│   ├── rag_answer.py         # new
│   └── rewrite.py            # new
├── utils/
│   ├── __init__.py
│   ├── schema.py             # merges message.py
│   ├── config.py             # new
│   ├── helpers.py            # new
│   ├── stream_scanner.py
│   └── url.py
├── data/
│   └── sensitive_words.txt
└── auto_model.yaml           ← chat_pipelines/configs/

gemini-code-assist bot left a comment


Code Review

This pull request modularizes the chat application by restructuring API routes, core services, and RAG pipelines, and introduces a centralized configuration system. The review identified several critical issues, including blocking I/O in an asynchronous health check, a syntax error in module exports due to a missing comma, and incorrect import paths for configuration files. Furthermore, a path traversal vulnerability was found in the file validation logic, and improvements were suggested to prevent temporary file leaks and replace inefficient polling in streaming responses with more robust asynchronous patterns.
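
The review summary does not quote the file validation code it flags. For reference, a common way to guard against path traversal is to resolve the user-supplied path against a trusted base directory and reject anything that lands outside it. The sketch below assumes that approach; resolve_inside is a hypothetical helper, not a function from this PR.

```python
import tempfile
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Resolve user_path under base_dir and reject anything that escapes it.

    Hypothetical helper. Raises ValueError when the resolved path is not
    base_dir itself or a descendant of it.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards `base`, so absolute
    # inputs are caught by the same containment check below.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    print(resolve_inside(tmp, "uploads/report.pdf").name)
    try:
        resolve_inside(tmp, "../secret.txt")
    except ValueError as exc:
        print("rejected:", exc)
```

Comparing resolved paths, rather than scanning the raw string for "..", also covers symlinks and absolute paths supplied by the caller.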

chenhao0205 and others added 4 commits April 7, 2026 17:48
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>