
Trae/solo agent 3s xmqh #460

Closed

chenzhenguo wants to merge 11 commits into cita-777:main from chenzhenguo:trae/solo-agent-3sXmqh

Conversation


@chenzhenguo chenzhenguo commented Apr 9, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Backup import now runs asynchronously as a background task, returning immediately with a job ID and status tracking.
    • Enhanced timeout configuration for long-running database operations (300+ seconds) across all layers.
  • Documentation

    • Added timeout configuration best practices guide covering all system layers.
    • Added comprehensive architecture and operations guidance documentation.
  • Tests

    • Added test coverage for long-running database operations and backup scenarios.

MonkeyCode-AI and others added 11 commits April 9, 2026 03:42
- Add requestTimeout: 300_000 (5min) to handle large backup imports
- Add keepAliveTimeout: 65_000 for connection management
- Fixes 30s timeout issue on /api/settings/backup/import endpoint
Co-authored-by: monkeycode-ai <[email protected]>
- Add a batchQueryAll paginated-query helper that loads at most 10 records per page
- Add a batchInsertHelper batched-insert helper that inserts at most 10 records per batch
- Refactor collectCurrentRuntimeStateSnapshot() to load proxyLogs and checkinLogs via pagination
- Refactor importAccountsSection() to run all table inserts as batches
- Fixes 502 errors caused by memory spikes when importing large data volumes
Co-authored-by: monkeycode-ai <[email protected]>
feat: optimize timeout handling and improve backup service performance
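
For context, a minimal sketch of how the timeout options named in these commit messages map onto Fastify's factory options. The option names and values come from the commits; the surrounding structure is illustrative, not the project's actual src/server/config.ts.

```ts
import Fastify from 'fastify';

const app = Fastify({
  // Allow long-running requests such as large backup imports (5 minutes).
  requestTimeout: 300_000,
  // Keep-alive slightly above typical 60s proxy idle timeouts, so an upstream
  // proxy does not reuse a connection the server has already closed.
  keepAliveTimeout: 65_000,
});
```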
The github-actions bot added the area: db (Database and schema related changes), area: docs (Docs and README changes), area: web (Web UI changes), and size: XXL (2000 or more lines changed) labels on Apr 9, 2026.

coderabbitai Bot commented Apr 9, 2026

📝 Walkthrough

This PR introduces comprehensive database and timeout optimization spanning configuration, batch operation enhancements, and asynchronous backup handling. Changes include server/database connection timeout configuration (300s targets), batch query/insert utilities for backup operations, migration of backup import from synchronous to asynchronous background task handling, long-running operation test infrastructure, and supporting specification/planning documents with timeout best practices.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Timeout Optimization Specifications**<br>`.trae/specs/timeout-optimization/spec.md`, `.trae/specs/timeout-optimization/tasks.md`, `.trae/specs/timeout-optimization/checklist.md`, `.trae/specs/database-timeout-optimization/spec.md`, `.trae/specs/database-timeout-optimization/tasks.md`, `.trae/specs/database-timeout-optimization/checklist.md` | Added PRDs and implementation plans defining timeout targets (300+ seconds), requirements across server/Docker/database/gateway layers, and verification checklists for timeout consistency and large-operation validation. |
| **Batch Operation Documentation**<br>`.monkeycode/specs/db-batch-optimization/tasklist.md` | Checklist documenting pagination and batch-insertion optimizations, including the batchQueryAll and batchInsertHelper utilities and their application across snapshot collection and account import flows. |
| **Code & Architecture Wiki**<br>`CODE_WIKI.md` | Comprehensive project documentation covering architecture, directory structure, technology stack, configuration, data models, scheduling, routing, and development/deployment guidance. |
| **Database Configuration**<br>`src/server/db/index.ts` | Configured the MySQL connection pool (waitForConnections: true, connectionLimit: 10, connectTimeout: 300000) and the PostgreSQL pool (max: 10, idleTimeoutMillis: 300000, connectionTimeoutMillis: 300000); see the sketch after this table. |
| **Fastify Server Configuration**<br>`src/server/config.ts` | Added Fastify server timeouts: requestTimeout: 300_000 and keepAliveTimeout: 65_000. |
| **Backup Service Refactoring**<br>`src/server/services/backupService.ts` | Major refactoring introducing the batchQueryAll and batchInsertHelper utilities, switching log reads to paginated batches, converting import from delete-all-insert-all to upsert-like conditional updates, adding new stats counters (newSites, updatedSites, etc.), and accumulating non-fatal errors instead of throwing immediately. Return types for importAccountsSection and importPreferencesSection changed to include summary stats. |
| **Backup API Route**<br>`src/server/routes/api/settings.ts` | Transitioned backup import from a synchronous await importBackup() to an asynchronous startBackgroundTask() with background execution, returning a 202 status with a job ID and polling support instead of an immediate 200 response. |
| **Web API Client**<br>`src/web/api.ts` | Extended fetch timeouts for importOAuthConnections, importBackup, and importBackupFromWebdav to 300_000 ms. |
| **Long-Running Operation Tests**<br>`src/server/db/longRunningOperations.test.ts`, `src/server/services/backupService.longRunning.test.ts` | Added Vitest test suites for database operations exceeding 60 seconds (batched inserts, sequential operations, connection stability) and for large backup import/export scenarios with data-generation helpers and row-count validation. |
| **Long-Running Test Scripts**<br>`test-long-running.js`, `test-long-running-backup.js` | Standalone Node scripts exercising database operations (50 batches × 1000 rows with delays) and backup import workflows with background-task polling and validation. |
| **Documentation Updates**<br>`docs/configuration.md`, `docs/operations.md`, `LONG_RUNNING_OPERATIONS_TEST_PLAN.md` | Added timeout-configuration best practices across layers (database, gateway, server, Docker), updated log-handling guidance with timeout-event monitoring, and created a comprehensive test plan for long-running database operations. |
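
For reference, a minimal sketch of the pool settings summarized in the Database Configuration row above, using the mysql2 and pg packages. The values mirror the PR summary; the connection details (env var names) and module layout are placeholders, not the project's actual src/server/db/index.ts.

```ts
import mysql from 'mysql2/promise';
import { Pool } from 'pg';

// MySQL pool: queue callers instead of erroring when all connections are busy,
// and allow up to 300s to establish a connection (matching the PR's target).
const mysqlPool = mysql.createPool({
  host: process.env.MYSQL_HOST, // placeholder connection details
  user: process.env.MYSQL_USER,
  database: process.env.MYSQL_DB,
  waitForConnections: true,
  connectionLimit: 10,
  connectTimeout: 300_000,
});

// PostgreSQL pool with the same 300s targets for idle and connect timeouts.
const pgPool = new Pool({
  connectionString: process.env.POSTGRES_URL, // placeholder
  max: 10,
  idleTimeoutMillis: 300_000,
  connectionTimeoutMillis: 300_000,
});
```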

Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant WebUI
    participant APIRoute
    participant BackgroundTask
    participant BackupService
    participant Database

    rect rgba(100, 200, 150, 0.5)
    note over User,Database: Previous Synchronous Flow
    User->>WebUI: Import Backup
    WebUI->>APIRoute: POST /api/settings/backup/import
    APIRoute->>BackupService: await importBackup(data)
    BackupService->>Database: Read existing records
    BackupService->>Database: Delete all, Insert new (large batches)
    Database-->>BackupService: Complete (may timeout on large ops)
    BackupService-->>APIRoute: Success/Error
    APIRoute-->>WebUI: 200 with result
    WebUI-->>User: Display result
    end

    rect rgba(150, 150, 200, 0.5)
    note over User,Database: New Asynchronous Background Flow
    User->>WebUI: Import Backup
    WebUI->>APIRoute: POST /api/settings/backup/import (300s timeout)
    APIRoute->>BackgroundTask: startBackgroundTask(maintenance job)
    BackgroundTask-->>APIRoute: jobId + reused status
    APIRoute-->>WebUI: 202 Accepted {jobId, queued: true}
    WebUI-->>User: Show queued status
    
    par Background Processing
        BackgroundTask->>BackupService: await importBackup(data)
        BackupService->>Database: Read existing via batchQueryAll()
        BackupService->>Database: Upsert records with conditional logic
        BackupService->>Database: Batch insert via batchInsertHelper()
        Database-->>BackupService: Complete (batched, with timeouts configured)
        BackupService-->>BackgroundTask: Result + Stats (new/updated counts)
    and Client Polling
        WebUI->>APIRoute: Poll getBackgroundTask(jobId)
        APIRoute-->>WebUI: {status: 'processing'}
        loop Until Complete
            WebUI->>APIRoute: Poll again
            APIRoute-->>WebUI: {status: 'succeeded', result}
        end
        WebUI-->>User: Display final result
    end
    end
```
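
To make the new flow concrete, here is a small client-side sketch of the 202-plus-polling pattern shown in the diagram. The endpoint paths, response shapes, and polling cadence are assumptions for illustration only.

```ts
async function importBackupAsync(payload: unknown): Promise<unknown> {
  const res = await fetch('/api/settings/backup/import', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  if (res.status !== 202) throw new Error(`unexpected status ${res.status}`);
  const { jobId } = (await res.json()) as { jobId: string };

  for (let attempt = 0; attempt < 300; attempt++) { // ~10 minutes at 2s intervals
    // Hypothetical polling route; the real task-status endpoint may differ.
    const task = (await (await fetch(`/api/settings/background-tasks/${jobId}`)).json()) as {
      status: string;
      result?: unknown;
    };
    if (task.status === 'succeeded') return task.result;
    if (task.status === 'failed') throw new Error('backup import failed');
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
  throw new Error('timed out waiting for the import job');
}
```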

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~65 minutes


Suggested Labels

area: db, area: server, area: scripts, area: docs, size: XXL

Poem

🐰 Batches bloom where timeouts grow,
Long ops run, no more timeout woe,
Background tasks now dance and queue,
Database pools with limits true,
Hoppy optimization through and through!

🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)

❌ Failed checks

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 21.05%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The pull request title 'Trae/solo agent 3s xmqh' is vague and non-descriptive. It appears to be a branch name rather than a meaningful summary of the changes, providing no context about the substantial additions of documentation, timeout configuration, and long-running operation handling across multiple modules. | Replace the branch name with a descriptive title that captures the main focus, such as 'Add timeout optimization specs, configuration, and long-running operation tests' or 'Implement database timeout and backup operation optimization'. |

✅ Passed checks

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/server/services/backupService.ts (1)

1684-1697: ⚠️ Potential issue | 🟡 Minor

Batch insert of siteApiEndpoints may fail without prior cleanup.

The code batch-inserts all siteApiEndpoints from the backup without first deleting existing records. If endpoints with the same IDs already exist, this will cause unique constraint violations.

Unlike sites/accounts/tokens which have upsert logic, this table uses direct insert. Consider adding a delete before the insert, similar to how routeChannels is handled on line 1844.

Suggested fix
```diff
     // Handle site API endpoints
     if (section.siteApiEndpoints) {
+      await tx.delete(schema.siteApiEndpoints).run();
       await batchInsertHelper(tx, schema.siteApiEndpoints, section.siteApiEndpoints.map((row) => ({
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/services/backupService.ts` around lines 1684 - 1697, Batch
inserting section.siteApiEndpoints with batchInsertHelper into
schema.siteApiEndpoints can violate unique constraints if records already exist;
before the insert, delete existing rows for those endpoints (e.g., using
tx.delete(schema.siteApiEndpoints).where(... id IN
section.siteApiEndpoints.map(r => r.id) ...) or by siteId as appropriate)
similar to how routeChannels is cleaned up, then perform the batch insert with
the mapped fields; ensure you reference tx, schema.siteApiEndpoints,
batchInsertHelper, and section.siteApiEndpoints when making the changes.
🧹 Nitpick comments (11)
CODE_WIKI.md (1)

16-30: Consider adding language specifiers to fenced code blocks.

Several code blocks use plain fenced syntax without language identifiers (e.g., for directory trees and ASCII diagrams). While these are intentionally plain text, adding text or plaintext as the language specifier would satisfy markdown linters and improve consistency.

Also applies to: 71-98, 269-275, 334-350, 354-365, 446-497

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CODE_WIKI.md` around lines 16 - 30, The fenced code blocks in CODE_WIKI.md
(for example the directory tree starting with "metapi/") lack language
specifiers which triggers markdown linters; update each plain fenced block
(including the directory tree and the other ASCII/diagram blocks referenced) to
use a neutral specifier like "text" or "plaintext" (e.g., replace ``` with
```text) so linters accept them while preserving formatting; search for the same
plain fences in the file (the directory-tree block and the other ASCII diagram
blocks) and apply the same change consistently.
src/server/db/longRunningOperations.test.ts (3)

3-3: Unused import.

The schema import is not used in this file.

🧹 Remove unused import
```diff
 import { db, closeDbConnections, runtimeDbDialect } from './index.js';
-import * as schema from './schema.js';
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/db/longRunningOperations.test.ts` at line 3, Remove the unused
import of schema (import * as schema from './schema.js') from the top of the
test file; locate the import statement in longRunningOperations.test.ts and
delete it so the file no longer imports schema, then run the linter/tests to
ensure nothing else depends on schema and commit the change.

10-93: Consider extracting dialect-specific DDL to reduce duplication.

The simulateLongRunningOperation function has three nearly identical branches differing only in DDL syntax. Consider extracting the DDL differences into a lookup or helper.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/db/longRunningOperations.test.ts` around lines 10 - 93, The test
duplicates nearly identical logic across runtimeDbDialect branches in
simulateLongRunningOperation; extract dialect-specific DDL (CREATE TABLE
statement and optional ENGINE/serial syntax) into a small lookup or helper
(e.g., a map keyed by runtimeDbDialect that returns createTableSql and
dropTableSql) and then reuse a single code path that calls
db.execute(createTableSql), runs the common batched INSERT loop (using
test_long_running and db.execute for inserts), and finally
db.execute(dropTableSql); reference the symbols runtimeDbDialect,
simulateLongRunningOperation, db.execute, and the test_long_running table when
implementing the lookup and single shared flow.
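
One possible shape for the suggested lookup, as a sketch; the per-dialect column definitions here are assumptions, and db/runtimeDbDialect are the same symbols the test already imports.

```ts
import { db, runtimeDbDialect } from './index.js';

// Per-dialect DDL collected in one lookup instead of three branches.
const createTableByDialect = {
  sqlite: 'CREATE TABLE test_long_running (id INTEGER PRIMARY KEY AUTOINCREMENT, data TEXT)',
  mysql: 'CREATE TABLE test_long_running (id INT AUTO_INCREMENT PRIMARY KEY, data TEXT)',
  postgres: 'CREATE TABLE test_long_running (id SERIAL PRIMARY KEY, data TEXT)',
} as const;

async function simulateLongRunningOperation(): Promise<void> {
  await db.execute(createTableByDialect[runtimeDbDialect as keyof typeof createTableByDialect]);
  // ...single shared batched-INSERT loop, identical for every dialect...
  await db.execute('DROP TABLE IF EXISTS test_long_running'); // same statement everywhere
}
```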

108-117: Redundant dialect-specific cleanup.

The afterEach block executes identical DROP TABLE IF EXISTS statements for all three dialects. This can be simplified.

♻️ Simplify cleanup
```diff
   afterEach(async () => {
     // Clean up test data
-    if (runtimeDbDialect === 'sqlite') {
-      await db.execute('DROP TABLE IF EXISTS test_long_running');
-    } else if (runtimeDbDialect === 'mysql') {
-      await db.execute('DROP TABLE IF EXISTS test_long_running');
-    } else if (runtimeDbDialect === 'postgres') {
-      await db.execute('DROP TABLE IF EXISTS test_long_running');
-    }
+    await db.execute('DROP TABLE IF EXISTS test_long_running');
   });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/db/longRunningOperations.test.ts` around lines 108 - 117, The
afterEach block duplicates the same DROP TABLE IF EXISTS statement for each
dialect; simplify by removing the runtimeDbDialect branches and call
db.execute('DROP TABLE IF EXISTS test_long_running') once inside afterEach
(keeping the async/await and error handling pattern) so the cleanup runs
regardless of dialect; update the block around the afterEach, referencing the
existing afterEach, runtimeDbDialect, and db.execute symbols.
src/server/routes/api/settings.ts (1)

1930-1946: Consider migrating WebDAV import to background task for consistency.

The regular backup import endpoint now uses startBackgroundTask for timeout-safe execution, but the WebDAV import endpoint (/api/settings/backup/webdav/import) still performs synchronous import. For large WebDAV backups, this could still hit timeout issues.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/routes/api/settings.ts` around lines 1930 - 1946, The
/api/settings/backup/webdav/import handler runs importBackupFromWebdav
synchronously and can time out; wrap the import logic in startBackgroundTask
instead. Replace the direct call to importBackupFromWebdav() with
startBackgroundTask(() => { const result = await importBackupFromWebdav(); for
(const item of result.appliedSettings) applyImportedSettingToRuntime(item.key,
item.value); if (result.appliedSettings.some(i => i.key ===
'backup_webdav_config_v1')) await reloadBackupWebdavScheduler(); return result;
}) and return the background task response (or task id) immediately so the
long-running work executes safely in the background. Ensure error handling is
moved into the background task and the route returns a 202/started-style
response consistent with other backup import endpoints.
test-long-running-backup.js (2)

173-193: Repeated dynamic imports inside polling loop are inefficient.

The backgroundTaskService.js module is re-imported on every polling iteration. Import once outside the loop to avoid repeated module resolution overhead.

Suggested fix
```diff
+    const { getBackgroundTask } = await import('./src/server/services/backgroundTaskService.ts');
+
     // Wait for the task to complete
     let attempts = 0;
     const maxAttempts = 300; // wait up to 10 minutes (300 * 2s)
     while (attempts < maxAttempts) {
-      const { getBackgroundTask } = await import('./src/server/services/backgroundTaskService.ts');
       const currentTask = getBackgroundTask(task.id);
       
       if (!currentTask) {
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test-long-running-backup.js` around lines 173 - 193, The polling loop
repeatedly does a dynamic import of
'./src/server/services/backgroundTaskService.ts', which is inefficient; move the
import out of the while loop by importing (or requiring) backgroundTaskService
once before the loop, then call its getBackgroundTask(task.id) inside the loop
(reference: getBackgroundTask and task.id), leaving the rest of the
attempts/await logic unchanged.

195-202: Redundant dynamic import after the loop.

The getBackgroundTask function is imported again after the polling loop, but it should already be available if imported once before the loop as suggested above.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test-long-running-backup.js` around lines 195 - 202, The redundant dynamic
import of getBackgroundTask after the polling loop should be removed; instead
import getBackgroundTask once before the loop (ensure the earlier import
statement for getBackgroundTask is present and used) and then call
getBackgroundTask(task.id) to validate the finalTask; delete the second import
line and rely on the initially imported getBackgroundTask symbol so there are no
duplicate imports.
src/server/services/backupService.longRunning.test.ts (1)

170-180: Infinite loop without internal safeguard.

The while (true) loop relies solely on the Vitest timeout (600000ms). Adding an internal iteration limit (similar to the standalone test script's maxAttempts) would make the test more robust and provide clearer failure messages if the task never completes.

Also, the dynamic import inside the loop is inefficient - consider importing once before the loop.

Suggested fix
```diff
+    const { getBackgroundTask } = await import('./backgroundTaskService.js');
+    let attempts = 0;
+    const maxAttempts = 300; // 10 minutes at 2s intervals
     // Wait for the task to complete
-    while (true) {
-      const currentTask = await import('./backgroundTaskService.js').then(m => m.getBackgroundTask(task.id));
+    while (attempts < maxAttempts) {
+      const currentTask = getBackgroundTask(task.id);
       if (!currentTask) {
         throw new Error('Task not found');
       }
       if (currentTask.status === 'succeeded' || currentTask.status === 'failed') {
         break;
       }
       // check every 2 seconds
       await new Promise(resolve => setTimeout(resolve, 2000));
+      attempts++;
     }
+    if (attempts >= maxAttempts) {
+      throw new Error('Task polling timeout');
+    }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/services/backupService.longRunning.test.ts` around lines 170 -
180, The infinite while loop polling getBackgroundTask(task.id) should be made
robust by importing the backgroundTaskService once before the loop (avoid
dynamic import inside each iteration) and adding an internal maxAttempts counter
(e.g., maxAttempts = 300) with an attempts++ each iteration; if attempts exceeds
maxAttempts throw a clear Error like "Task did not reach terminal state after X
attempts" so the test fails fast and provides a helpful message; keep the
existing 2s sleep and the status checks on currentTask.status === 'succeeded' ||
'failed'.
test-long-running.js (2)

105-107: Unused variable result.

The result of db.execute('SELECT 1') is captured but never used. Since the goal is just to verify the connection is alive, this is fine, but consider removing the variable declaration.

Suggested fix
```diff
   try {
-    const result = await db.execute('SELECT 1');
+    await db.execute('SELECT 1');
     console.log('✅ SUCCESS: Database connection is still stable');
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test-long-running.js` around lines 105 - 107, The captured but unused
variable "result" from the call to db.execute('SELECT 1') should be removed or
used; update the try block to either await db.execute('SELECT 1') without
assigning to "result" or use the value (e.g., check rows) so the variable is
meaningful—edit the line that declares "const result = await db.execute('SELECT
1')" to one of these options referencing the db.execute call.

38-47: Unsafe SQL string building pattern.

The batch insert uses string interpolation to build SQL values. While data is internally generated here and doesn't pose an immediate SQL injection risk, this pattern is fragile and could become problematic if modified later.

Consider using parameterized queries or the ORM's built-in batch insert capabilities for consistency with production code patterns.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test-long-running.js` around lines 38 - 47, The loop building SQL via string
interpolation in the test (loop using values array and calling db.execute) is
unsafe; replace it with a parameterized batch insert: for each batch in the
for-loop (the block that builds values and calls db.execute), build a parameter
placeholders string (e.g. (?,?...) per row) and a single params array containing
each generated data value, then call db.execute with the SQL and params instead
of embedding values.join(','), or use the ORM/batch-insert helper if available;
update the code paths that construct values and call db.execute to use
placeholders and a params array consistently.
src/server/services/backupService.ts (1)

1638-1641: O(n²) complexity when matching existing sites.

For each site in section.sites, this performs a linear search through existingSites using find(). With large datasets (the test uses 100 sites), this becomes O(n²).

Consider building a lookup map from existingSites beforehand for O(1) lookups.

Suggested optimization
```diff
+  // Build lookup map for O(1) access
+  const existingSiteByKey = new Map(existingSites.map(s => [buildSiteIdentityKey(s), s]));
+
   await db.transaction(async (tx) => {
     // Process site data
     for (const site of section.sites) {
       const siteKey = buildSiteIdentityKey(site);
-      const existingSite = existingSites.find(s => buildSiteIdentityKey(s) === siteKey);
+      const existingSite = existingSiteByKey.get(siteKey);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/services/backupService.ts` around lines 1638 - 1641, The loop over
section.sites currently does a linear find on existingSites for each site
causing O(n²); before the loop, create a lookup (e.g., Map<string, SiteType>) by
iterating existingSites and mapping buildSiteIdentityKey(s) -> s, then replace
the existingSites.find(...) in the for (const site of section.sites) loop with a
constant-time lookup like existingSiteMap.get(siteKey); ensure you keep types
intact and update any variable names (existingSite) usages to use the mapped
result.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e2b5b425-c424-4743-bbbd-9cd404fc3bfe

📥 Commits

Reviewing files that changed from the base of the PR and between 5e27684 and b9a081a.

📒 Files selected for processing (20)
  • .monkeycode/specs/db-batch-optimization/tasklist.md
  • .trae/specs/database-timeout-optimization/checklist.md
  • .trae/specs/database-timeout-optimization/spec.md
  • .trae/specs/database-timeout-optimization/tasks.md
  • .trae/specs/timeout-optimization/checklist.md
  • .trae/specs/timeout-optimization/spec.md
  • .trae/specs/timeout-optimization/tasks.md
  • CODE_WIKI.md
  • LONG_RUNNING_OPERATIONS_TEST_PLAN.md
  • docs/configuration.md
  • docs/operations.md
  • src/server/config.ts
  • src/server/db/index.ts
  • src/server/db/longRunningOperations.test.ts
  • src/server/routes/api/settings.ts
  • src/server/services/backupService.longRunning.test.ts
  • src/server/services/backupService.ts
  • src/web/api.ts
  • test-long-running-backup.js
  • test-long-running.js

Comment thread docs/configuration.md
Comment on lines +384 to +403
```yaml
# docker-compose.yml example
services:
  metapi:
    image: metapi/metapi:latest
    restart: unless-stopped
    environment:
      - PROXY_FIRST_BYTE_TIMEOUT_SEC=60
      - TOKEN_ROUTER_FAILURE_COOLDOWN_MAX_SEC=86400
      - MODEL_AVAILABILITY_PROBE_TIMEOUT_MS=15000
    # network timeout settings
    network_mode: bridge
    # health check timeouts
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:4000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Check Docker image references across documentation
rg -n "metapi.*:latest" --type md
```

Repository: cita-777/metapi

Length of output: 595


Fix Docker image name in Docker Compose example.

Line 388 uses metapi/metapi:latest, but this is inconsistent with all other documentation files (getting-started.md, deployment.md, README.md, CODE_WIKI.md, etc.) which reference 1467078763/metapi:latest. Update the image name to 1467078763/metapi:latest to match the canonical reference.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/configuration.md` around lines 384 - 403, In the docker-compose.yml
example for the metapi service, replace the incorrect image value
"metapi/metapi:latest" with the canonical image "1467078763/metapi:latest" so
the metapi service's image line under the metapi service matches the other docs
(update the image string referenced in the metapi service block).

Comment on lines +21 to +29
```ts
for (let i = 0; i < 50; i++) {
  const values = [];
  for (let j = 0; j < 1000; j++) {
    const data = `test_data_${i}_${j}_${Math.random()}`;
    values.push(`('${data}')`);
  }
  await db.execute(`
    INSERT INTO test_long_running (data) VALUES ${values.join(', ')}
  `);
```

⚠️ Potential issue | 🟡 Minor

Avoid string interpolation in SQL queries.

The test constructs SQL by string interpolation. While Math.random() produces safe numeric output, this pattern is risky and sets a bad precedent. If future modifications introduce user-controlled data, it becomes an injection vector.

Consider using parameterized queries or explicitly sanitizing values.

🛡️ Suggested safer pattern
```diff
     for (let j = 0; j < 1000; j++) {
-      const data = `test_data_${i}_${j}_${Math.random()}`;
-      values.push(`('${data}')`);
+      const data = `test_data_${i}_${j}_${Math.random()}`.replace(/'/g, "''");
+      values.push(`('${data}')`);
     }
```

Or better yet, use a batch insert helper that handles parameterization.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/db/longRunningOperations.test.ts` around lines 21 - 29, The test
constructs SQL via string interpolation in the loop (see db.execute and the
INSERT INTO test_long_running block) which risks SQL injection; change this to
use parameterized queries or a batch-insert helper: build an array of parameter
placeholders and a parallel array of values for each batch (or call a provided
batchInsert helper) and call db.execute or the DB client's parameterized method
with the SQL containing placeholders and the values array instead of embedding
values via string concatenation.

```diff
@@ -0,0 +1,265 @@
+import { mkdtempSync, rmSync, join } from 'node:fs';
```

⚠️ Potential issue | 🔴 Critical

Incorrect import: join is not exported from node:fs.

The join function should be imported from node:path, not node:fs. This will cause a runtime error when the test runs.

Proposed fix
```diff
-import { mkdtempSync, rmSync, join } from 'node:fs';
+import { mkdtempSync, rmSync } from 'node:fs';
+import { join } from 'node:path';
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/services/backupService.longRunning.test.ts` at line 1, The import
line incorrectly pulls join from node:fs; update the import so mkdtempSync and
rmSync remain imported from 'node:fs' while join is imported from 'node:path'
(i.e., keep mkdtempSync and rmSync in the existing import from node:fs and
add/change to import join from node:path) so functions like mkdtempSync, rmSync,
and join used in the backupService.longRunning.test.ts file resolve at runtime.

Comment on lines +15 to +41
```ts
async function batchQueryAll(
  table: any,
  dbInstance: typeof db,
  batchSize: number = DEFAULT_BATCH_SIZE,
): Promise<any[]> {
  const results: any[] = [];
  let lastId: number | undefined = undefined;
  let hasMore = true;

  while (hasMore) {
    const query = lastId === undefined
      ? dbInstance.select().from(table)
      : dbInstance.select().from(table).where(gt(table.id, lastId));

    const batch = await query.limit(batchSize).all();

    if (batch.length === 0) {
      hasMore = false;
    } else {
      results.push(...batch);
      lastId = batch[batch.length - 1].id;
      hasMore = batch.length === batchSize;
    }
  }

  return results;
}
```

⚠️ Potential issue | 🟡 Minor

Missing ORDER BY in batch query may cause non-deterministic pagination.

The batchQueryAll helper uses where(gt(table.id, lastId)) to paginate, but without an explicit ORDER BY id ASC, the database may return rows in an undefined order. This could cause rows to be missed or duplicated across batches.

Proposed fix
```diff
   while (hasMore) {
     const query = lastId === undefined
-      ? dbInstance.select().from(table)
-      : dbInstance.select().from(table).where(gt(table.id, lastId));
+      ? dbInstance.select().from(table).orderBy(asc(table.id))
+      : dbInstance.select().from(table).where(gt(table.id, lastId)).orderBy(asc(table.id));

     const batch = await query.limit(batchSize).all();
```

Note: You'll need to import asc at the top (it's already imported on line 1).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/services/backupService.ts` around lines 15 - 41, The batchQueryAll
function paginates by id but doesn't specify an ORDER BY, causing
non-deterministic pagination; update the query construction inside batchQueryAll
(both the initial query and the subsequent where(query) branch that uses
gt(table.id, lastId)) to include an explicit ordering by asc(table.id) so each
batch is fetched in id ascending order (import asc is already present), ensuring
deterministic, non-overlapping batches and correct lastId advancement.

```diff
           channelId,
           accountId,
-          downstreamApiKeyId,
+          downstreamApiKeyId: null, // simplified handling; set to null for now
```

⚠️ Potential issue | 🟠 Major

Setting downstreamApiKeyId to null loses proxy log linkage.

The comment indicates this is "simplified handling," but setting all downstreamApiKeyId values to null during import will break the association between proxy logs and their downstream API keys. This could impact analytics, debugging, and cost tracking after a backup restore.

Consider restoring the original mapping logic that was presumably present before this change, or document this as a known limitation in the import summary.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/services/backupService.ts` at line 2065, The backup restore
currently sets downstreamApiKeyId: null which severs proxy log → API key
linkage; instead, in the restore/import routine inside backupService.ts that
assigns downstreamApiKeyId (search for the code block where downstreamApiKeyId
is being set during import), look up the original downstream key and map it to
the restored/new key ID (e.g. use the existing apiKey mapping object created
during API key restores—apiKeyIdMap / downstreamKeyIdMap—or add a lookup
function getRestoredApiKeyId(originalId)) and assign that mapped ID to
downstreamApiKeyId; if no mapping exists, keep the fallback of null but emit a
warning log that the linkage couldn't be restored so the limitation is recorded.
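
A sketch of the id-mapping approach the comment describes; apiKeyIdMap and getRestoredApiKeyId are names taken from the comment, and the shape of the map is an assumption.

```ts
// Filled while API keys are restored: original backup id -> newly inserted id.
const apiKeyIdMap = new Map<number, number>();

function getRestoredApiKeyId(originalId: number | null): number | null {
  if (originalId === null) return null;
  const mapped = apiKeyIdMap.get(originalId);
  if (mapped === undefined) {
    // Fall back to null, but record that the linkage could not be restored.
    console.warn(`[backup] no restored API key for original id ${originalId}; proxy log linkage dropped`);
    return null;
  }
  return mapped;
}

// During proxy-log import:
// downstreamApiKeyId: getRestoredApiKeyId(row.downstreamApiKeyId),
```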

Comment on lines +2108 to +2113
```ts
  // Compute summary stats
  stats.newSites = section.sites.length;
  stats.newAccounts = section.accounts.length;
  stats.newTokens = section.accountTokens.length;

  return stats;
```

⚠️ Potential issue | 🟠 Major

Stats are overwritten at the end, discarding incremental counts.

The stats object is populated incrementally during the transaction (lines 1659, 1679, 1730, 1757, 1776, 1792), but then overwritten here with section.*.length values. This discards the distinction between new vs. updated entities that was tracked during the loop.

Proposed fix

Remove the overwriting lines to preserve the accurate incremental stats:

```diff
   console.log('[backup] importAccountsSection completed in', Date.now() - startTime, 'ms');

-  // Compute summary stats
-  stats.newSites = section.sites.length;
-  stats.newAccounts = section.accounts.length;
-  stats.newTokens = section.accountTokens.length;
-
   return stats;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/services/backupService.ts` around lines 2108 - 2113, The final
assignment overwrites incremental counts on the stats object (populated earlier
in the transaction) with section.*.length values, losing new vs updated
distinctions; remove the three lines that set stats.newSites =
section.sites.length, stats.newAccounts = section.accounts.length, and
stats.newTokens = section.accountTokens.length so the previously accumulated
counts in stats remain intact (locate these in backupService.ts near the end of
the processing block that populates stats during the transaction).

@cita-777 cita-777 closed this May 8, 2026