feat(sites): site-level model probing: live logs, latency threshold, and auto-disable by Babylonehy · Pull Request #510 · cita-777/metapi

Babylonehy · 2026-04-23T14:28:37Z

PR

Title

feat(sites): site-level model probing: live logs, latency threshold, and auto-disable

Description

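Problem

Some sites' APIs expose a large number of dead models, which bloats the routing table and makes fetching models slow, so this PR adds model probing and auto-disable. Until now, probing could only be triggered passively during a model refresh, with no entry point for an immediate manual probe; the probe itself was a black box that reported only success or failure, with no visibility into the specific cause or live progress.

Solution

Move probing entirely into the site edit modal, so that it is configured per site, and add a manual probe:

Backend:

Add POST /api/sites/:id/probe-now: a one-shot JSON probe endpoint
Add GET /api/sites/:id/probe-stream: an SSE streaming probe endpoint that pushes each model's result in real time
- Supports the scope (single/all), modelName, and latencyThresholdMs query parameters
Probe concurrency: worker-pool pattern, 10 concurrent probes by default (override via options.concurrency)
Latency threshold: a supported result whose response time exceeds the threshold is overridden to unsupported, and the model is auto-disabled
inconclusive results (network errors, timeouts, etc.) are uniformly downgraded to unsupported, which triggers auto-disable
Failure reasons are passed through to the frontend (reason field), including timeout, no token, no permission, and model not found
Remove the three probe config fields from the global config.ts and settings.ts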
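
The worker-pool concurrency and latency-threshold rules in the backend list can be sketched roughly as follows. This is a minimal illustration, not the PR's actual probeSiteModels code: the names `ProbeResult`, `ProbeFn`, `applyLatencyThreshold`, and `probeWithPool` are hypothetical, and the real result shapes may differ.

```typescript
// Hypothetical types; the PR's real result shapes may differ.
type ProbeStatus = 'supported' | 'unsupported' | 'inconclusive';

interface ProbeResult {
  modelName: string;
  status: ProbeStatus;
  latencyMs: number;
  reason?: string;
}

type ProbeFn = (modelName: string) => Promise<ProbeResult>;

// Downgrade a successful-but-slow probe to 'unsupported'.
// A threshold of 0 (or undefined) is treated as "no limit".
function applyLatencyThreshold(result: ProbeResult, thresholdMs?: number): ProbeResult {
  if (result.status !== 'supported' || !thresholdMs || thresholdMs <= 0) return result;
  if (result.latencyMs <= thresholdMs) return result;
  return {
    ...result,
    status: 'unsupported',
    reason: `latency ${result.latencyMs}ms exceeds ${thresholdMs}ms`,
  };
}

// Run probes with at most `concurrency` in flight. Each worker pulls the
// next index from a shared cursor until the list is exhausted.
async function probeWithPool(
  models: string[],
  probe: ProbeFn,
  options: { concurrency?: number; latencyThresholdMs?: number } = {},
  onProgress?: (result: ProbeResult) => void,
): Promise<ProbeResult[]> {
  const concurrency = Math.max(1, options.concurrency ?? 10);
  const results: ProbeResult[] = new Array<ProbeResult>(models.length);
  let cursor = 0;

  const worker = async (): Promise<void> => {
    while (cursor < models.length) {
      const index = cursor++; // no await between the check and the increment
      let result: ProbeResult;
      try {
        result = await probe(models[index]);
      } catch (err) {
        // A thrown probe is reported as inconclusive rather than crashing the pool.
        result = { modelName: models[index], status: 'inconclusive', latencyMs: 0, reason: String(err) };
      }
      result = applyLatencyThreshold(result, options.latencyThresholdMs);
      results[index] = result;
      onProgress?.(result);
    }
  };

  const workerCount = Math.min(concurrency, models.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Results are stored by index, so the returned array keeps the input order even though probes finish out of order; `onProgress` fires in completion order, which is what a live log would want.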

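Frontend:

The site edit modal gains an "Auto-test requests after refresh" panel (the original entry on the global settings page is removed)
- Toggle, probe scope (specified/all), probe model name, latency threshold input
"Probe now" button: on click, consumes SSE via fetch + ReadableStream (working around EventSource's lack of support for custom headers)
"Stop" button (shown while a probe is running): AbortController.abort() terminates the request, the log appends "Manually stopped", and results already received are kept
Live log panel: each model result is appended as it arrives (timestamp + status + latency + reason)
After the probe completes, availableModels / disabledModels are refreshed automatically and a model status list is shown below the log (green = available, red = disabled)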
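
A rough sketch of consuming the probe stream with fetch + ReadableStream instead of EventSource. It assumes each event arrives as a `data: <json>` line followed by a blank line with `\n` line endings; the actual frames written by the server may differ, and `ProbeEvent`, `createSseParser`, and `streamProbe` are hypothetical names.

```typescript
// Assumed event shape; only `type` is relied on here.
interface ProbeEvent {
  type: string; // e.g. 'start' | 'model' | 'action' | 'complete' | 'error'
  [key: string]: unknown;
}

// Incremental parser: feed it decoded text chunks; it returns the events
// completed so far and keeps any trailing partial frame in its buffer.
function createSseParser(): (chunk: string) => ProbeEvent[] {
  let buffer = '';
  return (chunk: string): ProbeEvent[] => {
    buffer += chunk;
    const frames = buffer.split('\n\n');
    buffer = frames.pop() ?? ''; // the last piece may be incomplete
    const events: ProbeEvent[] = [];
    for (const frame of frames) {
      const data = frame
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).trimStart())
        .join('\n');
      if (!data) continue;
      try {
        events.push(JSON.parse(data) as ProbeEvent);
      } catch {
        // Skip malformed frames rather than aborting the whole stream.
      }
    }
    return events;
  };
}

// fetch allows an Authorization header, which EventSource does not.
async function streamProbe(
  url: string,
  token: string,
  onEvent: (event: ProbeEvent) => void,
  signal?: AbortSignal,
): Promise<void> {
  const response = await fetch(url, {
    headers: { Authorization: `Bearer ${token}`, Accept: 'text/event-stream' },
    signal,
  });
  if (!response.ok || !response.body) throw new Error(`probe stream failed: ${response.status}`);
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const parse = createSseParser();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const event of parse(decoder.decode(value, { stream: true }))) onEvent(event);
  }
}
```

A stop button would create an `AbortController`, pass `controller.signal` to `streamProbe`, and call `controller.abort()`; the pending read then rejects with an abort error, and events already delivered to `onEvent` are unaffected.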

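Tests

TypeScript compiles cleanly (tsconfig.web.json + tsconfig.server.json, zero errors)
Started the local dev server and verified that the page loads normally
Manually verified the full flow: SSE streaming logs, the stop button, and latency-threshold auto-disable

What Does NOT Change

The scheduled batch liveness-check logic (modelAvailabilityProbeService) is unaffected
The global "batch liveness-check concurrency" setting still applies to scheduled probes
The enable/disable toggle for individual routes behaves as before
No new environment variables

Changed Files

src/server/db/schema.ts: sites table gains the postRefreshProbeEnabled/Model/Scope fields
src/server/routes/api/sites.ts: adds the probe-now and probe-stream endpoints; PUT handles the new fields
src/server/routes/api/settings.ts: removes the probe-related config items
src/server/services/modelService.ts: adds the exported probeSiteModels() function (concurrent worker pool, latency threshold, progress callback)
src/server/services/backupService.ts: fills in default values for the new fields in site object literals
src/web/api.ts: adds probeSiteNow(), removes the global probe config fields
src/web/pages/Sites.tsx: site edit modal gains the probe config UI, live log panel, and post-probe model status list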

Summary by CodeRabbit

New Features
- Per-site post-refresh model probing with manual run, real-time streamed logs, and optional latency threshold.
UI
- Site edit modal: enable/disable probing, scope (single/all), target model, latency threshold, start/stop probe, view live logs and refreshed model availability.
API
- New endpoints to run a probe now (one-shot) and to stream probe progress; client method added.
Model Refresh
- Optional post-refresh probes included in refresh results and used to update model availability.
Backup/Import
- Backup export/import now preserves probe settings.
Database
- Migrations add four new site-level probe fields.

- Move post-refresh probe settings from global config to per-site (sites table)
- Add manual 'probe now' button in site editor with real-time SSE log stream
- Use fetch+ReadableStream instead of EventSource to support Authorization header
- Concurrent probe execution (default 10 workers) via worker-pool pattern
- Add stop button to cancel in-flight probe while preserving completed results
- Add latency threshold: auto-disable models exceeding response time limit
- Treat inconclusive probe results as unsupported (auto-disable)
- Show specific failure reason in logs (timeout, no token, model not found, etc.)
- Refresh and display post-probe model status list (green=available, red=disabled)
- Remove probe settings from global Settings page and RuntimeSettingsPayload

coderabbitai · 2026-04-23T14:32:38Z

Walkthrough

Adds per‑site post‑refresh probing: DB migrations and schema fields, backup import/export propagation, server endpoints (one‑shot + SSE), modelService probing + availability updates, and UI controls/streaming logs for manual probes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Database Schema & Migrations `src/server/db/schema.ts`, `drizzle/0025_site_post_refresh_probe.sql`, `drizzle/0026_site_probe_latency_threshold.sql`, `drizzle/meta/_journal.json` | Add `postRefreshProbeEnabled`, `postRefreshProbeModel`, `postRefreshProbeScope`, `postRefreshProbeLatencyThresholdMs` columns and migration/journal entries. |
| API Routes & Helpers `src/server/routes/api/sites.ts`, `src/server/routes/api/settings.ts` | Add `POST /api/sites/:id/probe-now` and `GET /api/sites/:id/probe-stream` (SSE helper `sseWrite`); extend `PUT /api/sites/:id` to accept/normalize probe fields; have settings import ignore probe keys. |
| Model Probing Logic `src/server/services/modelService.ts` | Add post‑refresh probe step, exported `probeSiteModels` with progress/result types, latency threshold logic, model disabling persistence, runtime health updates, and token route rebuild triggers. |
| Backup Import/Export `src/server/services/backupService.ts` | Propagate four new probe fields in backup export/import with defaults and scope coercion. |
| Schema Introspection `src/server/db/schemaIntrospection.ts` | Preserve empty-string defaults for text/json columns when introspecting MySQL `information_schema`. |
| Web client & UI `src/web/api.ts`, `src/web/pages/Sites.tsx` | Add `api.probeSiteNow(...)`; Sites UI: probe settings, manual probe runner, SSE log parsing, abort support, latency‑threshold field, and refresh of model availability after probing. |

Sequence Diagram

sequenceDiagram
    participant UI as "Web UI (Sites)"
    participant API as "API Server (sites routes)"
    participant ModelSvc as "Model Service"
    participant Runtime as "Runtime Models"
    participant DB as "Database"
    participant TokenRoutes as "Token Routes"

    UI->>API: POST /api/sites/:id/probe-now (or open /probe-stream)
    API->>ModelSvc: probeSiteModels(siteId, options, onProgress)
    ModelSvc->>Runtime: Probe selected models (concurrent)
    Runtime-->>ModelSvc: Return probe results (status, latency)
    ModelSvc->>DB: Persist availability changes / record disables
    DB-->>ModelSvc: Acknowledge persistence
    ModelSvc->>TokenRoutes: rebuildTokenRoutesFromAvailability()
    TokenRoutes-->>ModelSvc: Rebuild complete
    ModelSvc-->>API: Emit progress/final result
    API-->>UI: JSON response or SSE events (start/model/action/complete/error)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Codex/model refresh health #50: Overlaps changes to model discovery/refresh result shapes and post‑probe handling in modelService.ts.
fix: refine config backup import semantics #232: Related to backup import/export changes that touch the same backupService propagation logic.
feat: show a selection dialog after a new site is created #302: Touches applyImportedSettingToRuntime in src/server/routes/api/settings.ts, related to ignoring/importing probe keys.

Suggested labels

area: server, size: XL

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is in Chinese and describes the main feature: site-level model probing with real-time logs, latency thresholds, and automatic disabling. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

coderabbitai

Actionable comments posted: 9

🧹 Nitpick comments (1)

src/server/services/modelService.ts (1)
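533-548: runPostRefreshProbeIfEnabled probes serially, which can significantly lengthen model refresh time.

This function is awaited at the tail of the successful model-refresh paths (e.g. Lines 735-739, 821-825, 921-925, 1007-1011, 1270-1274). When a site sets postRefreshProbeScope='all' and many models are discovered, each model can in the worst case run for the full config.modelAvailabilityProbeTimeoutMs; within one scheduled batch refresh, every account's refresh accumulates this serial probing, which can stretch refreshModelsAndRebuildRoutes from a few seconds to several minutes.

Meanwhile, probeSiteModels (Lines 442-477) already implements a worker pool with a concurrency cap, and the code here duplicates it almost line for line. Suggestions:

Extract the worker pool + latencyThresholdMs + unsupported/inconclusive handling into a shared helper (e.g. runSiteProbeBatch(site, account, modelsToProbe, options)) reused by both probeSiteModels and runPostRefreshProbeIfEnabled, to reduce divergence.

In runPostRefreshProbeIfEnabled, at least use the same concurrency as probeSiteModels (default 10) to avoid serial blocking on the refresh path.

While at it, let the automatic refresh path honor the latency threshold as well (see the other comment on Sites.tsx).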
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server/services/modelService.ts` around lines 533 - 548,
runPostRefreshProbeIfEnabled currently probes models sequentially via
probeRuntimeModel, which can hugely delay refresh when
postRefreshProbeScope='all' and many models hit
config.modelAvailabilityProbeTimeoutMs; extract the worker-pool +
latencyThresholdMs + unsupported/inconclusive handling into a shared helper
(e.g., runSiteProbeBatch(site, account, modelsToProbe, options)) reusing the
logic in probeSiteModels, then modify runPostRefreshProbeIfEnabled to call that
helper with the same concurrency limit (default 10) instead of the serial
for-loop and ensure the helper enforces latencyThresholdMs and marks
unsupported/inconclusive results consistently.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/server/db/schema.ts`:
- Around line 18-20: The three new schema columns (postRefreshProbeEnabled,
postRefreshProbeModel, postRefreshProbeScope) were added to the Drizzle schema
but the migration artifacts and SQL patches were not regenerated; run Drizzle's
migration tooling to generate a new migration that includes these columns,
update the snapshot and journal (e.g., 0024_snapshot.json and _journal.json) and
regenerate the checked-in SQL patches (mysql.bootstrap.sql, mysql.upgrade.sql,
postgres.bootstrap.sql, postgres.upgrade.sql) so the schema, snapshot, and SQL
outputs stay in sync.

In `@src/server/routes/api/sites.ts`:
- Around line 23-25: The SSE helper sseWrite currently swallows write errors and
the route keeps running probeSiteModels even after the client disconnects;
change the route to create an AbortController, listen for request.raw
'close'/'finish'/'error' (or check raw.destroyed/writableEnded) and call
controller.abort() when the stream closes, pass controller.signal into
probeSiteModels (and any downstream service/worker pool APIs) so work cancels
promptly, and update sseWrite to early-return if the response is
closed/destroyed to avoid ignoring disconnects; ensure all callers in this route
accept and forward the AbortSignal rather than letting the route own
cancellation logic.
- Around line 905-908: The endpoint builds options for probeSiteModels but omits
latencyThresholdMs, so update the handler to read latencyThresholdMs from body
(e.g., const latencyThresholdMs = typeof body?.latencyThresholdMs === 'number'
&& Number.isFinite(body.latencyThresholdMs) && body.latencyThresholdMs > 0 ?
body.latencyThresholdMs : undefined) and pass it into probeSiteModels(id, {
scope, modelName, latencyThresholdMs }); ensure you reference the existing
variables body, scope, modelName and the probeSiteModels call and validate the
value (reject NaN/negative) so only a valid number is forwarded.
- Around line 691-694: The current logic in the route uses loose coercion
(Boolean(anyBody.postRefreshProbeEnabled) and forcing postRefreshProbeScope to
'single' for any non-'all' value), which accepts invalid inputs; update the
handler that reads anyBody to explicitly validate inputs: for
postRefreshProbeEnabled accept only true/false booleans or the string
'true'/'false' (reuse the existing boolean normalizer used elsewhere), for
postRefreshProbeModel trim and accept only non-empty strings, and for
postRefreshProbeScope accept only the allowed values 'all' or 'single' (reject
or return a 400 for anything else) before assigning to
updates.postRefreshProbeEnabled, updates.postRefreshProbeModel and
updates.postRefreshProbeScope so invalid values are rejected rather than
silently coerced.

In `@src/server/services/backupService.ts`:
- Around line 761-763: The importAccountsSection() function currently ignores
the postRefreshProbeEnabled, postRefreshProbeModel, and postRefreshProbeScope
fields when inserting into schema.sites, causing loss of probe configuration
when restoring backups. To fix this, update importAccountsSection() to include
these postRefreshProbe* fields from the backup data into the site records during
insertion, ensuring the probe settings persist after import.

In `@src/server/services/modelService.ts`:
- Around line 482-506: The current code treats status === 'inconclusive' the
same as 'unsupported' and auto-disables models (updating
modelAvailability.available=false, inserting into siteDisabledModels, calling
setAccountRuntimeHealth and rebuildTokenRoutesFromAvailability). Change the
logic in the unsupportedModels block so only status === 'unsupported' (and
optionally 'latencyExceeded' if you choose) are auto-disabled and inserted into
schema.siteDisabledModels; for entries with status === 'inconclusive' instead
emit a progress/warning (onProgress?.({ type: 'inconclusive', modelName, reason
})) and optionally update schema.modelAvailability.available=false without
inserting into siteDisabledModels or marking account unhealthy via
setAccountRuntimeHealth; apply the same correction to
runPostRefreshProbeIfEnabled to avoid permanent disabling from
transient/inconclusive probe results.

In `@src/web/api.ts`:
- Around line 789-790: The probeSiteNow API wrapper (probeSiteNow) must accept
and forward a latencyThresholdMs value so one-shot probes can use the same
slow-model auto-disable behavior; update the options type to include
latencyThresholdMs?: number, ensure the call to
request(`/api/sites/${siteId}/probe-now`, ...) includes that property in the
JSON body (i.e., JSON.stringify(options || {}) will contain latencyThresholdMs
when provided), and adjust any callers to pass the threshold as needed.

In `@src/web/pages/Sites.tsx`:
- Around line 558-577: The probe latency threshold isn't persisted or used for
automatic post-refresh probes; update the component and service to persist and
consume a new field (postRefreshProbeLatencyThresholdMs): add that field to the
payload in handleSaveProbeSettings and to the setSites update, initialize/reset
the local probeLatencyThreshold state in openEdit from
site.postRefreshProbeLatencyThresholdMs, and extend runPostRefreshProbeIfEnabled
(modelService.ts) to read/accept postRefreshProbeLatencyThresholdMs from the
site record so automatic probes apply the saved threshold (or alternatively
clearly separate and label a manual-only threshold if you choose the other
approach).
- Around line 586-700: handleProbeNow clears probeLog but doesn't reset
probeCompleted, and the 'error' SSE branch doesn't refresh model lists or set
probeCompleted; update handleProbeNow to call setProbeCompleted(false) when
starting a new probe, and inside the SSE 'error' handling (the branch that
checks type === 'error' in handleProbeNow) perform the same model-list refresh
Promise.all(...) sequence that you use in the 'complete' branch (calling
api.getSiteAvailableModels / setAvailableModels and api.getSiteDisabledModels /
setDisabledModels) and ensure you call setProbeCompleted(true) in finally/after
the refresh (or set it on error path) so the UI no longer shows stale "探测完成"
state.
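
The first sites.ts item above asks for cancellation to flow from the closed connection into the probe work through an AbortSignal. A minimal sketch of that idea, assuming a hypothetical helper `runUntilAborted` (not a function from this PR) and showing a sequential loop for brevity:

```typescript
// Hypothetical helper: runs tasks one at a time and stops picking up new
// work once the signal is aborted. Results completed before the abort are kept.
async function runUntilAborted<T>(
  tasks: Array<() => Promise<T>>,
  signal: AbortSignal,
): Promise<{ results: T[]; aborted: boolean }> {
  const results: T[] = [];
  for (const task of tasks) {
    // Check before starting each task so no new probe begins after abort.
    if (signal.aborted) return { results, aborted: true };
    results.push(await task());
  }
  return { results, aborted: false };
}

// Assumed route wiring (shown as a comment because it depends on the
// server framework): create the controller per request and abort it
// when the underlying connection closes.
//
//   const controller = new AbortController();
//   request.raw.on('close', () => controller.abort());
//   await runUntilAborted(tasks, controller.signal);
```

A worker pool would apply the same check inside each worker's loop; a task already in flight only stops early if the signal is also forwarded to the underlying request.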

📥 Commits

Reviewing files that changed from the base of the PR and between 262651a and 46f23dd.

📒 Files selected for processing (7)

src/server/db/schema.ts
src/server/routes/api/settings.ts
src/server/routes/api/sites.ts
src/server/services/backupService.ts
src/server/services/modelService.ts
src/web/api.ts
src/web/pages/Sites.tsx

coderabbitai · 2026-04-23T14:38:20Z

+  postRefreshProbeEnabled: integer('post_refresh_probe_enabled', { mode: 'boolean' }).default(false),
+  postRefreshProbeModel: text('post_refresh_probe_model').default(''),
+  postRefreshProbeScope: text('post_refresh_probe_scope').default('single'),


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
printf 'All references to the new site probe columns:\n'
rg -n 'post_refresh_probe_(enabled|model|scope)' || true
printf '\nMigration/schema artifact files containing the new columns:\n'
fd -i '(migration|snapshot|journal|schema)' --type f -x sh -c '
  file="$1"
  if rg -q "post_refresh_probe_(enabled|model|scope)" "$file"; then
    printf "\n-- %s --\n" "$file"
    rg -n "post_refresh_probe_(enabled|model|scope)" "$file"
  fi
' sh {}

Repository: cita-777/metapi

Length of output: 1015

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Find migration-related directories
echo "=== Migration and artifact directory structure ==="
fd -i '(migration|migrations|snapshot|artifact)' -type d | head -20
# Check package.json for drift-check script
echo -e "\n=== Checking for drift-check script ==="
grep -A2 -B2 'drift-check' package.json 2>/dev/null || echo "Not found in package.json"
# Look for drizzle migration files
echo -e "\n=== Looking for Drizzle-related files ==="
fd '\.sql$' --type f | head -20
# Check for schema snapshot or journal files
echo -e "\n=== Schema artifacts and snapshots ==="
fd -e '.snapshot' -o -e '.json' -o -e '.sql' --path '*schema*' --path '*artifact*' --type f | head -20

Repository: cita-777/metapi

Length of output: 297

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# List all files in common Drizzle migration locations
echo "=== Checking drizzle/migrations directory ==="
if [ -d "drizzle/migrations" ]; then
  ls -la drizzle/migrations/ | head -30
  echo "---"
  ls -1 drizzle/migrations/*.sql 2>/dev/null || echo "No SQL migration files found"
fi
# Check meta directory
echo -e "\n=== Checking drizzle/meta directory ==="
if [ -d "drizzle/meta" ]; then
  ls -la drizzle/meta/ | head -30
  echo "---"
  find drizzle/meta -type f | head -20
fi
# Check db directory structure
echo -e "\n=== Checking src/server/db directory ==="
find src/server/db -type f -name '*.sql' -o -name '*migration*' -o -name '*snapshot*' 2>/dev/null | head -20

Repository: cita-777/metapi

Length of output: 1700

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Check the latest snapshot for the new columns
echo "=== Checking 0024_snapshot.json for new columns ==="
rg 'post_refresh_probe_(enabled|model|scope)' drizzle/meta/0024_snapshot.json || echo "NOT FOUND in 0024_snapshot.json"
# Check the journal
echo -e "\n=== Checking _journal.json ==="
rg 'post_refresh_probe_(enabled|model|scope)' drizzle/meta/_journal.json || echo "NOT FOUND in _journal.json"
# Check generated SQL files
echo -e "\n=== Checking generated SQL files ==="
for file in src/server/db/generated/*.sql; do
  echo "--- $file ---"
  if rg -q 'post_refresh_probe_(enabled|model|scope)' "$file"; then
    echo "FOUND"
    rg -n 'post_refresh_probe_(enabled|model|scope)' "$file" | head -5
  else
    echo "NOT FOUND"
  fi
done

Repository: cita-777/metapi

Length of output: 489

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Run drift-check if available
echo "=== Running npm run repo:drift-check ==="
if grep -q 'drift-check' package.json; then
  npm run repo:drift-check 2>&1 || echo "Drift check encountered issues"
else
  echo "No drift-check script found in package.json"
fi

Repository: cita-777/metapi

Length of output: 1354

Regenerate Drizzle migration artifacts and SQL patches for the three new schema columns.

The schema definition in src/server/db/schema.ts was updated with the three new columns, but the corresponding migration snapshot and generated SQL patches were not regenerated. The new columns are absent from:

drizzle/meta/0024_snapshot.json (latest snapshot)

drizzle/meta/_journal.json

src/server/db/generated/mysql.bootstrap.sql, mysql.upgrade.sql, postgres.bootstrap.sql, postgres.upgrade.sql

Database upgrades will fail without these artifacts. Regenerate them using Drizzle's migration tooling to ensure all three outputs stay synchronized (Drizzle schema + SQLite migration history + checked-in SQL patches).

coderabbitai · 2026-04-23T14:38:21Z

+    const anyBody = body as Record<string, unknown>;
+    if (anyBody.postRefreshProbeEnabled !== undefined) updates.postRefreshProbeEnabled = Boolean(anyBody.postRefreshProbeEnabled);
+    if (anyBody.postRefreshProbeModel !== undefined) updates.postRefreshProbeModel = String(anyBody.postRefreshProbeModel || '').trim();
+    if (anyBody.postRefreshProbeScope !== undefined) updates.postRefreshProbeScope = anyBody.postRefreshProbeScope === 'all' ? 'all' : 'single';


⚠️ Potential issue | 🟡 Minor

Validate probe settings instead of silently coercing them.

Boolean(anyBody.postRefreshProbeEnabled) turns "false" into true, and any invalid postRefreshProbeScope overwrites the setting as 'single'. Reject invalid values or reuse the existing boolean normalizer.

Suggested fix

     const anyBody = body as Record<string, unknown>;
-    if (anyBody.postRefreshProbeEnabled !== undefined) updates.postRefreshProbeEnabled = Boolean(anyBody.postRefreshProbeEnabled);
+    if (anyBody.postRefreshProbeEnabled !== undefined) {
+      const normalizedProbeEnabled = normalizePinnedFlag(anyBody.postRefreshProbeEnabled);
+      if (normalizedProbeEnabled === null) {
+        return reply.code(400).send({ error: 'Invalid postRefreshProbeEnabled value. Expected boolean.' });
+      }
+      updates.postRefreshProbeEnabled = normalizedProbeEnabled;
+    }
     if (anyBody.postRefreshProbeModel !== undefined) updates.postRefreshProbeModel = String(anyBody.postRefreshProbeModel || '').trim();
-    if (anyBody.postRefreshProbeScope !== undefined) updates.postRefreshProbeScope = anyBody.postRefreshProbeScope === 'all' ? 'all' : 'single';
+    if (anyBody.postRefreshProbeScope !== undefined) {
+      if (anyBody.postRefreshProbeScope !== 'single' && anyBody.postRefreshProbeScope !== 'all') {
+        return reply.code(400).send({ error: 'Invalid postRefreshProbeScope value. Expected single or all.' });
+      }
+      updates.postRefreshProbeScope = anyBody.postRefreshProbeScope;
+    }
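
To make the coercion pitfall concrete, here is a small standalone sketch. `parseStrictBoolean` and `parseProbeScope` are hypothetical stand-ins, not the repository's existing normalizer, whose exact behavior is not shown in this thread.

```typescript
// Hypothetical stand-in for a strict boolean normalizer.
// Returns null for anything that is not clearly true or false,
// so the caller can respond with HTTP 400 instead of guessing.
function parseStrictBoolean(value: unknown): boolean | null {
  if (value === true || value === 'true') return true;
  if (value === false || value === 'false') return false;
  return null;
}

type ProbeScope = 'single' | 'all';

// Accept only the two documented scope values; reject everything else.
function parseProbeScope(value: unknown): ProbeScope | null {
  if (value === 'all') return 'all';
  if (value === 'single') return 'single';
  return null;
}

// Loose coercion treats any non-empty string as true, including "false".
const looselyCoerced = Boolean('false');
const strictlyParsed = parseStrictBoolean('false');
console.log({ looselyCoerced, strictlyParsed });
```

With loose coercion a client sending the string "false" would enable probing, whereas the strict variant either honors the intent or rejects the value.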

coderabbitai · 2026-04-23T14:38:21Z

+  const unsupportedModels = details.filter((d) => d.status === 'unsupported' || d.status === 'inconclusive').map((d) => d.modelName);
+  if (unsupportedModels.length > 0) {
+    const checkedAt = new Date().toISOString();
+    for (const modelName of unsupportedModels) {
+      await db.update(schema.modelAvailability)
+        .set({ available: false, checkedAt })
+        .where(and(
+          eq(schema.modelAvailability.accountId, account.id),
+          eq(schema.modelAvailability.modelName, modelName),
+        ))
+        .run();
+      await db.insert(schema.siteDisabledModels)
+        .values({ siteId, modelName })
+        .onConflictDoNothing()
+        .run();
+      onProgress?.({ type: 'action', modelName, action: 'disabled' });
+    }
+    const reason = unsupportedModels.length === 1
+      ? `手动探测失败：模型 ${unsupportedModels[0]} 不可用`
+      : `手动探测失败：${unsupportedModels.length} 个模型不可用（${unsupportedModels.slice(0, 3).join('、')}${unsupportedModels.length > 3 ? '…' : ''}）`;
+    await setAccountRuntimeHealth(account.id, { state: 'unhealthy', reason, source: 'manual-probe', checkedAt });
+    rebuildTokenRoutesFromAvailability().catch((err) => {
+      console.warn('[probe-site-now] route rebuild failed', err);
+    });
+  }


⚠️ Potential issue | 🟠 Major

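Treating every inconclusive result as unsupported and automatically adding it to the site's disabled list is risky.

According to the implementation in runtimeModelProbe.ts, inconclusive is returned in the following scenarios (none of which mean "the model is unavailable"):

Missing credentials (missing credential for probe);

No upstream endpoint matches the model (no compatible probe endpoint candidates);

An exception is thrown during endpoint discovery or execution (transient errors such as network jitter, proxy failure, or DNS);

The request times out.

All of these mean "the probe could not reach a conclusion", not "the model is definitely unsupported", yet the current logic writes them all to modelAvailability.available=false and inserts them into siteDisabledModels, so after the route rebuild the model no longer takes part in scheduling. In the worst case, a single transient timeout or proxy error permanently disables a popular model, and the user has to remove it from the site's disabled list by hand to recover (the next refresh will not re-enable it, because siteDisabledModels is never cleaned up automatically).

Suggestions:

Auto-disable only for unsupported and (optionally) latencyExceeded; for inconclusive, only emit a warning/progress event, or write modelAvailability.available=false without inserting into siteDisabledModels, so that the next probe can self-heal;

If the current behavior is kept, at least record the trigger reason for each model and clearly distinguish "unsupported" from "undetermined" in the UI, so that users can decide whether to restore a model manually.

runPostRefreshProbeIfEnabled Line 551 has the same problem and should be fixed along with it.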
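
One way to express the suggested split, as a sketch only: decide per result whether to persist a disable or just surface a warning. The names `ProbeDetail`, `ProbeActions`, and `planProbeActions` are hypothetical, and the database writes are left out.

```typescript
type DetailStatus = 'supported' | 'unsupported' | 'inconclusive';

interface ProbeDetail {
  modelName: string;
  status: DetailStatus;
  reason?: string;
}

interface ProbeActions {
  toDisable: string[];   // models to persist into the disabled list
  toWarn: ProbeDetail[]; // results to log only, so a later probe can self-heal
}

// Only definite 'unsupported' results lead to a persistent disable;
// 'inconclusive' results are reported but leave the disabled list untouched.
function planProbeActions(details: ProbeDetail[]): ProbeActions {
  const actions: ProbeActions = { toDisable: [], toWarn: [] };
  for (const detail of details) {
    if (detail.status === 'unsupported') {
      actions.toDisable.push(detail.modelName);
    } else if (detail.status === 'inconclusive') {
      actions.toWarn.push(detail);
    }
  }
  return actions;
}
```

Keeping the decision in a pure function like this would also let the manual and post-refresh paths share one rule.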

coderabbitai · 2026-04-23T14:38:21Z

+  const handleSaveProbeSettings = async () => {
+    if (!editor || editor.mode !== 'edit') return;
+    setProbeSaving(true);
+    try {
+      await api.updateSite(editor.editingSiteId, {
+        postRefreshProbeEnabled: probeEnabled,
+        postRefreshProbeModel: probeModel.trim(),
+        postRefreshProbeScope: probeScope,
+      });
+      setSites((prev) => prev.map((s) => s.id === editor.editingSiteId
+        ? { ...s, postRefreshProbeEnabled: probeEnabled, postRefreshProbeModel: probeModel.trim(), postRefreshProbeScope: probeScope }
+        : s,
+      ));
+      toast.success('刷新后探测设置已保存');
+    } catch (e: any) {
+      toast.error(e.message || '保存失败');
+    } finally {
+      setProbeSaving(false);
+    }
+  };


⚠️ Potential issue | 🟠 Major

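The latency threshold is not persisted and does not apply to the automatic post-refresh probe.

probeLatencyThreshold exists only in the component's local state, and:

handleSaveProbeSettings does not write the threshold back to the site on save (the payload contains only the three fields postRefreshProbeEnabled / Model / Scope).

openEdit (Lines 492-498) does not reset probeLatencyThreshold from the site, so when editing across sites the previously set threshold leaks into the next site's UI.

More importantly, runPostRefreshProbeIfEnabled (modelService.ts Lines 511-591) never reads or receives the latency threshold, so the panel's core selling point, "auto-probe after refresh and auto-disable by latency", does not take effect at all in the current implementation: the threshold applies only to the single session in which the user clicks "Probe now".

The panel copy explicitly says "ms (auto-disable when the response exceeds this time, 0 = unlimited)", which leads users to believe that once saved it will apply after every refresh. Suggestions:

Either add a postRefreshProbeLatencyThresholdMs field to the schema and include it in the save in handleSaveProbeSettings, the initialization in openEdit, and the probe logic of runPostRefreshProbeIfEnabled;

Or clearly separate the two thresholds (label it "manual probe only" in the UI) and provide an independent persisted setting for the automatic probe scenario.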

… probe columns
- Add drizzle/0025_site_post_refresh_probe.sql (ALTER TABLE sites ADD 3 columns)
- Update drizzle/meta/_journal.json (idx=24 entry)
- Regenerate mysql/postgres bootstrap+upgrade SQL and schemaContract.json

- Fix SSE abort: propagate AbortSignal to worker pool; close connection immediately on client disconnect without sending 'complete' event
- Fix Boolean coercion: use strict === true/=== 1 check instead of Boolean() to avoid Boolean('false') === true foot-gun
- Fix probe-now: parse and forward latencyThresholdMs from request body
- Fix api.ts probeSiteNow: add latencyThresholdMs to options type
- Fix backup import: restore postRefreshProbe* fields when inserting sites so that backup/restore round-trips preserve per-site probe config
- Fix latency threshold persistence: add postRefreshProbeLatencyThresholdMs column (migration 0026), save/load it in site editor, pass to runPostRefreshProbeIfEnabled so auto-probe also respects the threshold
- Fix state management: setProbeCompleted(false) at probe start; error branch also refreshes model lists and sets probeCompleted to show results
- Regenerate schema artifacts (mysql/postgres bootstrap+upgrade, contract)

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

src/web/pages/Sites.tsx (1)

732-746: ⚠️ Potential issue | 🟡 Minor

Persist the latency threshold from the primary Save button too.

The footer save persists the other probe settings but omits postRefreshProbeLatencyThresholdMs; changing only the threshold and clicking “保存修改” silently keeps the old saved threshold unless the separate probe-settings button is used.

Suggested fix

       globalWeight: Number(parsedGlobalWeight.toFixed(3)),
       postRefreshProbeEnabled: probeEnabled,
       postRefreshProbeModel: probeModel.trim(),
       postRefreshProbeScope: probeScope,
+      postRefreshProbeLatencyThresholdMs: Math.max(0, parseInt(probeLatencyThreshold, 10) || 0),
     };

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/web/pages/Sites.tsx` around lines 732 - 746, The payload built in the
Save handler omits the postRefreshProbeLatencyThresholdMs field so changes to
the threshold aren't persisted; add a postRefreshProbeLatencyThresholdMs
property to the payload (similar to globalWeight and postRefreshProbeModel) and
source it from the form state or the parsed latency variable (e.g.,
form.postRefreshProbeLatencyThresholdMs or parsedPostRefreshProbeLatency),
converting to the appropriate numeric type before assigning so the value is
included when payload is sent.
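To see what the suggested expression does with typical input from a text field, it can be pulled into a standalone function. The name `parseLatencyThresholdMs` is made up for this sketch; the expression itself is the one from the suggested fix.

```typescript
// Wraps the expression from the suggested fix; the function name is illustrative.
function parseLatencyThresholdMs(input: string): number {
  // parseInt yields NaN for empty or non-numeric input; `|| 0` turns that into 0,
  // and Math.max clamps negative values to 0.
  return Math.max(0, parseInt(input, 10) || 0);
}

console.log(parseLatencyThresholdMs("250"));  // 250
console.log(parseLatencyThresholdMs(""));     // 0
console.log(parseLatencyThresholdMs("-5"));   // 0
console.log(parseLatencyThresholdMs("abc"));  // 0
console.log(parseLatencyThresholdMs("12.7")); // 12
```

Note that `parseInt` truncates fractional input and accepts trailing text such as "300ms", so stricter validation would need a separate check.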

🧹 Nitpick comments (1)

src/web/pages/Sites.tsx (1)

67-70: Add the latency threshold to SiteRow instead of casting.

Line 495 reads a persisted field through as any, so API/schema drift around postRefreshProbeLatencyThresholdMs will not be caught by TypeScript.

Suggested fix

   postRefreshProbeEnabled?: boolean;
   postRefreshProbeModel?: string | null;
   postRefreshProbeScope?: string | null;
+  postRefreshProbeLatencyThresholdMs?: number | null;

-    setProbeLatencyThreshold(String((site as any).postRefreshProbeLatencyThresholdMs ?? 0));
+    setProbeLatencyThreshold(String(site.postRefreshProbeLatencyThresholdMs ?? 0));

Also applies to: 492-495

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/web/pages/Sites.tsx` around lines 67 - 70, The code is reading
postRefreshProbeLatencyThresholdMs by casting a persisted site record to any,
which hides schema/API drift; add postRefreshProbeLatencyThresholdMs?: number |
null to the SiteRow type/interface so TypeScript knows the field exists, then
remove the as any cast and access site.postRefreshProbeLatencyThresholdMs
directly (update usages where SiteRow is constructed or read, e.g., the code
around where postRefreshProbeEnabled/postRefreshProbeModel/postRefreshProbeScope
are accessed).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/web/api.ts`:
- Around line 789-790: probeSiteNow currently calls request(...) and inherits
the default 30s timeout, which can abort long "scope: 'all'" one-shot probes;
update probeSiteNow to detect when options?.scope === 'all' and pass a longer
timeout to request (e.g., a probe-specific timeout value or constant) via the
request options so the client will wait long enough for multiple upstream probes
to complete; refer to the probeSiteNow arrow function and the request(...) call
when making this change.

In `@src/web/pages/Sites.tsx`:
- Around line 330-335: The current useEffect tied to [editor] only aborts
probeAbortRef when editor becomes null; add a separate useEffect with an empty
dependency array that returns a cleanup function to abort any active probe
stream on component unmount. In that cleanup call probeAbortRef.current?.abort()
and set probeAbortRef.current = null (same actions as in the existing effect),
so active SSE fetches are aborted if the user navigates away while the editor is
still open.
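For the src/web/api.ts comment above, one possible shape of the scope-dependent timeout is sketched here. The constant names and the five-minute value are assumptions, not values from the PR; only the 30-second default and the `scope: 'all'` condition come from the review text.

```typescript
type ProbeScope = "single" | "all";

// 30s is the default mentioned in the review; the longer value is a placeholder.
const DEFAULT_REQUEST_TIMEOUT_MS = 30000;
const PROBE_ALL_TIMEOUT_MS = 300000; // hypothetical: 5 minutes

// Picks a longer client-side timeout when every model on the site is probed in one request.
function probeRequestTimeoutMs(scope?: ProbeScope): number {
  return scope === "all" ? PROBE_ALL_TIMEOUT_MS : DEFAULT_REQUEST_TIMEOUT_MS;
}

console.log(probeRequestTimeoutMs("all"));    // 300000
console.log(probeRequestTimeoutMs("single")); // 30000
console.log(probeRequestTimeoutMs());         // 30000
```

A fixed constant is the simplest option; scaling the timeout with the number of models and the probe concurrency would be an alternative if the model count is known on the client.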


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 73b2cf0b-6b1c-4076-8b45-6aa43d155468

📥 Commits

Reviewing files that changed from the base of the PR and between 3792475 and cb50784.

⛔ Files ignored due to path filters (5)

src/server/db/generated/mysql.bootstrap.sql is excluded by !**/generated/**
src/server/db/generated/mysql.upgrade.sql is excluded by !**/generated/**
src/server/db/generated/postgres.bootstrap.sql is excluded by !**/generated/**
src/server/db/generated/postgres.upgrade.sql is excluded by !**/generated/**
src/server/db/generated/schemaContract.json is excluded by !**/generated/**

📒 Files selected for processing (8)

drizzle/0026_site_probe_latency_threshold.sql
drizzle/meta/_journal.json
src/server/db/schema.ts
src/server/routes/api/sites.ts
src/server/services/backupService.ts
src/server/services/modelService.ts
src/web/api.ts
src/web/pages/Sites.tsx

✅ Files skipped from review due to trivial changes (3)

drizzle/0026_site_probe_latency_threshold.sql
drizzle/meta/_journal.json
src/server/services/modelService.ts

🚧 Files skipped from review as they are similar to previous changes (2)

src/server/routes/api/sites.ts
src/server/services/backupService.ts

…e SQL

Better-SQLite3 requires a --> statement-breakpoint marker between statements in a migration file; without it, all statements are passed as a single string to Database.prepare(), which throws a RangeError on multi-statement input.
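The role of the breakpoint marker can be illustrated with a small splitter. This is a sketch of the idea only, not the migrator's actual code, and the column names in the sample migration are invented.

```typescript
const STATEMENT_BREAKPOINT = "--> statement-breakpoint";

// Splits a migration file into individual statements so that each one can be
// prepared and run on its own.
function splitMigrationStatements(sql: string): string[] {
  return sql
    .split(STATEMENT_BREAKPOINT)
    .map((statement) => statement.trim())
    .filter((statement) => statement.length > 0);
}

// Hypothetical migration content, for illustration only.
const migration = [
  "ALTER TABLE sites ADD COLUMN example_a integer;",
  "--> statement-breakpoint",
  "ALTER TABLE sites ADD COLUMN example_b text;",
].join("\n");

console.log(splitMigrationStatements(migration).length); // 2
```

Without the marker the same input comes back as a single element containing both ALTER TABLE statements, which is the multi-statement string that Database.prepare() rejects.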

MySQL information_schema.COLUMNS.COLUMN_DEFAULT stores DEFAULT '' as an empty string, not NULL. normalizeDefaultValueForColumn was treating any empty rawDefaultValue as 'no default' and returning null, causing the MySQL schema parity test to see defaultValue: null for post_refresh_probe_model while schemaContract.json (generated from TypeScript schema) had "''". Fix: when rawDefaultValue is empty string, return "''" for text/json columns instead of null, so the round-trip through MySQL introspection matches the contract.
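The fix described above can be summarised in a simplified sketch. The signature and the `ColumnKind` type are assumptions made for this example; the real normalizeDefaultValueForColumn presumably handles more cases and may take different arguments.

```typescript
// Simplified stand-in for the real function; signature and types are assumed.
type ColumnKind = "text" | "json" | "integer" | "boolean";

function normalizeDefault(rawDefaultValue: string | null, kind: ColumnKind): string | null {
  // NULL from introspection means no default was declared.
  if (rawDefaultValue === null) return null;
  if (rawDefaultValue === "") {
    // MySQL reports DEFAULT '' as an empty string, so for text/json columns it
    // maps back to the quoted literal used in the schema contract.
    return kind === "text" || kind === "json" ? "''" : null;
  }
  return rawDefaultValue;
}

console.log(normalizeDefault("", "text"));    // ''
console.log(normalizeDefault("", "integer")); // null
console.log(normalizeDefault(null, "text"));  // null
```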

Refresh the PR onto current main and close the remaining probe-setting review gaps: the primary site save now persists the latency threshold, the site row type carries that field without an any-cast, active probe streams are aborted on unmount, and all-model one-shot probes get the longer timeout they need.

Constraint: PR cita-777#510 must preserve the per-site model probe feature while merging cleanly into current main.
Rejected: Leave threshold persistence behind the separate probe-settings button | the primary save path would still silently drop user changes.
Rejected: Keep the SiteRow any-cast | it hides API/schema drift for the new persisted field.
Confidence: high
Scope-risk: narrow
Directive: Keep probe setting fields represented in the typed site save payload and covered by focused web API/editor tests.
Tested: npm run typecheck; npm run repo:drift-check; npx vitest run --root . src/server/db/schemaContract.test.ts src/server/db/schemaArtifactGenerator.test.ts src/server/db/schemaIntrospection.test.ts src/server/db/schemaParity.test.ts src/web/api.test.ts src/web/pages/helpers/sitesEditor.test.ts src/web/pages/sites.disabled-models-save.test.tsx src/web/pages/sites.edit-scroll.test.tsx
Not-tested: Full npm test.

github-actions Bot added labels on Apr 23, 2026: area: db (database and schema related changes), area: web (web UI changes), size: L (500 to 999 lines changed)

coderabbitai Bot reviewed Apr 23, 2026

Xiang Li added 2 commits April 23, 2026 22:50

coderabbitai Bot reviewed Apr 23, 2026

Xiang Li and others added 3 commits April 23, 2026 23:21

cita-777 merged commit f414867 into cita-777:main May 8, 2026
18 checks passed
