Fix dashboard backend leaks #33

Merged
Protocol-zero-0 merged 1 commit into master from feat/backend-stability on Jun 8, 2026

Conversation

@Protocol-zero-0

Contributor

Closes #32.

Changed files

  • web/app.py
  • web/static/js/app.js
  • web/templates/index.html
  • web/static/js/i18n.js
  • tests/test_web_app.py

What changed

  • Replaced /api/events SSE streaming with short-polling JSON: GET /api/events?since=<seq> returns once with {events,next_seq} and Cache-Control: no-cache.
  • Replaced the dashboard EventSource client with fetchEvents() plus setInterval(fetchEvents, 2000), so open tabs no longer hold waitress worker threads.
  • Added @app.teardown_request rollback protection so read requests close any open transaction before the thread-local connection is reused.
  • Added data-i18n-title tooltips to all 9 overview stat cards and added matching English/Chinese .tip i18n keys.
  • Added regression tests for the short-polling event contract, teardown rollback, removal of EventSource usage, and exact tooltip/i18n coverage.
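The short-polling contract from the first bullet can be sketched without Flask. This is a minimal illustration, not the code in web/app.py: `poll_events`, the in-memory `EVENT_LOG`, and the `seq` field on each event are assumptions made for the example. Only the `{events, next_seq}` shape, the `since` parameter, and the `default=str` serialization come from this PR.

```python
import json
from datetime import datetime

# Hypothetical in-memory event log; the real get_events() in web/app.py
# reads from the application's own store.
EVENT_LOG = [
    {"seq": 1, "type": "started", "at": datetime(2026, 6, 8, 21, 0, 0)},
    {"seq": 2, "type": "finished", "at": datetime(2026, 6, 8, 21, 5, 0)},
]


def poll_events(since, log=EVENT_LOG):
    """Return events newer than `since` once, plus the cursor for the next poll."""
    since = max(0, int(since or 0))  # clamp negative or missing cursors to 0
    events = [ev for ev in log if ev["seq"] > since]
    # Round-trip through JSON so datetimes become plain strings (default=str).
    payload_events = json.loads(json.dumps(events, ensure_ascii=False, default=str))
    # Advance the cursor only when something new was returned.
    next_seq = max((ev["seq"] for ev in events), default=since)
    return {"events": payload_events, "next_seq": next_seq}


first = poll_events(0)
second = poll_events(first["next_seq"])
```

A client keeps `next_seq` and sends it back as `since` on the next poll. An empty response leaves the cursor unchanged, so repeated polls are idempotent until new events arrive.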

Tests added

  • /api/events?since= returns immediate JSON, preserves datetime serialization, and advances next_seq.
  • Read request teardown calls db.rollback().
  • Dashboard JS uses short polling and no longer contains application EventSource usage.
  • All 9 stat cards have data-i18n-title, and en/zh tooltip keys and copy match the issue table.
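The "no EventSource" regression check can be expressed as a plain text scan over the source files. The sketch below reuses the patterns from this PR's static SSE scan; `find_sse_remnants` and the two sample snippets are hypothetical and stand in for the real test in tests/test_web_app.py.

```python
import re

# Patterns taken from the PR's static SSE scan.
SSE_PATTERN = re.compile(
    r"EventSource|text/event-stream|while True:|X-Accel-Buffering|startSSE|sseRetryDelay"
)


def find_sse_remnants(source):
    """Return (line_number, line) pairs that still mention SSE machinery."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if SSE_PATTERN.search(line)
    ]


# Hypothetical snippets standing in for the old and new web/static/js/app.js.
old_js = "eventSource = new EventSource('/api/events');\n"
new_js = "setInterval(fetchEvents, 2000);\n"

old_hits = find_sse_remnants(old_js)
new_hits = find_sse_remnants(new_js)
```

A test built this way would assert that the scan of the application files comes back empty, while leaving the test file itself out of the scan because it necessarily names the forbidden identifiers.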

Tests run

  • Baseline attempt before changes:
    • pytest tests/test_web_app.py -q
    • Result: failed during collection because the workspace default pytest uses Homebrew Python 3.14 without flask installed (ModuleNotFoundError: No module named 'flask').
  • Baseline with project virtualenv before changes:
    • .venv/bin/pytest tests/test_web_app.py -q
    • Result: 13 passed in 4.53s
  • New tests before implementation:
    • .venv/bin/pytest tests/test_web_app.py -q
    • Result: expected failures in 4 areas: SSE event contract, missing teardown rollback, EventSource still present, missing tooltip titles.
  • Final protocol run:
    • source .venv/bin/activate && pytest tests/test_web_app.py -q
    • Result: 16 passed in 3.52s
  • Additional final run:
    • .venv/bin/pytest tests/test_web_app.py -q
    • Result: 16 passed in 3.58s
  • Import check:
    • PATH=.venv/bin:$PATH python -c "import web.app; print('import web.app ok')"
    • Result: import web.app ok
  • Route check with an initialized temporary SQLite DB:
    • / -> 200 text/html
    • /api/stats -> 200 application/json
    • /api/events?since=0 -> 200 application/json
  • Static SSE scan:
    • rg -n "EventSource|text/event-stream|while True:|X-Accel-Buffering|startSSE|sseRetryDelay" web/app.py web/static/js/app.js tests/test_web_app.py
    • Result: only the regression test assertion mentions EventSource; application code has no SSE endpoint/client remnants.

Before / after

  • Before: /api/events returned text/event-stream from an infinite synchronous generator, holding a waitress worker thread for each tab.
  • After: /api/events?since=<seq> returns ordinary JSON immediately, so browser polling does not occupy a web worker between polls.
  • Before: read endpoints could leave PostgreSQL transactions open on reused thread-local connections.
  • After: request teardown calls db.rollback() in a defensive try/except, closing read transactions after each request.
  • Before: overview stat labels had no hover descriptions.
  • After: all 9 stat cards have localized title tooltips, and applyI18n updates them through the existing data-i18n-title support.
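The teardown behavior described above can be illustrated with the standard library alone. This is a sketch under stated assumptions: `handle_request`, `read_stats`, and `BrokenConnection` are hypothetical, and sqlite3 with an explicit `BEGIN` stands in for a PostgreSQL driver that opens a transaction on reads. Only the try/except around `db.rollback()` mirrors the PR.

```python
import sqlite3


class BrokenConnection:
    """Stand-in for a connection whose rollback fails (e.g. already closed)."""

    def rollback(self):
        raise RuntimeError("connection lost")


def rollback_request_transaction(db):
    """Roll back whatever the request left open; never raise from teardown."""
    try:
        db.rollback()
    except Exception:
        pass


def handle_request(db, handler):
    """Hypothetical request wrapper: run the handler, then always tear down."""
    try:
        return handler(db)
    finally:
        rollback_request_transaction(db)


def read_stats(db):
    # Simulate a read that leaves a transaction open, as a non-autocommit
    # PostgreSQL driver would after a SELECT.
    db.execute("BEGIN")
    return db.execute("SELECT 1").fetchone()[0]


conn = sqlite3.connect(":memory:")
result = handle_request(conn, read_stats)
open_after_request = conn.in_transaction

# A failing rollback is swallowed instead of masking the response.
fallback = handle_request(BrokenConnection(), lambda db: "ok")
```

Rolling back rather than committing is the conservative choice for read endpoints: nothing is persisted by accident, and the connection returns to an idle state before another request reuses it.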

Acceptance evidence

  • Local configured backend in this isolated workspace:
    • .venv/bin/python -c "from db import database; print(database.describe_backend())"
    • Result: {'backend': 'sqlite', 'database_url_configured': False, 'sqlite_path': '/home/ubuntu/Deepgraph/deepgraph.db', 'sqlite_exists': True, 'target': '/home/ubuntu/Deepgraph/deepgraph.db'}
  • Local /api/stats pressure check with initialized temporary SQLite DB:
    • 100 requests issued concurrently through a pool of 16 workers
    • Result: requests=100 status_200=100 failures=0
    • Result: p50_ms=80.77 p95_ms=254.30 max_ms=412.02
  • PostgreSQL idle-in-transaction evidence:
    • Not executed in this isolated workspace because no DEEPGRAPH_DATABASE_URL is configured, docker is not installed, and psql is not installed.
    • The code path required for the production check is covered by the teardown rollback regression test.
    • On deployment, run the SQL from the issue after applying read-request load:
      SELECT count(*) FROM pg_stat_activity WHERE state='idle in transaction';
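A pressure check of the kind reported above can be sketched with a thread pool. The PR does not show the script that produced its numbers, so everything here is an assumption: `pressure_check` is a hypothetical helper, `fake_stats_request` replaces the real HTTP call to /api/stats, and the percentile method (`statistics.quantiles`) may differ from the one actually used.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_stats_request():
    """Hypothetical stand-in for GET /api/stats; returns an HTTP status code."""
    time.sleep(0.001)
    return 200


def pressure_check(call, total=100, workers=16):
    """Issue `total` calls through a pool of `workers` threads and summarize."""

    def timed(_):
        start = time.perf_counter()
        try:
            status = call()
        except Exception:
            status = None  # count transport errors as failures
        return status, (time.perf_counter() - start) * 1000.0

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(timed, range(total)))

    latencies = sorted(ms for _, ms in results)
    cuts = statistics.quantiles(latencies, n=100)  # 99 percentile cut points
    ok = sum(1 for status, _ in results if status == 200)
    return {
        "requests": total,
        "status_200": ok,
        "failures": total - ok,
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "max_ms": latencies[-1],
    }


summary = pressure_check(fake_stats_request)
```

Swapping `fake_stats_request` for a function that performs the real request yields the same summary fields as the results quoted in this section.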

Non-goals

  • Did not change stats SQL/counting semantics.
  • Did not change database schema.
  • Did not touch agents/, orchestrator/, or contracts/ business logic.
  • Did not introduce a frontend framework or redesign the dashboard.

Copilot AI review requested due to automatic review settings June 8, 2026 21:10

Copilot AI left a comment


Pull request overview

This PR addresses dashboard backend stability issues by removing long-lived SSE connections that can exhaust waitress worker threads, and by adding request teardown rollback to prevent thread-local DB connections from remaining “idle in transaction”. It also adds localized tooltip titles for the 9 overview stat cards, plus regression tests to lock in the new behavior.

Changes:

  • Replaced /api/events SSE streaming with short-polling JSON (GET /api/events?since=<seq>).
  • Added @app.teardown_request rollback to close out any open transaction on reused thread-local connections.
  • Added data-i18n-title tooltips for all overview stat cards and corresponding en/zh i18n keys, with regression tests.
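The tooltip coverage check mentioned in the last bullet amounts to comparing the `data-i18n-title` keys in the template against each locale's translation table. The sketch below is illustrative only: `TitleKeyCollector`, `missing_tooltips`, the two-card template, and the key names and copy are all hypothetical and do not reproduce the real index.html or i18n.js.

```python
from html.parser import HTMLParser


class TitleKeyCollector(HTMLParser):
    """Collect every data-i18n-title value found in a template."""

    def __init__(self):
        super().__init__()
        self.keys = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-i18n-title":
                self.keys.append(value)


def missing_tooltips(template_html, translations):
    """Map each locale to the tooltip keys it lacks."""
    collector = TitleKeyCollector()
    collector.feed(template_html)
    return {
        locale: [key for key in collector.keys if key not in table]
        for locale, table in translations.items()
    }


# Hypothetical two-card template and key names; the real page has 9 cards.
template = (
    '<div class="stat" data-i18n-title="stat.papers.tip">Papers</div>'
    '<div class="stat" data-i18n-title="stat.nodes.tip">Nodes</div>'
)
translations = {
    "en": {"stat.papers.tip": "Papers ingested", "stat.nodes.tip": "Graph nodes"},
    "zh": {"stat.papers.tip": "已收录论文"},
}
gaps = missing_tooltips(template, translations)
```

A regression test in this style would assert that every locale maps to an empty list and that the collector found exactly 9 keys.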

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file

| File | Description |
| --- | --- |
| web/app.py | Replaces SSE with short-polling JSON events endpoint and adds teardown rollback. |
| web/static/js/app.js | Replaces EventSource client logic with periodic polling of /api/events?since=…. |
| web/templates/index.html | Adds data-i18n-title tooltip markers to all 9 overview stat cards. |
| web/static/js/i18n.js | Adds en/zh tooltip translation keys and content for the stat cards. |
| tests/test_web_app.py | Adds regression tests for the polling contract, teardown rollback, EventSource removal, and tooltip/i18n coverage. |

Comment thread web/app.py
Comment on lines +86 to +91
@app.teardown_request
def rollback_request_transaction(_exc):
    try:
        db.rollback()
    except Exception:
        pass
Comment thread web/app.py
Comment on lines +1159 to +1163
    """Short-poll pipeline events without holding a web worker thread."""
    since = max(0, request.args.get("since", 0, type=int) or 0)
    events = get_events(since)
    payload_events = json.loads(json.dumps(events, ensure_ascii=False, default=str))
    next_seq = since
Comment thread web/static/js/app.js
Comment on lines +179 to +184
  // ── Event Polling ────────────────────────────────────────────────────

- let sseRetryDelay = 2000;
-
- function startSSE() {
-   if (eventSource) {
-     try { eventSource.close(); } catch(e) {}
-     eventSource = null;
-   }
-   eventSource = new EventSource('/api/events');
-
-   eventSource.onopen = () => {
-     sseRetryDelay = 2000;
-   };
-
-   eventSource.onmessage = (msg) => {
-     try {
-       const ev = JSON.parse(msg.data);
+ async function fetchEvents() {
+   try {
+     const payload = await api(`/api/events?since=${eventsSince}`);
+     eventsSince = payload.next_seq || eventsSince;
@Protocol-zero-0 Protocol-zero-0 merged commit f32cbde into master Jun 8, 2026
3 checks passed
@Protocol-zero-0 Protocol-zero-0 deleted the feat/backend-stability branch June 9, 2026 19:00
Development

Successfully merging this pull request may close these issues.

Dashboard backend stability: fix two connection/thread leaks + hover descriptions for homepage metrics