102 changes: 102 additions & 0 deletions .planning/quick/8-issue-230-build-structured-diagnostic-lo/8-PLAN.md
@@ -0,0 +1,102 @@
---
phase: quick-8
plan: 8
type: quick-full
wave: 1
depends_on: []
files_modified:
- lib/agent-diagnostics.cjs
autonomous: true
requirements:
- Create lib/agent-diagnostics.cjs for structured diagnostic logging of agent executions
- Capture agent type, prompt hash, start/end timestamps, turn count, exit reason, output size, failure classification
- Write diagnostic entries to .mgw/diagnostics/<issue-number>-<timestamp>.json
- Include prune function to remove entries older than 30 days
- Logger must be non-blocking — failures in logging never halt the pipeline
- Integrate with agent failure taxonomy from lib/agent-errors.cjs
must_haves:
truths:
- lib/agent-diagnostics.cjs exports getDiagnosticsDir, createDiagnosticLogger, writeDiagnosticEntry, pruneDiagnostics, and readDiagnostics
- Diagnostic entries are JSON files at .mgw/diagnostics/<issue-number>-<timestamp>.json
- Each entry captures agent_type, prompt_hash, start_time, end_time, duration_ms, turn_count, exit_reason, output_size, and failure_classification
- The prune function removes entries older than 30 days
- All logging operations are wrapped in try/catch — failures never propagate
- The failure_classification field uses types from lib/agent-errors.cjs (AGENT_FAILURE_TYPES)
artifacts:
- lib/agent-diagnostics.cjs
key_links:
- lib/agent-errors.cjs (agent failure taxonomy — defines AGENT_FAILURE_TYPES, classifyAgentFailure)
- lib/errors.cjs (MgwError base class)
- lib/retry.cjs (pipeline retry infrastructure — classifyFailure)
- lib/logger.cjs (existing structured logging — pattern reference)
---

## Objective

Build `lib/agent-diagnostics.cjs` — a structured diagnostic logger for GSD agent executions. The module captures per-agent-invocation telemetry (timing, turns, output size, exit reason, failure classification) and persists it as individual JSON files under `.mgw/diagnostics/`. All operations are non-blocking; logging failures are silently swallowed to ensure the pipeline is never halted by diagnostic infrastructure.

## Context

- **Dependency:** This builds on issue #229's agent failure taxonomy in `lib/agent-errors.cjs`, which defines `AGENT_FAILURE_TYPES` and `classifyAgentFailure()`. The diagnostics logger uses these to classify failures in diagnostic entries.
- **Pattern reference:** `lib/logger.cjs` demonstrates the existing non-blocking logging pattern (try/catch swallowing, `.mgw/` directory structure).
- **Integration point:** The diagnostics module will be consumed by future hooks (issue #231) that instrument agent spawns.

## Tasks

### Task 1: Create lib/agent-diagnostics.cjs

**files:** `lib/agent-diagnostics.cjs`
**action:** Create the diagnostic logger module with these exports:

1. **`getDiagnosticsDir(repoRoot?)`** — Returns `.mgw/diagnostics/` path, creates directory if needed. Pattern follows `getLogDir()` from `lib/logger.cjs`.

2. **`createDiagnosticLogger(opts)`** — Factory that returns a logger instance bound to a specific agent invocation. Accepts:
- `agentType` (string) — GSD agent type (gsd-planner, gsd-executor, etc.)
- `issueNumber` (number) — GitHub issue being worked
- `promptHash` (string, optional) — Hash of the prompt sent to the agent
- `repoRoot` (string, optional) — Repo root for diagnostics dir

Returns an object with:
- `start()` — Records start_time
- `finish(result)` — Records end_time, calculates duration, writes entry
- `result` fields: `exitReason` (string), `turnCount` (number), `outputSize` (number), `error` (Error, optional)

3. **`writeDiagnosticEntry(entry, opts?)`** — Low-level write function. Writes a single diagnostic JSON to `.mgw/diagnostics/<issueNumber>-<timestamp>.json`. Fields:
- `agent_type`, `prompt_hash`, `start_time`, `end_time`, `duration_ms`
- `turn_count`, `exit_reason`, `output_size`
- `failure_classification` — null on success, or result of `classifyAgentFailure()` from `lib/agent-errors.cjs` on failure
- `issue_number`, `timestamp`

4. **`pruneDiagnostics(opts?)`** — Removes diagnostic files older than `maxAgeDays` (default 30). Scans `.mgw/diagnostics/`, parses filenames for timestamps, removes expired entries. Non-blocking.

5. **`readDiagnostics(opts?)`** — Read diagnostic entries with optional filters (issueNumber, agentType, since). Returns parsed JSON array sorted by timestamp descending.
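
The start/finish pattern from item 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's actual code: the `write` option is a hypothetical injection point standing in for `writeDiagnosticEntry`, and failure classification is reduced to the error message where the real module would call `classifyAgentFailure()`.

```javascript
'use strict';

/**
 * Sketch of a logger bound to one agent invocation.
 * `opts.write` is a hypothetical hook standing in for writeDiagnosticEntry.
 */
function createDiagnosticLogger(opts) {
  const { agentType, issueNumber, promptHash = null, write = () => true } = opts || {};
  let startedAt = null;

  return {
    start() {
      startedAt = new Date();
    },
    finish(result) {
      try {
        const r = result || {};
        const endedAt = new Date();
        const begin = startedAt || endedAt; // tolerate finish() without start()
        const entry = {
          agent_type: agentType,
          prompt_hash: promptHash,
          start_time: begin.toISOString(),
          end_time: endedAt.toISOString(),
          duration_ms: endedAt.getTime() - begin.getTime(),
          turn_count: r.turnCount ?? null,
          exit_reason: r.exitReason ?? null,
          output_size: r.outputSize ?? null,
          // Assumption: the real module calls classifyAgentFailure() here.
          failure_classification: r.error ? { message: String(r.error.message) } : null,
          issue_number: issueNumber,
          timestamp: endedAt.toISOString(),
        };
        return Boolean(write(entry));
      } catch (_err) {
        return false; // logging must never halt the pipeline
      }
    },
  };
}
```

A caller would invoke `start()` just before spawning the agent and `finish({ exitReason, turnCount, outputSize, error })` once it returns. Because `finish()` catches everything, a failing writer only produces a `false` return value.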

**verify:**
- Module loads without errors: `node -e "require('./lib/agent-diagnostics.cjs')"`
- All five exports exist and are functions
- Non-blocking: callers do not need to wrap calls in try/catch, because no export throws

**done:** Module file exists and exports all functions.
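
The export check in the verify step could be scripted along these lines. The helper name `missingExports` is hypothetical and exists only for this sketch. It takes the loaded module as an argument so it can be exercised without the real file.

```javascript
'use strict';

const EXPECTED_EXPORTS = [
  'getDiagnosticsDir',
  'createDiagnosticLogger',
  'writeDiagnosticEntry',
  'pruneDiagnostics',
  'readDiagnostics',
];

/** Returns the names from `names` that `mod` does not export as functions. */
function missingExports(mod, names = EXPECTED_EXPORTS) {
  return names.filter((name) => !mod || typeof mod[name] !== 'function');
}

// Usage against the real module (path as given in the plan):
// const missing = missingExports(require('./lib/agent-diagnostics.cjs'));
// if (missing.length > 0) throw new Error(`Missing exports: ${missing.join(', ')}`);
```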

## Verification

- [ ] `lib/agent-diagnostics.cjs` exists and loads cleanly
- [ ] Exports: `getDiagnosticsDir`, `createDiagnosticLogger`, `writeDiagnosticEntry`, `pruneDiagnostics`, `readDiagnostics`
- [ ] Diagnostic entry JSON schema matches spec (agent_type, prompt_hash, start_time, end_time, duration_ms, turn_count, exit_reason, output_size, failure_classification, issue_number, timestamp)
- [ ] Prune function defaults to 30-day retention
- [ ] All I/O operations wrapped in try/catch — never throws
- [ ] References `classifyAgentFailure` from `lib/agent-errors.cjs` for failure classification (graceful fallback if module not available)
- [ ] Follows existing lib/ conventions (JSDoc, 'use strict', module.exports)
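
To make the "never throws" item concrete, here is a hedged sketch of a write function that reports failure through its return value. The `opts.dir` option is an assumption made for the example. The plan itself derives the directory from `getDiagnosticsDir(repoRoot)`.

```javascript
'use strict';

const fs = require('node:fs');
const path = require('node:path');

/**
 * Sketch: write one entry as JSON and report success as a boolean.
 * `opts.dir` is a hypothetical option used to keep the example self-contained.
 */
function writeDiagnosticEntry(entry, opts) {
  try {
    const dir = (opts && opts.dir) || path.join(process.cwd(), '.mgw', 'diagnostics');
    fs.mkdirSync(dir, { recursive: true });
    // Colons and dots are replaced so the timestamp is safe in a file name.
    const stamp = String(entry.timestamp).replace(/[:.]/g, '-');
    const file = path.join(dir, `${entry.issue_number}-${stamp}.json`);
    fs.writeFileSync(file, JSON.stringify(entry, null, 2) + '\n', 'utf8');
    return true;
  } catch (_err) {
    return false; // swallow: diagnostics must not interrupt the pipeline
  }
}
```

Both a malformed entry and an unwritable directory end up in the same `catch`, so the caller only ever sees `true` or `false`.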

## Success Criteria

- The module is self-contained and requires no changes to existing files
- Callers can instrument agent executions with `createDiagnosticLogger()` start/finish pattern
- Diagnostic data persists across pipeline runs for observability
- The prune function prevents unbounded storage growth
- Zero risk of pipeline disruption from logging failures

## Output

- `lib/agent-diagnostics.cjs` — complete implementation
- `.planning/quick/8-issue-230-build-structured-diagnostic-lo/8-SUMMARY.md` — execution summary
@@ -0,0 +1,62 @@
## Summary

**One-liner:** Created `lib/agent-diagnostics.cjs` — a non-blocking diagnostic logger that captures per-agent-invocation telemetry and writes structured JSON entries to `.mgw/diagnostics/`.

### What Was Built

1. **`lib/agent-diagnostics.cjs`** — Complete diagnostic logger module with 7 exports:
- `getDiagnosticsDir(repoRoot?)` — Returns/creates `.mgw/diagnostics/` directory
- `createDiagnosticLogger(opts)` — Factory returning `{ start(), finish(result) }` logger bound to an agent invocation
- `writeDiagnosticEntry(entry, opts?)` — Low-level JSON file writer for diagnostic entries
- `pruneDiagnostics(opts?)` — Removes entries older than 30 days (configurable)
- `readDiagnostics(opts?)` — Query entries with filters (issueNumber, agentType, since, limit)
- `shortHash(input)` — SHA-256 utility for prompt hashing (12 hex chars)
- `DEFAULT_MAX_AGE_DAYS` — Constant (30)

### Diagnostic Entry Schema

Each JSON file at `.mgw/diagnostics/<issueNumber>-<timestamp>.json` contains:
- `agent_type` — GSD agent type (gsd-planner, gsd-executor, etc.)
- `prompt_hash` — 12-char SHA-256 hash of the prompt
- `start_time` / `end_time` — ISO timestamps
- `duration_ms` — Wall-clock execution time
- `turn_count` — Number of agent turns/iterations
- `exit_reason` — Why the agent stopped (success, error, timeout, etc.)
- `output_size` — Agent output size in bytes
- `failure_classification` — null on success, or classification from agent-errors.cjs
- `issue_number` — GitHub issue number
- `timestamp` — Entry creation timestamp
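
For illustration, an entry matching this schema might look like the object below. All values are made up for the example. The `shortHash` body is a sketch consistent with the description (first 12 hex characters of a SHA-256 digest), and returning `null` for empty input is an assumption about how the null handling works.

```javascript
'use strict';

const crypto = require('node:crypto');

/** First 12 hex characters of the SHA-256 digest, or null for empty input. */
function shortHash(input) {
  // Assumption: missing prompts hash to null rather than throwing.
  if (input === null || input === undefined || input === '') return null;
  return crypto.createHash('sha256').update(String(input)).digest('hex').slice(0, 12);
}

// Illustrative entry; every value below is invented for the example.
const exampleEntry = {
  agent_type: 'gsd-executor',
  prompt_hash: shortHash('example prompt text'),
  start_time: '2025-01-02T03:04:05.000Z',
  end_time: '2025-01-02T03:04:47.500Z',
  duration_ms: 42500,
  turn_count: 7,
  exit_reason: 'success',
  output_size: 18432,
  failure_classification: null,
  issue_number: 230,
  timestamp: '2025-01-02T03:04:47.500Z',
};
```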

### Key Design Decisions

1. **Graceful fallback for agent-errors.cjs:** Since PR #238 (issue #229) isn't merged yet, the classification function first tries `lib/agent-errors.cjs`, then falls back to `lib/retry.cjs`, and finally to a minimal built-in classification if neither module is available.

2. **Non-blocking guarantees:** Every public function wraps all I/O in try/catch blocks. `writeDiagnosticEntry()` returns `boolean`, `pruneDiagnostics()` returns a result object with error counts, `readDiagnostics()` returns empty arrays on failure. No function ever throws.

3. **File-per-entry storage:** Individual JSON files (not JSONL) enable per-entry deletion for pruning and straightforward reads, since each file holds exactly one entry and no shared log has to be split line by line.

4. **Filesystem-safe timestamps:** ISO timestamps in filenames have colons and dots replaced with hyphens.
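
The filename convention from decision 4 can be sketched as below. Both helper names are hypothetical and are not claimed to be exports of the module.

```javascript
'use strict';

/** Build `<issueNumber>-<timestamp>.json` with ':' and '.' replaced by '-'. */
function diagnosticFileName(issueNumber, date) {
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  return `${issueNumber}-${stamp}.json`;
}

/** Recover the issue number from a file name, or null if it does not match. */
function issueNumberFromFileName(name) {
  const match = /^(\d+)-.+\.json$/.exec(name);
  return match ? Number(match[1]) : null;
}
```

For example, issue 230 with the instant `2025-01-02T03:04:05.678Z` yields `230-2025-01-02T03-04-05-678Z.json`, which contains no characters that are problematic in file names on common platforms.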

### Files Created

| File | Lines | Purpose |
|------|-------|---------|
| `lib/agent-diagnostics.cjs` | 451 | Diagnostic logger module |

### Testing

Verified with 19 assertions covering:
- Hash generation (correct length, null handling)
- Directory creation
- Entry write/read round-trip
- Logger factory start/finish pattern
- Error classification integration (fallback path)
- Prune function (no false positives on recent entries)
- Non-blocking behavior on invalid inputs
- Filter functionality (agentType, issueNumber, limit)
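
The filter behavior exercised by these assertions can be sketched as a pure function over already-parsed entries. The name `filterDiagnostics` is hypothetical. In the module this logic would sit inside `readDiagnostics()` after the files have been read.

```javascript
'use strict';

/**
 * Sketch of the filtering step, applied to already-parsed entries.
 * Results are sorted by `timestamp` descending; `since` is a Date or ISO string.
 */
function filterDiagnostics(entries, opts = {}) {
  const { issueNumber, agentType, since, limit } = opts;
  const sinceMs = since === undefined ? null : new Date(since).getTime();

  const matched = entries
    .filter((e) => issueNumber === undefined || e.issue_number === issueNumber)
    .filter((e) => agentType === undefined || e.agent_type === agentType)
    .filter((e) => sinceMs === null || Date.parse(e.timestamp) >= sinceMs)
    .sort((a, b) => Date.parse(b.timestamp) - Date.parse(a.timestamp));

  return typeof limit === 'number' ? matched.slice(0, limit) : matched;
}
```

Because `filter` returns a new array, the sort does not reorder the caller's input.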

### Integration Notes

- Ready for issue #231 (diagnostic capture hooks) to instrument agent spawns
- When PR #238 merges, failure classification will automatically upgrade to use `classifyAgentFailure()` from `lib/agent-errors.cjs`
- Follows existing lib/ conventions: `'use strict'`, JSDoc, `module.exports` pattern
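
The automatic upgrade described above follows from resolving the classifier at load time. A minimal sketch, assuming the relative module paths shown and a made-up shape for the built-in fallback (`loadClassifier` is a hypothetical helper name):

```javascript
'use strict';

/** Try each module path in order; return the first classifier function found. */
function loadClassifier(candidates, requireFn = require) {
  for (const { path: modPath, exportName } of candidates) {
    try {
      const mod = requireFn(modPath);
      if (mod && typeof mod[exportName] === 'function') return mod[exportName];
    } catch (_err) {
      // Module missing or failed to load: try the next candidate.
    }
  }
  // Minimal built-in fallback (the returned shape is an assumption for this sketch).
  return (error) => ({ type: 'unknown', message: error ? String(error.message || error) : '' });
}

const classify = loadClassifier([
  { path: './agent-errors.cjs', exportName: 'classifyAgentFailure' },
  { path: './retry.cjs', exportName: 'classifyFailure' },
]);
```

Once `agent-errors.cjs` is present on the branch, the first candidate resolves and the later fallbacks are never reached, with no change needed at the call sites.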
@@ -0,0 +1,35 @@
## Verification Passed

### Must-Have Checks

| # | Must-Have | Status | Evidence |
|---|-----------|--------|----------|
| 1 | lib/agent-diagnostics.cjs exports getDiagnosticsDir, createDiagnosticLogger, writeDiagnosticEntry, pruneDiagnostics, readDiagnostics | PASS | Module loads, all 5 core functions + shortHash + DEFAULT_MAX_AGE_DAYS exported |
| 2 | Diagnostic entries are JSON files at .mgw/diagnostics/<issueNumber>-<timestamp>.json | PASS | writeDiagnosticEntry creates files with correct naming pattern |
| 3 | Each entry captures agent_type, prompt_hash, start_time, end_time, duration_ms, turn_count, exit_reason, output_size, failure_classification | PASS | All 9 fields present in written JSON + issue_number + timestamp |
| 4 | Prune function removes entries older than 30 days | PASS | pruneDiagnostics defaults to 30 days, uses file mtime for age calculation |
| 5 | All logging operations wrapped in try/catch — failures never propagate | PASS | Every public function has top-level try/catch; returns safe defaults on error |
| 6 | failure_classification uses types from lib/agent-errors.cjs | PASS | Graceful fallback: tries agent-errors.cjs first, then retry.cjs, then minimal |

### Artifact Checks

| Artifact | Exists | Valid |
|----------|--------|-------|
| lib/agent-diagnostics.cjs | Yes | 451 lines, loads without errors |

### Key Link Checks

| Link | Exists | Referenced Correctly |
|------|--------|---------------------|
| lib/agent-errors.cjs | On PR branch (not main) | Graceful require with fallback |
| lib/errors.cjs | Yes | Not directly required (via agent-errors.cjs) |
| lib/retry.cjs | Yes | Used as fallback classifier |
| lib/logger.cjs | Yes | Pattern reference (not imported) |

### Functional Verification

- 19/19 assertions passed
- Non-blocking behavior confirmed on null inputs and invalid paths
- Error classification fallback path confirmed working
- Read filtering (issueNumber, agentType, limit) verified
- Prune function correctly skips recent entries
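
The mtime-based pruning noted in the must-have table can be sketched as follows. The `dir` and `now` options and the exact shape of the result object are assumptions made for this example. The source only states that a result object with error counts is returned.

```javascript
'use strict';

const fs = require('node:fs');
const path = require('node:path');

const DEFAULT_MAX_AGE_DAYS = 30;

/**
 * Sketch: delete *.json files in `dir` whose mtime is older than maxAgeDays.
 * Returns counts instead of throwing. `dir` and `now` are hypothetical options.
 */
function pruneDiagnostics(opts = {}) {
  const result = { removed: 0, kept: 0, errors: 0 };
  try {
    const { dir, maxAgeDays = DEFAULT_MAX_AGE_DAYS, now = Date.now() } = opts;
    const cutoff = now - maxAgeDays * 24 * 60 * 60 * 1000;
    for (const name of fs.readdirSync(dir)) {
      if (!name.endsWith('.json')) continue;
      const file = path.join(dir, name);
      try {
        if (fs.statSync(file).mtimeMs < cutoff) {
          fs.unlinkSync(file);
          result.removed += 1;
        } else {
          result.kept += 1;
        }
      } catch (_err) {
        result.errors += 1; // skip this file, keep going
      }
    }
  } catch (_err) {
    result.errors += 1; // e.g. directory missing or unreadable
  }
  return result;
}
```

Counting per-file errors separately means one unreadable file does not stop the remaining files from being pruned.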