refactor(memory): append-only writes + retrieval-driven consolidation (fixes #34) by DeerGoat · Pull Request #41 · cyzus/suzent

DeerGoat · 2026-05-29T17:03:17Z

Summary

Closes #34. Supersedes #36.

Replaces the cosine-similarity dedup (_deduplicate_and_store_facts, fixed 0.85 threshold) that silently dropped factual updates with an append-only write path plus an autonomous "dream" consolidation agent that tidies the daily logs into a notebook wiki. Cosine measures topical proximity, not factual identity — so no threshold can fix it; the fix removes the write-time dedup entirely and moves consolidation off the critical path.
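To illustrate why no threshold can separate an update from a duplicate, here is a toy sketch. It uses a bag-of-words cosine rather than the project's real embedding model, and the two sentences are invented for illustration; only the 0.85 threshold comes from this PR. The assumption that a score at or above the threshold caused the new fact to be dropped is a simplified reading of the old behavior.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity over word counts (a stand-in for real embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical facts: the second one is a genuine update, not a duplicate.
OLD_FACT = "the user lives in paris with two cats"
NEW_FACT = "the user lives in berlin with two cats"

THRESHOLD = 0.85  # the fixed threshold removed by this PR

score = bow_cosine(OLD_FACT, NEW_FACT)
# Assumed old behavior: a score at or above the threshold counts as "duplicate".
would_be_dropped = score >= THRESHOLD
```

The two sentences share seven of eight words, so they score as near-identical even though the fact itself changed. Lowering the threshold only trades this failure for more stored duplicates.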

Full design + the multi-pass audit trail: docs/03-developing/memory-consolidation-plan.md.

What changed (3 phases)

Phase 1 — Append-only writes (the #34 fix)

Delete _deduplicate_and_store_facts + the 0.85 threshold (and the legacy process_message_for_memories / MarkdownIndexer). Facts are appended to the daily log and indexed per fact, with no write-time dedup — updates can no longer be dropped.
Platform guard in chat_processor: forked/system turns (platform="dream", sub-agents) never feed memory extraction or transcript indexing (fixes a latent sub-agent recursion too).

Phase 2 — Always-on notebook vault + watermark-aware indexing

Bootstrap the vault unconditionally at CONFIG.notebook_dir (zones + schema/index/log); the /mnt/notebook mount becomes an optional redirect.
CoreMemoryFileIndexer is now the sole LanceDB writer: indexes notebook pages, drops consolidated archives by a log.md watermark= token, skips tombstoned facts, constant importance (ranking = relevance + recency).
Mutation invariant: nothing mutates LanceDB directly. Deletion = edit file + reindex (or an indexer-consulted tombstone for immutable logs), so nothing resurrects.

Phase 3 — DreamRunner (autonomous wiki keeper)

A gated BaseBrain background agent (forked, file-tool-whitelisted) consolidates daily logs into the vault, resolving correction / state-change (keeps history) / duplicate / genuine-conflict (escalates) cases. It owns the watermark (runner-written, proof-of-work-gated), runs catch-up sprints for backlogs with retry-then-skip, regenerates MEMORY.md, and pauses the file watcher while it works.
POST /memory/consolidate triggers it on demand; POST /memory/reindex {clear_existing:true} rebuilds the index from files.

Verification

All changed/new modules compile and import cleanly in the project venv (no circular deps); suzent.server route wiring imports.
Memory test suite green (14 passed); MemoryExtractionResult test updated for the slimmed model.
Smoke-tested the pure logic: per-fact archive parsing (category + tags), watermark round-trip, tombstone normalize/match, recall log, vault bootstrap (zones + nav), notebook page listing.
Grep-verified no dangling references to any deleted symbol across src/.

Migration

After deploy, run POST /memory/reindex {"clear_existing": true} once to rebuild the index from files (memory + notebook). Raw daily logs are the immutable source of truth, so nothing is at risk.

Notes for review

The full design and every audit finding (C1–C5, M1–M5, NEW-1…11, §A/§B) are documented inline in the plan doc with the resolution for each.
pytest-asyncio is required to run the async memory tests locally (already a project dev need).

🤖 Generated with Claude Code

…fixes #34)

Replace the cosine-threshold dedup that silently dropped factual updates with an append-only write path plus an autonomous "dream" consolidation agent that tidies the daily logs into a notebook wiki.

Phase 1 - append-only writes (the #34 fix):
- Remove _deduplicate_and_store_facts and the 0.85 similarity threshold; facts are appended to the daily log and indexed per-fact, with no write-time dedup.
- Platform guard: forked/system turns (dream, sub-agents) never feed memory extraction or transcript indexing.

Phase 2 - always-on notebook vault + watermark-aware indexing:
- Bootstrap the vault unconditionally at CONFIG.notebook_dir (zones + nav files).
- CoreMemoryFileIndexer is the sole LanceDB writer: indexes notebook pages, drops consolidated archives by a log.md watermark, skips tombstoned facts, constant importance. Delete the broken MarkdownIndexer.
- Deletion = edit file + reindex / tombstone (never a direct LanceDB mutation).

Phase 3 - DreamRunner (autonomous wiki keeper):
- Gated background agent (time+volume / catch-up + retry-skip + proof-of-work) that consolidates daily logs into the vault, owns the watermark, and regenerates MEMORY.md. POST /memory/consolidate triggers it on demand.

Design + audit trail: docs/03-developing/memory-consolidation-plan.md

Closes #34

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

DeerGoat · 2026-06-09T16:52:17Z

Implementation walkthrough (review guide)

This documents the exact runtime behavior of the change so reviewers can follow the code. The conceptual design is in docs/03-developing/memory-consolidation-plan.md; this is the "what the code actually does" companion.

TL;DR data flow

chat turn ──> extract facts (LLM) ──> append to archive/YYYY-MM-DD.md ──> reindex that day's log
                                          (source of truth)                (LanceDB, per fact)

dream (gated, background) ──> forked agent reads archive logs ──> writes/updates vault pages
                                                              ──> runner writes watermark to log.md
                                                              ──> reindex vault + drop consolidated logs
                                                              ──> regenerate MEMORY.md

LanceDB is a disposable projection — every row is (re)written only by CoreMemoryFileIndexer. Nothing else mutates it.

1. Where things live on disk

| Path | What | Written by |
| --- | --- | --- |
| `<sandbox>/shared/memory/archive/YYYY-MM-DD.md` | append-only daily fact log (source of truth) | `markdown_store.append_daily_log` |
| `<sandbox>/shared/memory/{persona,user,MEMORY}.md` | always-visible core blocks | agent / `promote_memory_md` |
| `~/.suzent/notebook/` (`CONFIG.notebook_dir`) | the wiki vault: `schema.md`, `index.md`, `log.md`, zones `0_Inbox … 5_Archives` | `WikiManager` (bootstrap) + the dream agent |
| `~/.suzent/notebook/.state/recall_log.jsonl` | retrieval events (usage signal) | `markdown_store.append_recall` |
| `~/.suzent/notebook/.state/tombstones.jsonl` | user-deleted facts (normalized content) | `markdown_store.append_tombstone` |
| `~/.suzent/notebook/log.md` | journal; holds the `watermark=YYYY-MM-DD` token | `DreamRunner` (and manual `/ingest`) |
| `<sandbox>/shared/memory/.index_state.json` | per-file mtime state for the indexer | `CoreMemoryFileIndexer` |

/mnt/notebook is a default mount injected by config.get_effective_volumes() pointing at CONFIG.notebook_dir (a user-supplied /mnt/notebook mapping overrides it). The vault is always bootstrapped now (the old if notebook_host_path: gate in lifecycle.py is removed).

2. Write path (per turn) — `manager.process_conversation_turn_for_memories`

_extract_facts_llm(turn) → list of ExtractedFact (unchanged extraction).
_write_facts_to_markdown appends `- [category] content` lines (each followed by its tags) under a `## HH:MM — chatid` header to archive/<date>.md and returns the date. (`category or "general"` fixes the `None` bug.)
_core_indexer.reindex_file_now(label="archive", filename="<date>.md"):
- delete_memories_by_source_date(date) then re-parse the file into one row per fact line and embed each, then record the file's mtime.
- Delete-then-add → idempotent and race-free with the 300 s watcher (no duplicate rows even if both run).
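The delete-then-add step above can be sketched with an in-memory stand-in for the index. `ToyIndex`, `parse_fact_lines`, and the fact-line regex are hypothetical names for illustration, not the project's code, and tag parsing and embedding are left out.

```python
import re

# Assumed fact-line shape: "- [category] content" (tags ignored in this sketch).
FACT_LINE = re.compile(r"^- \[(?P<category>[^\]]+)\]\s+(?P<content>.+)$")


def parse_fact_lines(text: str) -> list:
    """Return one dict per fact line; headers and blank lines are skipped."""
    facts = []
    for line in text.splitlines():
        match = FACT_LINE.match(line.strip())
        if match:
            facts.append({"category": match["category"], "content": match["content"]})
    return facts


class ToyIndex:
    """In-memory stand-in for the vector store (no embeddings)."""

    def __init__(self) -> None:
        self.rows = []

    def delete_by_source_date(self, date: str) -> None:
        self.rows = [r for r in self.rows if r["source_date"] != date]

    def reindex_day(self, date: str, log_text: str) -> None:
        # Delete first, then add: running this twice yields the same rows.
        self.delete_by_source_date(date)
        for fact in parse_fact_lines(log_text):
            self.rows.append({**fact, "source_date": date})


LOG = "## 09:15 — chat-1\n- [career] started a new job\n- [general] prefers tea\n"

index = ToyIndex()
index.reindex_day("2026-06-09", LOG)
index.reindex_day("2026-06-09", LOG)  # e.g. the watcher fires for the same file
```

Because every reindex of a day starts by clearing that day's rows, the per-turn path and the watcher can both process the same file without leaving duplicates, provided the two runs are serialized.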

No similarity check, no dedup gate anywhere on this path. This is the fix for #34 ("[FEAT] Memory deduplication: fixed cosine threshold causes silent data loss").

⚠️ Cost to scrutinize: because we re-index the whole current-day log on every turn (not just the new line), a busy day re-embeds today's growing log repeatedly (~O(n²) embeddings per day). This was a deliberate simplification over per-line delta-indexing, which had a duplicate-row race with the watcher. The plan (§A/C4/NEW-3) describes the delta optimization as a clean follow-up. Embeddings use the cheap/local role model and this runs in the background post-turn job, but it is the main perf trade-off to weigh.
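A rough way to size that trade-off, assuming every turn adds the same number of facts and every reindex re-embeds the whole day (both simplifications; the function names are illustrative):

```python
def embeddings_full_reindex(turns: int, facts_per_turn: int = 1) -> int:
    """Embedding calls in one day when each turn re-embeds the whole log."""
    # After turn t the log holds t * facts_per_turn facts, all re-embedded.
    return sum(t * facts_per_turn for t in range(1, turns + 1))


def embeddings_delta(turns: int, facts_per_turn: int = 1) -> int:
    """Embedding calls if only the newly appended facts were embedded."""
    return turns * facts_per_turn


full_cost = embeddings_full_reindex(50)
delta_cost = embeddings_delta(50)
```

With 50 single-fact turns in a day, the full-reindex path performs n(n+1)/2 = 1275 embeddings against 50 for a delta path, which is the quadratic growth noted above.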

3. Indexing behavior — `CoreMemoryFileIndexer`

Sole LanceDB writer. Shared single instance (manager._core_indexer), used by the per-turn path, the background watcher, and the dream. An asyncio.Lock serializes all mutations.
Granularity: diary logs → one row per fact (_parse_archive_facts); notebook pages + core files → one row per paragraph chunk (_chunk_by_paragraphs).
Watermark-aware archives: reads the watermark= token from log.md. A log with date <= watermark is dropped from the index once (and skipped thereafter) — its facts now live in vault pages. Logs > watermark index normally.
Tombstone-skip: diary facts whose normalized content is tombstoned are not indexed (so a deleted log fact can't resurrect, even on a full rebuild).
Importance is a constant 0.5 for every row (ranking is relevance + recency; importance is no longer a tuned lever).
clear_and_full_reindex(...) wipes the user's rows and rebuilds from files (core + notebook + post-watermark archives). Backs POST /memory/reindex {clear_existing:true}.
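The watermark and tombstone rules above amount to two small filters. The sketch below is an illustration under stated assumptions: the last `watermark=` token in log.md is taken as current, normalization is lowercasing plus whitespace collapsing, and all function names are hypothetical rather than the indexer's real helpers.

```python
from __future__ import annotations

import re

WATERMARK_RE = re.compile(r"watermark=(\d{4}-\d{2}-\d{2})")


def read_watermark(log_md: str) -> str | None:
    """Return the most recently written watermark date, or None if absent."""
    matches = WATERMARK_RE.findall(log_md)
    return matches[-1] if matches else None  # assumption: last entry wins


def should_index_archive(date: str, watermark: str | None) -> bool:
    """Logs dated at or before the watermark are dropped from the index."""
    # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
    return watermark is None or date > watermark


def normalize(content: str) -> str:
    """Assumed normalization: lowercase and collapse whitespace."""
    return " ".join(content.lower().split())


def indexable_facts(facts: list[str], tombstones: set[str]) -> list[str]:
    """Skip facts whose normalized content has been tombstoned."""
    return [f for f in facts if normalize(f) not in tombstones]


journal = "## run 1\nwatermark=2026-06-01\n## run 2\nwatermark=2026-06-05\n"
current = read_watermark(journal)
kept = indexable_facts(
    ["User  likes Tea", "User has a dog"],
    {normalize("user likes tea")},
)
```

Since both filters are applied whenever files are read, a full rebuild from disk reproduces the same index without resurrecting consolidated or deleted facts.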

LanceDB row metadata schema (per source):

// diary fact            // notebook page chunk           // core file chunk
{ source_type:"archive_log",   { source_type:"notebook",        { source_type:"core_file",
  source_file:"2026-06-09.md",   source_file:"3_Personal/x.md",   source_file:"MEMORY.md",
  category:"career",             chunk_index, label:"notebook",   chunk_index, label:"facts",
  tags:["work"] }                category, tags }                 category, tags }

4. Retrieval

Unchanged in shape (retrieve_relevant_memories FTS hook per turn; search_memories hybrid for the agent tool), but the index surface is now core files + notebook pages + post-watermark daily logs. Both paths call _log_recalls(...), appending retrieved snippets to recall_log.jsonl (the promotion signal).
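As a sketch of what appending retrieval events to a JSONL file can look like: the record fields (`ts`, `query`, `snippet`) and the function names below are assumptions for illustration, since the PR does not spell out the record format.

```python
import json
import tempfile
import time
from pathlib import Path


def append_recall_sketch(log_path: Path, query: str, snippets: list) -> None:
    """Append one JSON line per retrieved snippet (hypothetical record shape)."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        for snippet in snippets:
            record = {"ts": time.time(), "query": query, "snippet": snippet}
            fh.write(json.dumps(record) + "\n")


def read_recalls(log_path: Path) -> list:
    """Load every recall event back, one dict per non-empty line."""
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demonstration in a throwaway directory, not the real vault.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".state" / "recall_log.jsonl"
    append_recall_sketch(path, "where do I work", ["started a new job"])
    append_recall_sketch(path, "drinks", ["prefers tea", "dislikes coffee"])
    events = read_recalls(path)
```

An append-only JSONL file keeps the write cheap on the retrieval path, at the price of unbounded growth until something truncates it.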

5. Consolidation — `DreamRunner` (`core/dream_runner.py`)

A BaseBrain started/stopped in lifecycle.init_memory_system / shutdown_memory_system (only when markdown + embedding are configured and memory_consolidation_enabled).

Gate (_tick, every interval_seconds, default 1800):

pending = archive dates > watermark and strictly before today UTC (never touches the in-progress day).
behind = len(pending) > max_days → catch-up sprint (ignores the daily/volume gate, one batch per tick).
else steady state: require now - last_attempt_at >= min_hours (in-memory attempt clock — not the watermark) and fact_lines(pending) >= min_facts.
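The gate above can be read as a pure decision function. This is a minimal sketch, not the `_tick` implementation: `GateConfig`, `pending_dates`, and `gate_decision` are hypothetical names, and treating a missing `last_attempt_at` as "time gate open" is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class GateConfig:
    min_hours: float = 24.0  # steady-state cadence
    min_facts: int = 20      # steady-state volume gate
    max_days: int = 14       # batch size; more pending days means catch-up


def pending_dates(archive_dates: list[str], watermark: str | None, today: str) -> list[str]:
    """Dates after the watermark and strictly before today (ISO strings)."""
    return sorted(
        d for d in archive_dates
        if (watermark is None or d > watermark) and d < today
    )


def gate_decision(pending: list[str], fact_lines: int, now: datetime,
                  last_attempt_at: datetime | None, cfg: GateConfig) -> str:
    """Return "catch_up", "steady", or "idle" for one tick."""
    if not pending:
        return "idle"
    if len(pending) > cfg.max_days:
        return "catch_up"  # backlog: ignore the time and volume gates
    # Assumption: no recorded attempt means the time gate is open.
    waited = last_attempt_at is None or (
        now - last_attempt_at >= timedelta(hours=cfg.min_hours)
    )
    return "steady" if waited and fact_lines >= cfg.min_facts else "idle"


cfg = GateConfig()
now = datetime(2026, 6, 9, 12, 0)
backlog = [f"2026-05-{day:02d}" for day in range(1, 16)]  # 15 pending days
```

Keeping the attempt clock separate from the watermark means a failed run still counts toward the cadence, so a no-op dream is not retried on every tick.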

Run (_run_dream, under a lock):

batch = pending[:max_days], w_new = batch[-1]. If this batch has failed >= max_retries times → skip it (advance watermark past it; its facts stay in the immutable log) so one bad day can't wedge the backlog.
Snapshot content-page mtimes; pause the watcher (a lifecycle.core_watcher_gate Event the watcher awaits).
Reset the persistent hidden system-dream chat (agent_state=b"", messages=[] — note: None means "no change" in update_chat, hence b"") and run a forked agent via ChatProcessor.process_turn_text with config {platform:"dream", memory_enabled:False, auto_approve_tools:True, tools:CONFIG.memory_dream_tools, static_instructions:DREAM_SYSTEM_PROMPT}, message = DREAM_INSTRUCTIONS.format(start=watermark, end=w_new), wrapped in asyncio.wait_for(timeout).
Resume the watcher.
Proof of work: if no content page changed → don't advance; bump the failure counter; back off. Otherwise: write the watermark=w_new entry to log.md, run promote_memory_md, then check_and_update (indexes new pages + drops archives <= w_new).
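The batch selection, retry-then-skip, and proof-of-work rules can be sketched as below. The helper names are hypothetical, mtimes are plain floats in a dict, and resetting the failure counter to zero after a successful run is an assumption not stated in the walkthrough.

```python
from __future__ import annotations


def select_batch(pending: list[str], max_days: int) -> list[str]:
    """Oldest pending days first, at most max_days per run."""
    return pending[:max_days]


def should_skip_batch(failures: int, max_retries: int) -> bool:
    """A batch that keeps producing no page changes is skipped, not retried."""
    return failures >= max_retries


def changed_pages(before: dict[str, float], after: dict[str, float]) -> set[str]:
    """Pages that are new or whose mtime differs from the snapshot."""
    return {path for path, mtime in after.items() if before.get(path) != mtime}


def settle_run(watermark: str | None, batch: list[str], failures: int,
               before: dict[str, float], after: dict[str, float]
               ) -> tuple[str | None, int]:
    """Return (watermark, failure_count) after the forked agent has run."""
    if not changed_pages(before, after):
        return watermark, failures + 1  # no proof of work: do not advance
    return batch[-1], 0  # advance to w_new; counter reset is an assumption


snapshot = {"3_Personal/x.md": 1.0}
batch = select_batch(["2026-06-02", "2026-06-03", "2026-06-04"], max_days=2)
noop = settle_run("2026-06-01", batch, 0, snapshot, dict(snapshot))
worked = settle_run("2026-06-01", batch, 1, snapshot, {"3_Personal/x.md": 2.0})
```

Requiring an observable page change before the watermark moves keeps a silent agent failure from marking unconsolidated days as done.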

Why it's confined: memory_dream_tools is file tools + MemorySearchTool only (AgentTool is never granted → no recursion); PathResolver limits writes to /shared + /mnt/notebook; memory_enabled:False means get_or_create_agent injects no memory context (clean prompt) and the _is_system_chat guard skips extraction/transcript for platform=="dream". The agent is bound to a local var per turn, so a concurrent user turn can't swap its toolset.

promote_memory_md: reads 3_Personal/*.md + recent recall snippets → one LLM call → writes MEMORY.md (capped at memory_max_lines). This replaces the old refresh_core_memory_facts once the dream is active (the old method is kept so MEMORY.md stays fresh between deploy and the first dream run).

POST /memory/consolidate → DreamRunner.force_run() (bypasses the time/volume gate, not the lock).

6. Deletion — `delete_archival_memory`

Never mutates LanceDB ad-hoc. Resolves the row via store.get_memory(id), then:

archive fact → append_tombstone(content) + reindex_file_now(archive) → the fact is gone from the index and can't resurrect.
notebook/core row → append_tombstone + best-effort delete_memory(id). ⚠️ Known limitation: a fact baked into a notebook page can reappear on a full reindex until the page itself is edited (queued for a dream/lint pass). Worth a reviewer's eye.
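The two deletion branches can be sketched as a small router. Everything here is illustrative: the row is a plain dict using the metadata keys from section 3, the callbacks stand in for the real indexer and store, and the normalization rule is assumed.

```python
from typing import Callable


def normalize(content: str) -> str:
    """Assumed normalization: lowercase and collapse whitespace."""
    return " ".join(content.lower().split())


def delete_row_sketch(row: dict, tombstones: set,
                      reindex_archive: Callable[[str], None],
                      delete_row: Callable[[str], None]) -> str:
    """Tombstone first, then route by where the row came from."""
    tombstones.add(normalize(row["content"]))
    if row["source_type"] == "archive_log":
        # The file stays immutable; the reindex drops the tombstoned fact.
        reindex_archive(row["source_file"])
        return "tombstone+reindex"
    try:
        delete_row(row["id"])  # best effort for notebook/core rows
    except Exception:
        pass  # the row may reappear on a full reindex until the page is edited
    return "tombstone+best-effort-delete"


reindexed, deleted, stones = [], [], set()
archive_result = delete_row_sketch(
    {"id": "a1", "content": "Prefers  Tea", "source_type": "archive_log",
     "source_file": "2026-06-09.md"},
    stones, reindexed.append, deleted.append,
)
page_result = delete_row_sketch(
    {"id": "n7", "content": "Has a dog", "source_type": "notebook",
     "source_file": "3_Personal/x.md"},
    stones, reindexed.append, deleted.append,
)
```

The archive branch is durable because the tombstone is consulted on every reindex; the notebook branch is only as durable as the page content, which is the limitation flagged above.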

7. The platform guard — `chat_processor._is_system_chat`

Reads the chat's config.platform; for {"dream","subagent","subagent_wakeup"} it skips both the transcript write (B1) and memory extraction (B2). This fixes the dream-recursion and also a pre-existing latent bug where sub-agents (which set memory_enabled:False) were still extracted because B2 gated only on global CONFIG.memory_enabled.
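A minimal sketch of the guard, assuming the chat config is a plain dict. The platform set comes from the text above; `post_turn_jobs` and its job labels are hypothetical and only illustrate that both B1 and B2 are skipped together.

```python
from typing import Optional

# Platforms named in the walkthrough as forked/system turns.
SYSTEM_PLATFORMS = frozenset({"dream", "subagent", "subagent_wakeup"})


def is_system_chat(chat_config: Optional[dict]) -> bool:
    """True for forked/system chats that must never feed memory."""
    return (chat_config or {}).get("platform") in SYSTEM_PLATFORMS


def post_turn_jobs(chat_config: Optional[dict], memory_enabled_globally: bool) -> list:
    """Hypothetical dispatcher: which post-turn jobs run for this chat."""
    if is_system_chat(chat_config):
        return []  # skip both the transcript write (B1) and extraction (B2)
    jobs = ["transcript"]
    if memory_enabled_globally:
        jobs.append("extraction")
    return jobs


dream_jobs = post_turn_jobs({"platform": "dream"}, memory_enabled_globally=True)
user_jobs = post_turn_jobs({"platform": "web"}, memory_enabled_globally=True)
```

Checking the per-chat platform before the global flag is what closes the sub-agent gap: a global `memory_enabled` can no longer pull a forked turn into extraction.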

8. Config knobs (all in `config/__init__.py`)

| Knob | Default | Meaning |
| --- | --- | --- |
| `notebook_dir` | `~/.suzent/notebook` | vault location |
| `memory_consolidation_enabled` | `True` | master switch for the dream |
| `memory_consolidation_min_hours` | `24.0` | steady-state cadence (per attempt) |
| `memory_consolidation_min_facts` | `20` | steady-state volume gate |
| `memory_consolidation_interval_seconds` | `1800` | how often the gate is checked |
| `memory_consolidation_timeout_seconds` | `600` | per-run agent timeout |
| `memory_consolidation_max_days` | `14` | batch size; backlog beyond this triggers catch-up |
| `memory_consolidation_max_retries` | `3` | no-op batches before skip |
| `memory_consolidation_memory_max_lines` | `200` | `MEMORY.md` cap |
| `memory_consolidation_model` | `None` | optional model override for the dream |
| `memory_dream_tools` | file tools + `MemorySearchTool` | dream whitelist |

9. File-by-file review map

| File | What to look at |
| --- | --- |
| `memory/manager.py` | append-only `process_conversation_turn_for_memories`; deletions; `promote_memory_md`; `_log_recalls`; `refresh_core_memory_facts` retained |
| `memory/indexer.py` | `MarkdownIndexer` removed; lock + `reindex_file_now` + `clear_and_full_reindex`; per-fact `_parse_archive_facts`; watermark + tombstone logic in `_check_and_update_impl`/`_reindex_file` |
| `memory/markdown_store.py` | vault/page helpers, `read_watermark`/`write_watermark_entry`, recall + tombstone helpers |
| `memory/wiki_manager.py` | always-on bootstrap + zones |
| `memory/lifecycle.py` | always-create vault; shared indexer; `core_watcher_gate`; start/stop `DreamRunner` |
| `core/dream_runner.py` | new: gate, catch-up, retry-skip, proof-of-work, forked agent |
| `core/chat_processor.py` | `_is_system_chat` guard on B1+B2 |
| `memory/lancedb_store.py` | `get_memory(id)` helper |
| `routes/{memory,session}_routes.py` | `/memory/consolidate`; file-edit delete; reindex delegates to the indexer |
| `memory/models.py` | `MemoryExtractionResult` slimmed (no `memories_created/updated`) |
| `config/__init__.py` | knobs + `/mnt/notebook` default mount |

10. Known limitations / things to weigh

Per-turn re-embed cost (§2 above) — the main perf trade; delta-indexing is the documented follow-up.
Transient duplicate window — a fresh fact lives in both the raw log and (after consolidation) a page until the watermark advances; strictly better than the old silent drop.
Notebook-page deletion is best-effort (§6).
recall_log.jsonl isn't truncated yet — it grows until a future cleanup (open question in the plan).
Single-user — one vault under CONFIG.user_id (pre-existing assumption).
Phase 3 is runtime-exercised only — imports, DB/agent signatures, and the chat-reset are verified; the full forked-agent run happens at runtime (gated, or via POST /memory/consolidate).

11. How to exercise manually

Deploy, then POST /memory/reindex {"clear_existing": true} (rebuild index from files).
Chat a few turns → check archive/<today>.md grows and memory_search returns the new facts (no drops).
POST /memory/consolidate → check the dream writes pages under ~/.suzent/notebook/, appends a watermark= line to log.md, and refreshes MEMORY.md.
Async memory tests need pytest-asyncio (uv pip install pytest-asyncio); tests/memory is green.
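The two HTTP calls in these steps can be built with the standard library. The base URL is an assumption (use your deployment's host and port), and sending an empty JSON object to /memory/consolidate is a guess at an acceptable body; only the paths and the `clear_existing` payload come from this PR.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: adjust to your server


def build_post(path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given route."""
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


reindex_request = build_post("/memory/reindex", {"clear_existing": True})
consolidate_request = build_post("/memory/consolidate")

# To actually send one against a running server:
#   with urllib.request.urlopen(reindex_request) as response:
#       print(response.status)
```

Building the request separately from sending it makes it easy to inspect the exact payload before pointing it at a real instance.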

DeerGoat and others added 4 commits May 30, 2026 00:54

32847cf refactor(memory): replace cosine-threshold dedup with append-only + consolidation architecture. This is a planning commit for discussion. No code changes yet. Closes #34

6c1a9dc Merge remote-tracking branch 'origin/main' into refactor/memory-architecture-append-only

baccbb4 Merge remote-tracking branch 'origin/main' into refactor/memory-architecture-append-only

DeerGoat marked this pull request as ready for review June 9, 2026 16:27

DeerGoat wants to merge 4 commits into main from refactor/memory-architecture-append-only
