Background
capture_lesson currently records a lesson with a static confidence field set at capture time. Subsequent references via vault_search, vault_query, or capture_lesson retrieval do NOT update any usage signal. After the fact, nothing distinguishes a high-impact recurrent lesson from a one-shot capture: both are ranked on BM25 keyword match alone.
At small corpus sizes (< ~100 lessons) this is fine. The knowledge vault currently holds 313 lessons across 18 projects (kubelab alone has 198). Ambiguous keyword searches will return multiple equally relevant matches, and BM25 alone won't surface the most-validated lesson first.
This feature was externally validated against rohitg00/agentmemory (TypeScript, 11.3k stars), which implements the same shape over SQLite. Their pivot = our axiom (dual-memory operational/crystallized split). Their reinforcement model is the part worth adopting.
Origin task tracked in the knowledge vault: SDD-036e. Vault project entry: 10_projects/hive/11-tasks.md § Feature requests § Lesson reinforcement counter with confidence decay (SDD-036e).
Proposed solution
Add a per-lesson reinforcements integer counter plus a confidence value that grows asymptotically toward 1.0, both updated on EVERY read access via any hive tool that surfaces the lesson body.
Algorithm
    # On every read access to a specific lesson (id-stable target):
    lesson.reinforcements += 1
    lesson.confidence = min(1.0, lesson.confidence + 0.1 * (1.0 - lesson.confidence))
    # Asymptotic decay: each touch closes 10% of the remaining gap to 1.0
    # After N touches starting at c0, c_N = 1 - (1-c0) * 0.9^N
    # c0=0.7 -> c_1=0.73 -> c_5=0.82 -> c_10=0.90 -> c_20=0.96
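The update rule and its closed form can be checked against each other in a few lines of Python. This is a minimal sketch, not hive code: the function names `reinforce`, `closed_form`, and `confidence_after` are hypothetical, and the 0.1 rate is taken from the rule above.

```python
def reinforce(confidence: float, rate: float = 0.1) -> float:
    """Apply one read access: close `rate` of the remaining gap to 1.0."""
    return min(1.0, confidence + rate * (1.0 - confidence))


def closed_form(c0: float, touches: int, rate: float = 0.1) -> float:
    """Confidence after `touches` reads: c_N = 1 - (1 - c0) * (1 - rate)^N."""
    return 1.0 - (1.0 - c0) * (1.0 - rate) ** touches


def confidence_after(c0: float, touches: int) -> float:
    """Apply the per-read rule `touches` times, starting from c0."""
    confidence = c0
    for _ in range(touches):
        confidence = reinforce(confidence)
    return confidence
```

Starting from c0 = 0.7, five reads give 1 - 0.3 * 0.9^5 = 0.822853, and the iterative and closed-form values agree up to floating-point error.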
Storage
Lessons live as markdown sections inside per-project 90-lessons.md. Section-level mutable state is fragile (whitespace edits, manual edits, git auto-backup re-commits). Use a sqlite side-table in hive's existing DB:
    CREATE TABLE lesson_reinforcement (
      lesson_id       TEXT PRIMARY KEY,            -- canonical id (e.g. project + heading slug)
      project         TEXT NOT NULL,
      heading         TEXT NOT NULL,               -- raw `### [date] title` heading
      reinforcements  INTEGER NOT NULL DEFAULT 0,
      confidence      REAL NOT NULL DEFAULT 0.7,
      first_seen      TEXT NOT NULL,               -- ISO date of capture
      last_referenced TEXT,                        -- ISO date of last read
      UNIQUE (project, heading)
    );
Markdown source stays canonical and human-readable; the table is a read-amplification index. Lazy init on first read for pre-existing lessons (no migration step).
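Lazy initialization and the per-read update could be combined in one transaction. The sketch below uses Python's standard `sqlite3` module and assumes the schema proposed above; `record_read` and its parameters are hypothetical names, and the caller is assumed to supply `first_seen` (for example, parsed from the heading date).

```python
import sqlite3
from datetime import date
from typing import Optional, Tuple

SCHEMA = """
CREATE TABLE IF NOT EXISTS lesson_reinforcement (
    lesson_id       TEXT PRIMARY KEY,
    project         TEXT NOT NULL,
    heading         TEXT NOT NULL,
    reinforcements  INTEGER NOT NULL DEFAULT 0,
    confidence      REAL NOT NULL DEFAULT 0.7,
    first_seen      TEXT NOT NULL,
    last_referenced TEXT,
    UNIQUE (project, heading)
);
"""


def record_read(
    conn: sqlite3.Connection,
    lesson_id: str,
    project: str,
    heading: str,
    first_seen: str,
    baseline: float = 0.7,
    today: Optional[str] = None,
) -> Tuple[int, float]:
    """Create the row if it is missing, then apply one reinforcement."""
    today = today or date.today().isoformat()
    with conn:  # commits on success, rolls back on error
        # Lazy init: a no-op when the lesson already has a row.
        conn.execute(
            "INSERT OR IGNORE INTO lesson_reinforcement "
            "(lesson_id, project, heading, confidence, first_seen) "
            "VALUES (?, ?, ?, ?, ?)",
            (lesson_id, project, heading, baseline, first_seen),
        )
        # The increment happens inside SQL, so no stale value is written back.
        conn.execute(
            "UPDATE lesson_reinforcement SET "
            "reinforcements = reinforcements + 1, "
            "confidence = MIN(1.0, confidence + 0.1 * (1.0 - confidence)), "
            "last_referenced = ? "
            "WHERE lesson_id = ?",
            (today, lesson_id),
        )
    return conn.execute(
        "SELECT reinforcements, confidence FROM lesson_reinforcement "
        "WHERE lesson_id = ?",
        (lesson_id,),
    ).fetchone()
```

Doing the arithmetic in the `UPDATE` itself, rather than reading the row into Python and writing it back, keeps each reinforcement a single statement.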
Read hooks needed
Each surface that exposes lesson-body text must increment exactly once per query (per-query dedup):
_workers.py capture_lesson retrieval branch: increment on lookup.
Wherever vault_search lives: when results include a 90-lessons.md hit, parse the heading and increment for that lesson.
vault_query (or equivalent): same shape when the queried file is a 90-lessons.md.
_workers.py capture_lesson insert branch: initialize the row at capture (baseline reinforcements=0, confidence=as-provided or 0.7 default).
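Per-query dedup can be sketched as collecting the distinct lesson ids from one query's hits before incrementing. Everything here is an assumption for illustration: the hit shape (`path` and `heading` keys), the `slugify` helper, and the id format (project directory name plus heading slug) are hypothetical and not taken from hive.

```python
import re
from pathlib import PurePosixPath
from typing import Callable, Dict, Iterable, List

LESSONS_FILENAME = "90-lessons.md"


def slugify(heading: str) -> str:
    """Lowercase the heading and collapse non-alphanumeric runs to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")


def lesson_ids_in(hits: Iterable[Dict[str, str]]) -> List[str]:
    """Return the distinct lesson ids among `hits`, in first-seen order."""
    seen: Dict[str, None] = {}
    for hit in hits:
        path = PurePosixPath(hit["path"])
        if path.name != LESSONS_FILENAME or not hit.get("heading"):
            continue  # only sections of a 90-lessons.md file count as lessons
        lesson_id = f"{path.parent.name}/{slugify(hit['heading'])}"
        seen.setdefault(lesson_id, None)
    return list(seen)


def reinforce_hits(
    hits: Iterable[Dict[str, str]], increment: Callable[[str], None]
) -> int:
    """Call `increment` exactly once per distinct lesson in this query."""
    ids = lesson_ids_in(hits)
    for lesson_id in ids:
        increment(lesson_id)
    return len(ids)
```

A lesson that appears in several hits of the same query is therefore counted once, while a second query that surfaces it again counts as a new reinforcement.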
Ranking integration
New optional param rank_by on vault_search (default unchanged for back-compat):
'bm25' — current behavior.
'reinforcements' — sort by counter desc, ties broken by recency.
'confidence' — sort by decayed confidence desc.
'hybrid' — alpha * bm25 + (1-alpha) * confidence, where alpha defaults to 0.7 (BM25-leaning).
Optionally extend the MCP return shape to include reinforcement metadata so an agent can rank externally if it prefers.
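The four modes can be sketched as sort keys over a list of hits. This is an illustration under stated assumptions: the `Hit` dataclass and `rank` function are hypothetical, BM25 scores are assumed to be non-negative with higher meaning better, and the hybrid mode divides BM25 by the top score in the result set so that it is on the same 0..1 scale as confidence (the proposal itself does not specify a normalization).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    lesson_id: str
    bm25: float                      # assumed: non-negative, higher is better
    reinforcements: int
    confidence: float
    last_referenced: Optional[str]   # ISO date, so string order is date order


def rank(hits: List[Hit], rank_by: str = "bm25", alpha: float = 0.7) -> List[Hit]:
    """Return `hits` ordered best-first according to `rank_by`."""
    if rank_by == "bm25":
        key = lambda h: h.bm25
    elif rank_by == "reinforcements":
        # Ties on the counter are broken by the most recent read.
        key = lambda h: (h.reinforcements, h.last_referenced or "")
    elif rank_by == "confidence":
        key = lambda h: h.confidence
    elif rank_by == "hybrid":
        top = max((h.bm25 for h in hits), default=0.0) or 1.0
        key = lambda h: alpha * (h.bm25 / top) + (1 - alpha) * h.confidence
    else:
        raise ValueError(f"unknown rank_by: {rank_by!r}")
    return sorted(hits, key=key, reverse=True)
```

With the default alpha of 0.7 the keyword score dominates, so confidence mostly reorders hits whose BM25 scores are close.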
Acceptance criteria
capture_lesson records a new lesson and inserts a baseline lesson_reinforcement row.
vault_search with rank_by='reinforcements' reorders results by counter, ties broken by recency.
A lesson captured at the default confidence 0.7 and read 5 times has reinforcements=5 and confidence ≈ 0.8229 (1 - 0.3 * 0.9^5), which verifies decay correctness within 0.001 tolerance.
Concurrent test: 10 parallel reads of the same lesson yield final counter = 10 (sqlite WAL + existing vault_write_lock ensure no race).
Pre-existing lessons (no row) default lazily on first read to reinforcements=0 and confidence=the lesson's frontmatter value, falling back to 0.7. No migration step needed.
Backwards compat: existing callers of vault_search without rank_by get unchanged BM25 ordering.
README "Lesson capture" section updated.
CHANGELOG entry for v1.12.X.
Tests added under tests/test_lesson_reinforcement.py (or equivalent).
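The concurrency criterion could be exercised along these lines. This is a sketch under assumptions, not the project's test: `write_lock` is a plain `threading.Lock` standing in for hive's existing vault_write_lock, the table is reduced to the columns the test needs, and `bump` and `run_parallel_reads` are hypothetical names.

```python
import os
import sqlite3
import tempfile
import threading

# Stand-in for hive's vault_write_lock (assumption: it serializes writers).
write_lock = threading.Lock()


def bump(db_path: str, lesson_id: str) -> None:
    """Simulate one read access from its own thread and connection."""
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        with write_lock, conn:
            conn.execute(
                "UPDATE lesson_reinforcement SET "
                "reinforcements = reinforcements + 1, "
                "confidence = MIN(1.0, confidence + 0.1 * (1.0 - confidence)) "
                "WHERE lesson_id = ?",
                (lesson_id,),
            )
    finally:
        conn.close()


def run_parallel_reads(n: int = 10) -> int:
    """Run `n` parallel reads of one lesson and return the final counter."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "hive.db")
        conn = sqlite3.connect(db_path)
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute(
            "CREATE TABLE lesson_reinforcement ("
            "lesson_id TEXT PRIMARY KEY, "
            "reinforcements INTEGER NOT NULL DEFAULT 0, "
            "confidence REAL NOT NULL DEFAULT 0.7)"
        )
        conn.execute(
            "INSERT INTO lesson_reinforcement (lesson_id) VALUES ('demo/lesson')"
        )
        conn.commit()
        threads = [
            threading.Thread(target=bump, args=(db_path, "demo/lesson"))
            for _ in range(n)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        count = conn.execute(
            "SELECT reinforcements FROM lesson_reinforcement"
        ).fetchone()[0]
        conn.close()
        return count
```

Because the increment is a single `UPDATE` executed under the lock, ten parallel calls leave the counter at 10.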
Why it matters
Without this, the lesson corpus accumulates as a flat index — every captured lesson has equal weight regardless of whether it's a well-validated heuristic referenced dozens of times or a captured-and-never-touched one-shot. The agent can't preferentially surface lessons that have proven useful.
With it, vault_search ranking gains a usage signal independent of recency or keyword density. High-confidence recurrent lessons surface first when relevant; flukes drop. This is the primary mechanism that lets a large lesson corpus stay useful without manual curation.
Effort
~3-4h. Breakdown:
~30 min: sqlite schema + migration check.
~60 min: read hooks across 3 surfaces with per-query dedup.
rank_by param + ranking integration.
Cross-references
10_projects/knowledge/11-tasks.md § SDD-036e.
10_projects/hive/11-tasks.md § Feature requests § Lesson reinforcement counter with confidence decay (SDD-036e).
10_projects/knowledge/90-lessons.md § [2026-05-17] agentmemory validation.
Pattern doc: 00_meta/patterns/pattern-memory-consolidation.md (currently labels Confidence-decay as "future work"; flip to "implemented" when this lands).
rohitg00/agentmemory consolidation-pipeline.ts.
Re-activation criterion (per knowledge vault SDD-036e defer)
This feature was deferred-with-criterion in the knowledge vault on 2026-05-18. Conditions to re-activate:
⏳ Retrieval-ranking pressure surfaces in practice (e.g. capture_lesson searches returning >5 matches where the current top result is consistently wrong vs human-ranking). Until #2 manifests, manual curation + [[wikilink]] graph is sufficient; implementing pre-empts pull, which is an anti-pattern in our model.