From 331631f097b2bca67843ba03cda0fee70e34f667 Mon Sep 17 00:00:00 2001
From: Jack Arturo
Date: Mon, 2 Mar 2026 21:11:51 +0530
Subject: [PATCH] docs(bench): add PR #73, #80, and #87 experiment results

Records evaluation outcomes from initial experiment round:
- #73 min_score threshold: neutral (needs #78 for score differentiation)
- PR #80 enhanced recall: blocked by merge conflicts, needs rebase
- PR #87 write-time dedup: neutral on recall as expected

Made-with: Cursor
---
 benchmarks/EXPERIMENT_LOG.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/benchmarks/EXPERIMENT_LOG.md b/benchmarks/EXPERIMENT_LOG.md
index e4e677ea..a384584c 100644
--- a/benchmarks/EXPERIMENT_LOG.md
+++ b/benchmarks/EXPERIMENT_LOG.md
@@ -20,6 +20,9 @@ on the snapshot-based bench infrastructure (PR #97, merged 2026-03-02).
 | Date | Issue/PR | Branch | LoCoMo-mini | LoCoMo-full | LME-mini | Notes |
 |------|----------|--------|-------------|-------------|----------|-------|
 | 2026-03-02 | baseline | main | 76.97% (234/304) | 80.06% (1590/1986) | -- | Voyage 4, 1024d. Health: DEGRADED (low score variance) |
+| 2026-03-02 | #73 | exp/73-min-score-threshold | 76.97% (+0.0) | -- | -- | min_score + adaptive floor. No regression. Needs #78 for impact |
+| 2026-03-02 | PR #80 | jescalan/feat/enhanced-recall | BLOCKED | -- | -- | Merge conflicts with main (recall.py), needs rebase before eval |
+| 2026-03-02 | PR #87 | jescalan/feat/write-time-dedup | 76.97% (+0.0) | -- | -- | Write-time dedup gate. Neutral on recall (expected) |
 
 ## How to add an entry