
feat(recall): add metadata sidecar search#177

Merged: jack-arturo merged 4 commits into main from feat/metadata-sidecar-search on Jun 10, 2026

jack-arturo (Member) commented Jun 10, 2026

Summary

  • Add a bounded evidence-based metadata sidecar candidate channel to /recall, controlled by RECALL_METADATA_SEARCH_ENABLED.
  • Search whitelisted metadata values without replacing content vectors, generating production tags, mutating graph content, or changing the HTTP API.
  • Use field words only for scoring/disambiguation; this is not keyword-intent routing.
  • Document metadata storage, recall response shape, update/enrichment behavior, consolidation notes, and search behavior for issue #110 (End-to-end metadata audit: storage, consolidation, search, and recall).
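
For readers skimming the summary, here is a minimal sketch of what a flag-gated, whitelist-only metadata match could look like. It is not the code from this PR: `ALLOWED_FIELDS`, `flag_enabled`, `tokenize`, and `metadata_overlap` are hypothetical names, the field list is invented for illustration, and the flag's real default is whatever automem/config.py defines.

```python
import os
import re
from typing import Any, Dict, Set

# Hypothetical whitelist for illustration; the real one lives in the PR's helpers.
ALLOWED_FIELDS = {"project", "source", "author"}


def flag_enabled(name: str, default: bool = True) -> bool:
    """Read a boolean env flag; the default here is an assumption, not the PR's."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def tokenize(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def metadata_overlap(query: str, metadata: Dict[str, Any]) -> float:
    """Fraction of query tokens found in whitelisted string metadata values."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    found: Set[str] = set()
    for field, value in metadata.items():
        # Non-whitelisted fields and non-string values never contribute.
        if field not in ALLOWED_FIELDS or not isinstance(value, str):
            continue
        found |= tokenize(value) & query_tokens
    return len(found) / len(query_tokens)
```

A caller would check `flag_enabled("RECALL_METADATA_SEARCH_ENABLED")` before running the match at all, so disabling the flag leaves the existing recall path untouched.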

Breaking Changes

None.

Validation

  • .venv/bin/python -m pytest -q -> 465 passed, 12 skipped.
  • Real-data eval: data/sweep_runs/20260610T174320Z-metadata-server-metadata-search.
  • Metadata hit@5: 0.140 -> 0.770; MRR: 0.126 -> 0.672.
  • Qdrant vectors stayed byte-identical: 9,921 baseline and 9,921 candidate vectors.
  • Preserve/mixed suite had zero non-OK statuses.
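
For context on the two headline metrics, a minimal sketch of how hit@k and MRR are conventionally computed. This assumes one expected memory ID per query and an untruncated MRR; the eval harness in verygoodplugins/automem-evals may define either metric differently, and the function names below are illustrative.

```python
from typing import Sequence


def hit_at_k(ranked: Sequence[Sequence[str]], expected: Sequence[str], k: int = 5) -> float:
    """Share of queries whose expected ID appears in the top-k results."""
    if not expected:
        return 0.0
    hits = sum(1 for ids, target in zip(ranked, expected) if target in ids[:k])
    return hits / len(expected)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], expected: Sequence[str]) -> float:
    """Average of 1/rank of the expected ID; a miss contributes 0."""
    if not expected:
        return 0.0
    total = 0.0
    for ids, target in zip(ranked, expected):
        ids = list(ids)
        if target in ids:
            total += 1.0 / (ids.index(target) + 1)
    return total / len(expected)
```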

Eval harness PR: verygoodplugins/automem-evals#10

Closes #110

Copilot AI (Contributor) left a comment

Pull request overview

Adds a bounded “metadata sidecar” candidate channel to /recall, allowing recall ranking to consider whitelisted memory metadata values (without mutating stored vectors/content) and documenting the end-to-end metadata behavior.

Changes:

  • Introduces graph-side metadata candidate search and wires it into recall behind RECALL_METADATA_SEARCH_ENABLED.
  • Extends scoring to account for match_type="metadata" via SEARCH_WEIGHT_METADATA.
  • Adds tests and documentation for metadata lifecycle and recall-side behavior.
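
As a rough illustration of the scoring change, the sketch below adds a weighted metadata component to a final score. The weight values, the component names other than "metadata", and both function names are assumptions for this example; the actual formula is in automem/utils/scoring.py and the actual weight comes from SEARCH_WEIGHT_METADATA.

```python
from typing import Dict, Optional

# Hypothetical weights; in the PR the metadata weight is SEARCH_WEIGHT_METADATA.
WEIGHTS: Dict[str, float] = {"vector": 0.6, "keyword": 0.3, "metadata": 0.1}


def metadata_component(match_type: str, overlap: float) -> float:
    """Only results admitted via the metadata channel get a metadata score."""
    return overlap if match_type == "metadata" else 0.0


def final_score(components: Dict[str, float], weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted sum of score components; unknown components get weight 0."""
    weights = WEIGHTS if weights is None else weights
    return sum(weights.get(name, 0.0) * value for name, value in components.items())
```

With these made-up weights, a result with a vector score of 0.5 and a full metadata match scores 0.6 * 0.5 + 0.1 * 1.0 = 0.4; setting the metadata weight to 0 reproduces the pre-change ranking.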

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/test_metadata_recall_search.py | Adds coverage for metadata sidecar admission, whitelisting rules, and scoring behavior. |
| docs/METADATA_BEHAVIOR.md | Documents current metadata storage/recall/update/enrichment/consolidation behavior and new sidecar search rules. |
| docs/ENVIRONMENT_VARIABLES.md | Documents RECALL_METADATA_SEARCH_ENABLED env var for enabling/disabling the sidecar channel. |
| docs/API.md | Notes metadata sidecar recall behavior and the controlling env var. |
| automem/utils/scoring.py | Adds a metadata score component and includes it in final scoring. |
| automem/search/runtime_recall_helpers.py | Implements metadata parsing/tokenization/matching and graph query for metadata sidecar candidates. |
| automem/runtime_wiring.py | Wires the metadata search function into blueprint registration. |
| automem/config.py | Adds RECALL_METADATA_SEARCH_ENABLED and SEARCH_WEIGHT_METADATA configuration. |
| automem/api/runtime_recall_routes.py | Passes the metadata search function through the runtime recall route DI. |
| automem/api/runtime_bootstrap.py | Threads metadata search function into blueprint registration. |
| automem/api/recall.py | Runs metadata sidecar search during recall and incorporates results into scoring/ranking. |
| app.py | Exposes _metadata_keyword_search as a runtime attribute for wiring. |

Comment thread automem/search/runtime_recall_helpers.py
Comment thread automem/search/runtime_recall_helpers.py Outdated
Comment thread automem/config.py
jack-arturo and others added 3 commits June 10, 2026 20:52
- Fetch relations only for kept metadata candidates (at most limit)
  instead of every admitted row (up to the scan limit) on each recall
- Drop the subsumed branch in _metadata_value_has_strong_evidence and
  the unused include_people parameter; mark METADATA_SKIP_FIELDS as a
  whitelist safety guard
- Document SEARCH_WEIGHT_METADATA in ENVIRONMENT_VARIABLES.md and
  CLAUDE.md so operators can discover and lab-sweep it
- Add regression test bounding fetch_relations calls to returned results

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
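
The first bullet of the commit above (fetch relations only for kept candidates) can be sketched as follows. This is an illustration under assumptions, not the PR's code: `attach_relations`, the candidate dict shape, and the single-argument `fetch_relations` signature are all hypothetical.

```python
from typing import Callable, Dict, List


def attach_relations(
    candidates: List[Dict],
    limit: int,
    fetch_relations: Callable[[str], List[str]],
) -> List[Dict]:
    """Rank first, truncate to `limit`, then fetch relations for survivors only."""
    kept = sorted(candidates, key=lambda c: c["score"], reverse=True)[:limit]
    # fetch_relations runs at most `limit` times, not once per admitted row.
    return [{**c, "relations": fetch_relations(c["id"])} for c in kept]
```

Truncating before the fetch is what lets a regression test bound the number of `fetch_relations` calls by the number of returned results.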
CI pins black 24.4.2 and isort 5.13.2 via pre-commit; local black 24.8.0
formats these two spots differently. Reformatted with the pinned hooks.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jack-arturo jack-arturo merged commit 4e7956e into main Jun 10, 2026
7 checks passed
@jack-arturo jack-arturo deleted the feat/metadata-sidecar-search branch June 10, 2026 19:27