docs: fix drift in Hybrid Search by jack-arturo · Pull Request #165 · verygoodplugins/automem-website

jack-arturo · 2026-06-08T23:15:34Z

Claim → current state → fix

| # | Claim in docs | Current state in code | Fix |
|---|---|---|---|
| 1 | Entity expansion step 2: "extract entities using `extract_entities()`" | `extract_entities()` does not exist anywhere in the codebase. The recall pipeline calls `_extract_entities_from_results()` (`automem/api/recall.py`), which reads existing `entity:*` tags from seed-result metadata; it does not run a fresh NER pass | Replaced `extract_entities()` with `_extract_entities_from_results()` and clarified that it reads existing entity tags |
| 2 | Entity expansion step 3: "Convert extracted entities to `entity:<type>:<slug>` tags" | Entity names are already stored as `entity:` tags; the function extracts the names from those tags and then converts them to search tag patterns (e.g. `"Sarah"` → `entity:person:sarah`); the old wording implied fresh tag creation | Updated to "Convert extracted entity names to `entity:<type>:<slug>` search tags" |

Verified against: automem@ed36b98e3e1569dde71aa430417b6549520f7068 (automem/api/recall.py — _extract_entities_from_results(), _expand_entity_memories())

Generated by Claude Code

Entity expansion step 2 referenced `extract_entities()` which does not exist. The actual function called by the recall pipeline is `_extract_entities_from_results()` (automem/api/recall.py), which reads existing `entity:*` tags from seed-result metadata rather than running a fresh NER pass; step 3 description updated accordingly. Verified against: automem@ed36b98e3e1569dde71aa430417b6549520f7068

Copilot

Pull request overview

Updates the Hybrid Search documentation to accurately describe how entity expansion works in the recall pipeline, aligning the narrative with the implementation details referenced in the PR description.

Changes:

- Replaces the non-existent `extract_entities()` reference with `_extract_entities_from_results()` and clarifies it reads existing `entity:*` tags.
- Clarifies that extracted entity names are converted into `entity:<type>:<slug>` search tags (not newly created entity tags).


jack-arturo · 2026-06-08T23:18:41Z

+2. For each seed result, extract entity names from existing `entity:*` tags using `_extract_entities_from_results()`
+3. Convert extracted entity names to `entity:<type>:<slug>` search tags
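
The two revised steps can be illustrated with a minimal sketch. This is not the code in `automem/api/recall.py`; the helper names, the `tags` key on each seed result, and the slug rule (lowercase, non-alphanumeric runs collapsed to `-`) are assumptions made purely for illustration.

```python
import re


def entity_names_from_tags(tags):
    """Collect (type, name) pairs from tags shaped like entity:<type>:<name>.

    Hypothetical helper: the stored tag layout is assumed, not taken from automem.
    """
    pairs = []
    for tag in tags:
        parts = tag.split(":", 2)
        if len(parts) == 3 and parts[0] == "entity" and parts[1] and parts[2]:
            pairs.append((parts[1], parts[2]))
    return pairs


def slugify(name):
    """Lowercase the name and collapse non-alphanumeric runs to a single hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def expansion_tags(seed_results):
    """Build de-duplicated entity:<type>:<slug> search tags from seed results."""
    seen = []
    for result in seed_results:
        for entity_type, name in entity_names_from_tags(result.get("tags", [])):
            search_tag = f"entity:{entity_type}:{slugify(name)}"
            if search_tag not in seen:
                seen.append(search_tag)
    return seen


# Hypothetical seed results; no NER pass runs here, only existing tags are read.
seed_results = [
    {"id": "m1", "tags": ["entity:person:Sarah", "project"]},
    {"id": "m2", "tags": ["entity:tool:Postgres DB", "entity:person:Sarah"]},
]
print(expansion_tags(seed_results))
```

Under these assumptions, non-entity tags such as `project` are ignored and a repeated entity yields a single search tag.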


`extract_entities()` does exist: it is defined in `automem/utils/entity_extraction.py` and is used by the enrichment pipeline to run spaCy NLP on newly stored memory content. The references in `overview.md` (rows/diagram for "Enrichment Pipeline") and `architecture/enrichment.md` (enrichment worker sequence) describe that flow correctly and need no change.

This PR fixes a different path: entity expansion during recall. When a user calls `/recall?expand_entities=true`, the recall pipeline reads back entity tags already stored on memories (created earlier by the enrichment step) via `_extract_entities_from_results()`; it does not call the NLP extractor again. The old `extract_entities()` reference in `hybrid-search.md` was conflating these two operations.
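
For readers who want to exercise the recall path described above, here is a minimal sketch of building such a request URL. Only the `/recall` path and the `expand_entities=true` flag come from this thread; the base URL and the `query` parameter name are assumptions and may not match a real deployment.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with the address of your own instance.
BASE_URL = "http://localhost:8001"


def recall_url(query, expand_entities=True):
    """Build a /recall URL with entity expansion toggled on or off.

    The `query` parameter name is an assumption for illustration.
    """
    params = {
        "query": query,
        "expand_entities": "true" if expand_entities else "false",
    }
    return f"{BASE_URL}/recall?{urlencode(params)}"


print(recall_url("What does Sarah work on?"))
```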

Generated by Claude Code

cloudflare-workers-and-pages · 2026-06-08T23:18:52Z

Deploying automem-website with Cloudflare Pages

Latest commit:	`6425111`
Status:	✅ Deploy successful!
Preview URL:	https://962fe5cb.automem-website.pages.dev
Branch Preview URL:	https://docs-audit-hybrid-search-202-sgyh.automem-website.pages.dev


Copilot AI review requested due to automatic review settings June 8, 2026 23:15

Copilot started reviewing on behalf of jack-arturo June 8, 2026 23:15

jack-arturo enabled auto-merge June 8, 2026 23:15

Copilot AI reviewed Jun 8, 2026


jack-arturo added this pull request to the merge queue Jun 8, 2026

Merged via the queue into main with commit 24024d5 Jun 8, 2026
5 checks passed

jack-arturo deleted the docs/audit-hybrid-search-20260608 branch June 8, 2026 23:25

jack-arturo merged 1 commit into main from docs/audit-hybrid-search-20260608
