recall: tags act as score booster rather than scope filter, causing high-importance off-topic memories to dominate #130

## Problem

When a user passes `tags: ["project-name"]` to the recall API, the intent is **scoping** — "search within this context." But the current implementation treats tags as a **score component**, meaning high-importance tag-matched memories can outrank topically relevant results even when they're not about the query subject.

### Example

```
# Query: "Forge chat app shipped"
# tags: ["flint"]
```

Expected: Results about the Forge chat app, filtered to Flint-related context.  
Actual: High-importance `flint`-tagged memories (e.g. "critical lessons" with importance 0.9+) outscore vector matches for "Forge", returning irrelevant but tag-matched content.

Removing the `tags` parameter entirely returns 4/5 Forge-relevant results. This confirms that vector search works correctly and that tag scoring is what overrides it.

## Root cause

In the scoring pipeline, tag matches contribute a score boost that compounds with importance weighting. For memories tagged with a frequently-used project tag and high importance scores, this boost can exceed the vector similarity contribution of genuinely relevant but lower-importance memories.
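To make the failure mode concrete, here is a minimal sketch with made-up numbers. The scoring formula, the weights `W_VECTOR` and `W_TAG`, and the `Memory` fields are all assumptions for illustration; the real pipeline's formula is not reproduced here. It only shows how an importance-scaled tag boost can outweigh a large gap in vector similarity.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only to illustrate the failure mode.
W_VECTOR = 0.6
W_TAG = 0.6


@dataclass
class Memory:
    text: str
    similarity: float  # vector similarity to the query, in [0, 1]
    importance: float  # stored importance, in [0, 1]
    tag_match: bool    # True if the memory carries a requested tag


def blended_score(m: Memory) -> float:
    # Assumed formula: the tag boost is scaled by importance,
    # so the two compound for high-importance tagged memories.
    tag_boost = W_TAG * m.importance if m.tag_match else 0.0
    return W_VECTOR * m.similarity + tag_boost


relevant = Memory("Forge chat app shipped", similarity=0.80, importance=0.20, tag_match=True)
off_topic = Memory("critical lessons", similarity=0.20, importance=0.95, tag_match=True)

# Both memories match the tag, but the off-topic one wins on importance alone
# (about 0.69 versus about 0.60 with these numbers).
ranked = sorted([relevant, off_topic], key=blended_score, reverse=True)
```

With these numbers, a memory four times less similar to the query still ranks first, which matches the behaviour described in the example.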

## Two design options

### Option A: Tags as hard pre-filter (recommended for explicit tag queries)

Filter the candidate pool to tag-matched memories **first**, then run vector search within that subset. Tags become a scope limiter, not a score component.

- ✅ Matches user intent when tags are explicitly passed
- ✅ Simple to implement
- ⚠️ Requires consistent tagging — if coverage is sparse, valid results get excluded
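
A rough sketch of Option A, under stated assumptions: `Memory`, `recall_prefiltered`, and the precomputed `similarity` field are hypothetical stand-ins for the real store and vector search, and "match" is taken to mean "shares at least one requested tag". The point is only the order of operations: scope first, then rank by similarity alone.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    text: str
    similarity: float  # stand-in for the vector search score against the query
    importance: float = 0.5
    tags: set[str] = field(default_factory=set)


def recall_prefiltered(memories, tags, limit=5):
    """Restrict candidates to tag-matched memories, then rank by similarity only."""
    wanted = set(tags)
    # Assumption: a memory matches if it shares at least one requested tag.
    # No tags requested means no scoping at all.
    scoped = [m for m in memories if wanted & m.tags] if wanted else list(memories)
    # Importance and tags play no part in the ordering inside the scope.
    return sorted(scoped, key=lambda m: m.similarity, reverse=True)[:limit]


store = [
    Memory("Forge chat app shipped", 0.80, importance=0.20, tags={"flint"}),
    Memory("critical lessons", 0.20, importance=0.95, tags={"flint"}),
    Memory("Forge roadmap", 0.75, importance=0.50, tags={"other-project"}),
]
results = recall_prefiltered(store, tags=["flint"])
```

Here the untagged-for-`flint` memory is excluded outright, and the high-importance "critical lessons" entry ranks below the Forge result because only similarity orders the scoped pool. The same exclusion is what makes sparse tagging risky.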

### Option B: Tags as soft re-rank with a relevance floor

Keep tag contribution in scoring, but require a minimum vector similarity threshold before tag weight applies. A memory must be at least somewhat topically relevant before tags can boost its rank.

- ✅ More forgiving of inconsistent tagging
- ✅ Gracefully degrades when no tag-matched results are relevant
- ⚠️ More complex, harder to reason about scoring behaviour
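
A minimal sketch of Option B, again with hypothetical names and numbers: `rerank_with_floor`, the `floor` and `tag_weight` defaults, and the additive score are assumptions, not the project's actual scoring. The tag boost is applied only when similarity clears the floor.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    text: str
    similarity: float  # stand-in for the vector search score against the query
    importance: float = 0.5
    tags: set[str] = field(default_factory=set)


def floored_score(m, wanted, floor, tag_weight):
    # The boost applies only to memories that are already somewhat on-topic.
    if m.similarity >= floor and wanted & m.tags:
        return m.similarity + tag_weight * m.importance
    return m.similarity


def rerank_with_floor(memories, tags, floor=0.5, tag_weight=1.0, limit=5):
    wanted = set(tags)
    return sorted(
        memories,
        key=lambda m: floored_score(m, wanted, floor, tag_weight),
        reverse=True,
    )[:limit]


store = [
    Memory("Forge chat app shipped", 0.80, importance=0.20, tags={"flint"}),
    Memory("critical lessons", 0.20, importance=0.95, tags={"flint"}),
]
without_floor = rerank_with_floor(store, ["flint"], floor=0.0)
with_floor = rerank_with_floor(store, ["flint"], floor=0.5)
```

With `floor=0.0` the sketch reproduces the current problem (the off-topic memory ranks first); with `floor=0.5` the off-topic memory keeps its bare similarity and drops below the Forge result. Choosing the floor value is the part that makes this option harder to reason about.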

## Recommendation

Use Option A for **explicit** tag queries (the user-passed `tags` parameter) and Option B for **implicit/ambient** tag context (system-injected context tags). These map onto separate parameters: `tags` (hard pre-filter) and `context_tags` (soft boost). That split is already the documented intent of the two parameters; it is just not implemented that way.
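
One way the two parameters could combine in a single call, sketched under assumptions: the `recall` signature, the `floor` and `tag_weight` defaults, and the `Memory` type below are illustrative and not the existing API. Only the parameter names `tags` and `context_tags` come from this issue.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    text: str
    similarity: float  # stand-in for the vector search score against the query
    importance: float = 0.5
    tags: set[str] = field(default_factory=set)


def recall(memories, tags=(), context_tags=(), floor=0.5, tag_weight=0.2, limit=5):
    hard, soft = set(tags), set(context_tags)
    # `tags`: hard pre-filter. Memories outside the scope are never candidates.
    pool = [m for m in memories if not hard or hard & m.tags]

    def score(m):
        # `context_tags`: soft boost, gated by a minimum similarity.
        boosted = m.similarity >= floor and bool(soft & m.tags)
        return m.similarity + (tag_weight * m.importance if boosted else 0.0)

    return sorted(pool, key=score, reverse=True)[:limit]


store = [
    Memory("Forge chat app shipped", 0.85, importance=0.30, tags={"flint"}),
    Memory("critical lessons", 0.20, importance=0.95, tags={"flint"}),
    Memory("Forge roadmap", 0.80, importance=0.50, tags={"other-project"}),
]
explicit = recall(store, tags=["flint"])
ambient = recall(store, context_tags=["flint"])
```

In this sketch an explicit `tags` call drops "Forge roadmap" entirely, while an ambient `context_tags` call keeps it and only nudges on-topic `flint` memories upward.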

## Related

- Surfaced during debugging session with Jason Coleman and Flint ([@flintfromthebasement](https://github.com/flintfromthebasement))
- The keyword scoring fix in #128 exposed this as a separate, deeper architectural issue
- Noted as out of scope for PR fixing #128 — separate PR warranted
