
Auto Invoke

Varun Pratap Bhardwaj edited this page Mar 30, 2026 · 1 revision

Auto-Invoke: Multi-Signal Automatic Memory Surfacing

Auto-Invoke replaces the v3.1 AutoRecall system with a multi-signal scorer that combines four retrieval signals into a single ranking score. Memories surface automatically when context triggers them, gated by a Feeling-of-Knowing (FOK) threshold that rejects noise.

Every stored fact also receives a contextual description --- a one-sentence explanation of WHY this memory would be useful in the future, following the A-MEM pattern from agiresearch/A-mem-sys.

How It Works

When you start a session or issue a recall query, the Auto-Invoke engine:

  1. Gets candidates via VectorStore KNN (or BM25 fallback in Mode A)
  2. Scores each candidate across 4 signals: similarity, recency, frequency, and trust
  3. Applies FOK gating --- rejects any result scoring below 0.12
  4. Enriches results with contextual descriptions and returns ranked memories

The result is injected into your AI's context automatically. You do not need to call any tool manually.

Four-Signal Scoring

Each candidate memory receives a composite score from four independent signals:

score = 0.40 * similarity + 0.25 * recency + 0.20 * frequency + 0.15 * trust
| Signal | Weight | Source | Range | Cold Start Default |
|---|---|---|---|---|
| Similarity | 0.40 | VectorStore KNN cosine distance | [0, 1] | 0.0 (no embedding) |
| Recency | 0.25 | exp(-0.01 * seconds_since_last_access) from fact_access_log | [0.01, 1.0] | 0.1 |
| Frequency | 0.20 | log1p(access_count) / log1p(max_access_count) | [0, 1] | 0.0 |
| Trust | 0.15 | Bayesian Beta distribution via TrustScorer | [0, 1] | 0.5 (uniform prior) |
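The per-signal transforms and the composite formula above can be sketched in Python. Function and variable names here are illustrative, not SLM's actual API; the transforms follow the table's definitions.

```python
import math

# Default 4-signal weights (sum to 1.0).
WEIGHTS = {"similarity": 0.40, "recency": 0.25, "frequency": 0.20, "trust": 0.15}

def recency_signal(seconds_since_last_access: float) -> float:
    """exp(-0.01 * seconds), floored at the documented range minimum of 0.01."""
    return max(0.01, math.exp(-0.01 * seconds_since_last_access))

def frequency_signal(access_count: int, max_access_count: int) -> float:
    """log1p-normalized access count in [0, 1]."""
    if max_access_count <= 0:
        return 0.0
    return math.log1p(access_count) / math.log1p(max_access_count)

def composite_score(similarity: float, recency: float, frequency: float,
                    trust: float, weights: dict = WEIGHTS) -> float:
    """Weighted sum of the four signals, each already in [0, 1]."""
    return (weights["similarity"] * similarity
            + weights["recency"] * recency
            + weights["frequency"] * frequency
            + weights["trust"] * trust)
```

Because the weights sum to 1.0 and each signal lies in [0, 1], the composite score is also bounded by [0, 1], which is what makes a fixed FOK threshold meaningful.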

Weights are initial heuristics derived from Zep/Hindsight multi-signal ranking consensus. They are configurable, and the existing AdaptiveLearner (LightGBM ranker) learns optimal weights from user feedback after deployment.

Quality target: MRR > 0.6 on a gold-standard fixture, exceeding cosine-only retrieval by at least 10%.

ACT-R Mode (Optional)

As an alternative to the 4-signal model, Auto-Invoke supports a 3-signal mode based on the ACT-R cognitive architecture (Anderson & Lebiere 1998). ACT-R's base-level activation combines recency and frequency into a single cognitively grounded signal:

B_i = ln( SUM_k (t_k)^(-d) )

Where t_k is the time since the k-th access and d = 0.5 is the power-law decay exponent. The base-level activation is then sigmoid-normalized to [0, 1].

In ACT-R mode, scoring uses three signals:

score = 0.40 * similarity + 0.35 * base_level + 0.25 * trust

This is useful when you want a model that more closely mirrors human memory retrieval patterns, where recently and frequently accessed items have a unified advantage.
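The ACT-R computation above can be sketched as follows, assuming each memory's access history is available as seconds-ago values. Names are illustrative, not SLM's actual API.

```python
import math

def base_level_activation(seconds_since_each_access: list, d: float = 0.5) -> float:
    """B_i = ln( sum_k t_k^(-d) ); each t_k must be > 0 seconds."""
    return math.log(sum(t ** (-d) for t in seconds_since_each_access))

def sigmoid(x: float) -> float:
    """Squash raw activation (which is unbounded) into [0, 1]."""
    return 1.0 / (1.0 + math.exp(-x))

def act_r_score(similarity: float, base_level_norm: float, trust: float) -> float:
    """3-signal ACT-R mode; base_level_norm is the sigmoid-normalized activation."""
    return 0.40 * similarity + 0.35 * base_level_norm + 0.25 * trust

# Usage: a memory accessed 1 minute, 1 hour, and 1 day ago.
b = sigmoid(base_level_activation([60.0, 3600.0, 86400.0]))
score = act_r_score(0.8, b, 0.5)
```

Note how a single recent access and many old accesses can produce similar activation: the power-law decay gives recency and frequency a unified effect, which is the point of this mode.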

FOK Gating (Noise Rejection)

The Feeling-of-Knowing gate comes from the SYNAPSE spreading activation research (arXiv 2601.02744). It is a minimum score threshold (default: 0.12) below which results are rejected entirely.

# Results below this threshold are discarded
gated_results = [r for r in scored_results if r.score >= 0.12]

This prevents low-confidence memories from polluting the AI's context window. The threshold is configurable but should not be set below 0.01.

Contextual Descriptions

Every fact stored in SLM receives a contextual description --- a one-sentence explanation of why this memory would be useful, not just what it says. This follows the A-MEM pattern.

Mode A (rules-based, no LLM):

A deterministic template generates the description from fact metadata:

"This episodic event about React, March 15 records factual information
observed on 2026-03-15 regarding Team decided to use React for the frontend."

Keywords are extracted from entities and content tokens.
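A rules-based template of this shape could look like the sketch below. The field names are assumptions for illustration, not SLM's actual schema; the output format mirrors the example above.

```python
def mode_a_description(fact_type: str, topic: str, observed_on: str,
                       content: str) -> str:
    """Deterministic Mode A contextual description; no LLM involved."""
    return (
        f"This {fact_type} about {topic} records factual information "
        f"observed on {observed_on} regarding {content}."
    )

desc = mode_a_description(
    "episodic event", "React, March 15", "2026-03-15",
    "Team decided to use React for the frontend",
)
```

Because the template is deterministic, the same fact always yields the same description, which keeps Mode A reproducible and LLM-free at the cost of less natural phrasing than Mode B/C.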

Mode B/C (LLM-generated):

An LLM writes a one-sentence explanation focused on what the memory reveals about the person, project, or decision --- not what it literally says. The LLM also extracts 3--5 keywords for improved searchability.

Contextual descriptions are stored in the fact_context table and used by both the Auto-Invoke scorer and the Consolidation engine.

Mode A Degradation

When running in Mode A without an embedding provider (no sentence-transformers installed), the similarity signal is unavailable. Auto-Invoke redistributes weights:

MODE_A_NO_EMBEDDINGS_WEIGHTS = {
    "similarity": 0.00,   # Disabled
    "recency":    0.40,   # Promoted to primary signal
    "frequency":  0.35,   # Promoted
    "trust":      0.25,   # Promoted
}

Candidates are retrieved via BM25 text search instead of vector KNN. All other functionality (FOK gating, contextual descriptions, result enrichment) works identically.

If Mode A has sentence-transformers installed, all four signals activate normally with default weights.
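The degradation logic reduces to selecting a weight set based on embedding availability. A sketch, where the availability check (probing for sentence-transformers at runtime) is an assumption about how SLM detects the provider:

```python
DEFAULT_WEIGHTS = {
    "similarity": 0.40, "recency": 0.25, "frequency": 0.20, "trust": 0.15,
}
MODE_A_NO_EMBEDDINGS_WEIGHTS = {
    "similarity": 0.00, "recency": 0.40, "frequency": 0.35, "trust": 0.25,
}

def has_embedding_provider() -> bool:
    """Assumption: availability is detected by probing for sentence-transformers."""
    try:
        import sentence_transformers  # noqa: F401
        return True
    except ImportError:
        return False

def select_weights(embeddings_available: bool) -> dict:
    """Redistribute weight away from similarity when no embedding provider exists."""
    return DEFAULT_WEIGHTS if embeddings_available else MODE_A_NO_EMBEDDINGS_WEIGHTS
```

Both weight sets sum to 1.0, so scores remain comparable against the same FOK threshold regardless of which mode is active.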

Configuration

Enable Auto-Invoke:

slm config set auto_invoke.enabled true

Switch to ACT-R scoring mode:

slm config set auto_invoke.use_act_r true

Adjust FOK threshold:

slm config set auto_invoke.fok_threshold 0.15

Adjust signal weights (must sum to 1.0):

slm config set auto_invoke.weights.similarity 0.35
slm config set auto_invoke.weights.recency 0.30
slm config set auto_invoke.weights.frequency 0.20
slm config set auto_invoke.weights.trust 0.15
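Since the weights must sum to 1.0, a quick sanity check before applying a custom set might look like this (an illustrative helper, not a built-in SLM command):

```python
import math

def validate_weights(weights: dict) -> None:
    """Raise if auto_invoke weights do not sum to 1.0 (within float tolerance)."""
    total = sum(weights.values())
    if not math.isclose(total, 1.0, rel_tol=0.0, abs_tol=1e-9):
        raise ValueError(f"auto_invoke weights sum to {total}, expected 1.0")

# The custom set from the commands above sums to 1.0 and passes.
validate_weights(
    {"similarity": 0.35, "recency": 0.30, "frequency": 0.20, "trust": 0.15}
)
```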

Full Configuration Reference

| Parameter | Default | Description |
|---|---|---|
| enabled | false | Feature flag. Must be true to activate. |
| weights.similarity | 0.40 | Weight for embedding cosine similarity |
| weights.recency | 0.25 | Weight for time-since-last-access |
| weights.frequency | 0.20 | Weight for access count |
| weights.trust | 0.15 | Weight for Bayesian trust score |
| use_act_r | false | Use 3-signal ACT-R mode instead of 4-signal |
| act_r_decay | 0.5 | Power-law decay exponent for ACT-R base-level |
| fok_threshold | 0.12 | Minimum score to pass FOK gate |
| max_memories_injected | 10 | Maximum memories returned per invocation |
| include_archived | false | Include archived/cold facts in results |

MCP Tool: smart_recall

v3.2 adds a new smart_recall MCP tool that exposes multi-signal scoring directly. The existing recall tool is unchanged.

{
  "tool": "smart_recall",
  "arguments": {
    "query": "What authentication method does the project use?",
    "max_results": 5,
    "include_context": true,
    "use_act_r": false
  }
}

Response includes per-signal score breakdowns:

{
  "success": true,
  "results": [
    {
      "fact_id": "f_abc123",
      "content": "Project uses JWT tokens with 24-hour expiry...",
      "score": 0.73,
      "signals": {
        "similarity": 0.85,
        "recency": 0.62,
        "frequency": 0.45,
        "trust": 0.80
      },
      "contextual_description": "This decision about JWT authentication records the team's auth strategy chosen for security and session management."
    }
  ],
  "scoring_mode": "4-signal"
}

Hook Taxonomy

Auto-Invoke operates through three distinct hook points:

| Hook | System | Trigger | What Runs |
|---|---|---|---|
| Claude Code SessionStart | External shell | New Claude session | session_init MCP -> AutoInvoker.invoke() |
| Engine pre-hook recall | Internal Python | Every engine.recall() | AutoInvoker enhances query context |
| Claude Code PostToolUse | External shell | After every MCP tool | slm observe -> auto-capture |

Backward Compatibility

  • AutoRecall (v3.1) is preserved and used when auto_invoke.enabled=false (default)
  • The existing recall MCP tool is unchanged
  • session_init returns the same fields, with optional new fields added
  • smart_recall is a new, additive tool --- it does not replace recall

Part of Qualixar | Created by Varun Pratap Bhardwaj
