Auto-Invoke
Auto-Invoke replaces the v3.1 AutoRecall system with a multi-signal scorer that combines four retrieval signals into a single ranking score. Memories surface automatically when context triggers them, gated by a Feeling-of-Knowing (FOK) threshold that rejects noise.
Every stored fact also receives a contextual description --- a one-sentence explanation of WHY this memory would be useful in the future, following the A-MEM pattern from agiresearch/A-mem-sys.
When you start a session or issue a recall query, the Auto-Invoke engine:
- Gets candidates via VectorStore KNN (or BM25 fallback in Mode A)
- Scores each candidate across 4 signals: similarity, recency, frequency, and trust
- Applies FOK gating --- rejects any result scoring below 0.12
- Enriches results with contextual descriptions and returns ranked memories
The result is injected into your AI's context automatically. You do not need to call any tool manually.
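The steps above can be sketched in a few lines of Python. This is an illustrative model, not the actual SLM implementation: the `Memory` dataclass and the function names are assumptions, and the weights are the documented defaults.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    content: str
    similarity: float  # VectorStore KNN signal (0.0 when no embeddings)
    recency: float
    frequency: float
    trust: float

# Default 4-signal weights from the scoring model below
WEIGHTS = {"similarity": 0.40, "recency": 0.25, "frequency": 0.20, "trust": 0.15}

def composite_score(m: Memory) -> float:
    return sum(weight * getattr(m, name) for name, weight in WEIGHTS.items())

def auto_invoke(candidates, fok_threshold=0.12, max_results=10):
    """Score candidates, apply the FOK gate, and return ranked (memory, score) pairs."""
    scored = [(m, composite_score(m)) for m in candidates]
    gated = [(m, s) for m, s in scored if s >= fok_threshold]  # FOK gate
    gated.sort(key=lambda pair: pair[1], reverse=True)
    return gated[:max_results]
```

A candidate scoring 0.0475 is silently discarded by the gate, while one scoring 0.66 is returned first; only gated, ranked results ever reach the context window.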
Each candidate memory receives a composite score from four independent signals:
score = 0.40 * similarity + 0.25 * recency + 0.20 * frequency + 0.15 * trust
| Signal | Weight | Source | Range | Cold Start Default |
|---|---|---|---|---|
| Similarity | 0.40 | VectorStore KNN cosine distance | [0, 1] | 0.0 (no embedding) |
| Recency | 0.25 | `exp(-0.01 * seconds_since_last_access)` from `fact_access_log` | [0.01, 1.0] | 0.1 |
| Frequency | 0.20 | `log1p(access_count) / log1p(max_access_count)` | [0, 1] | 0.0 |
| Trust | 0.15 | Bayesian Beta distribution via TrustScorer | [0, 1] | 0.5 (uniform prior) |
Weights are initial heuristics derived from Zep/Hindsight multi-signal ranking consensus. They are configurable, and the existing AdaptiveLearner (LightGBM ranker) learns optimal weights from user feedback after deployment.
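For concreteness, the recency and frequency signals from the table above could be computed as follows. This is a sketch of the documented formulas; the parameter names are illustrative, not SLM internals.

```python
import math
import time

def recency_signal(last_access_ts, now=None):
    """exp(-0.01 * seconds_since_last_access), floored at 0.01 per the table."""
    now = time.time() if now is None else now
    return max(0.01, math.exp(-0.01 * (now - last_access_ts)))

def frequency_signal(access_count, max_access_count):
    """log1p(access_count) / log1p(max_access_count); 0.0 on cold start."""
    if max_access_count <= 0:
        return 0.0  # no accesses recorded anywhere yet
    return math.log1p(access_count) / math.log1p(max_access_count)
```

A fact accessed just now scores 1.0 on recency; one untouched for hours bottoms out at the 0.01 floor. The most-accessed fact scores 1.0 on frequency, and everything else is scaled logarithmically against it.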
Quality target: MRR > 0.6 on a gold-standard fixture, exceeding cosine-only retrieval by at least 10%.
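MRR (Mean Reciprocal Rank) is the average of 1/rank of the first relevant result across queries. A minimal sketch, independent of any SLM internals:

```python
def mean_reciprocal_rank(ranked_lists, relevant_ids):
    """Average reciprocal rank of the first relevant hit per query (0 if missed)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_ids):
        rr = 0.0
        for rank, fact_id in enumerate(ranked, start=1):
            if fact_id == relevant:
                rr = 1.0 / rank
                break
        total += rr
    return total / len(relevant_ids)
```

If the relevant fact comes second for one query and first for another, MRR is (0.5 + 1.0) / 2 = 0.75.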
As an alternative to the 4-signal model, Auto-Invoke supports a 3-signal mode based on the ACT-R cognitive architecture (Anderson & Lebiere 1998). ACT-R's base-level activation combines recency and frequency into a single cognitively-grounded signal:
B_i = ln( SUM_k (t_k)^(-d) )
Where t_k is the time since the k-th access and d = 0.5 is the power-law decay exponent. The base-level activation is then sigmoid-normalized to [0, 1].
In ACT-R mode, scoring uses three signals:
score = 0.40 * similarity + 0.35 * base_level + 0.25 * trust
This is useful when you want a model that more closely mirrors human memory retrieval patterns, where recently and frequently accessed items have a unified advantage.
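A minimal sketch of the base-level activation above, assuming access ages are given in seconds and are strictly positive. The sigmoid normalization follows the text; SLM's exact constants may differ.

```python
import math

def base_level(access_ages, d=0.5):
    """B_i = ln(sum_k t_k^(-d)), sigmoid-normalized to (0, 1).

    access_ages: seconds since each past access (each > 0).
    """
    activation = math.log(sum(t ** (-d) for t in access_ages))
    return 1.0 / (1.0 + math.exp(-activation))  # sigmoid-normalize
```

A single access one second ago gives activation ln(1) = 0 and a normalized value of 0.5; more accesses, or more recent ones, push the value toward 1, which is how recency and frequency fuse into one signal.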
The Feeling-of-Knowing gate derives from the SYNAPSE spreading activation research (arXiv 2601.02744). It is a minimum score threshold (default: 0.12) below which results are rejected entirely.
```python
# Results below this threshold are discarded
gated_results = [r for r in scored_results if r.score >= 0.12]
```

This prevents low-confidence memories from polluting the AI's context window. The threshold is configurable but should not be set below 0.01.
Every fact stored in SLM receives a contextual description --- a one-sentence explanation of why this memory would be useful, not just what it says. This follows the A-MEM pattern.
Mode A (rules-based, no LLM):
A deterministic template generates the description from fact metadata:
"This episodic event about React, March 15 records factual information
observed on 2026-03-15 regarding Team decided to use React for the frontend."
Keywords are extracted from entities and content tokens.
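The deterministic template could look like this sketch, where the field names are illustrative stand-ins for SLM's actual fact metadata:

```python
def contextual_description(fact_type, topic, observed_on, content):
    """Fill the Mode A template from fact metadata -- no LLM required."""
    return (f"This {fact_type} about {topic} records factual information "
            f"observed on {observed_on} regarding {content}.")

desc = contextual_description(
    "episodic event",
    "React, March 15",
    "2026-03-15",
    "Team decided to use React for the frontend",
)
```

Because the template is deterministic, Mode A descriptions are reproducible and cost nothing to generate, at the price of reading more mechanically than LLM-written ones.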
Mode B/C (LLM-generated):
An LLM writes a one-sentence explanation focused on what the memory reveals about the person, project, or decision --- not what it literally says. The LLM also extracts 3--5 keywords for improved searchability.
Contextual descriptions are stored in the fact_context table and used by both the Auto-Invoke scorer and the Consolidation engine.
When running in Mode A without an embedding provider (no sentence-transformers installed), the similarity signal is unavailable. Auto-Invoke redistributes weights:
```python
MODE_A_NO_EMBEDDINGS_WEIGHTS = {
    "similarity": 0.00,  # Disabled
    "recency": 0.40,     # Promoted to primary signal
    "frequency": 0.35,   # Promoted
    "trust": 0.25,       # Promoted
}
```

Candidates are retrieved via BM25 text search instead of vector KNN. All other functionality (FOK gating, contextual descriptions, result enrichment) works identically.
If Mode A has sentence-transformers installed, all four signals activate normally with default weights.
Enable Auto-Invoke:

```shell
slm config set auto_invoke.enabled true
```

Switch to ACT-R scoring mode:

```shell
slm config set auto_invoke.use_act_r true
```

Adjust the FOK threshold:

```shell
slm config set auto_invoke.fok_threshold 0.15
```

Adjust signal weights (must sum to 1.0):

```shell
slm config set auto_invoke.weights.similarity 0.35
slm config set auto_invoke.weights.recency 0.30
slm config set auto_invoke.weights.frequency 0.20
slm config set auto_invoke.weights.trust 0.15
```

| Parameter | Default | Description |
|---|---|---|
| `enabled` | `false` | Feature flag. Must be `true` to activate. |
| `weights.similarity` | `0.40` | Weight for embedding cosine similarity |
| `weights.recency` | `0.25` | Weight for time-since-last-access |
| `weights.frequency` | `0.20` | Weight for access count |
| `weights.trust` | `0.15` | Weight for Bayesian trust score |
| `use_act_r` | `false` | Use 3-signal ACT-R mode instead of 4-signal |
| `act_r_decay` | `0.5` | Power-law decay exponent for ACT-R base-level |
| `fok_threshold` | `0.12` | Minimum score to pass FOK gate |
| `max_memories_injected` | `10` | Maximum memories returned per invocation |
| `include_archived` | `false` | Include archived/cold facts in results |
V3.2 adds a new smart_recall MCP tool that exposes multi-signal scoring directly. The existing recall tool is unchanged.
```json
{
  "tool": "smart_recall",
  "arguments": {
    "query": "What authentication method does the project use?",
    "max_results": 5,
    "include_context": true,
    "use_act_r": false
  }
}
```

The response includes per-signal score breakdowns:
```json
{
  "success": true,
  "results": [
    {
      "fact_id": "f_abc123",
      "content": "Project uses JWT tokens with 24-hour expiry...",
      "score": 0.73,
      "signals": {
        "similarity": 0.85,
        "recency": 0.62,
        "frequency": 0.45,
        "trust": 0.80
      },
      "contextual_description": "This decision about JWT authentication records the team's auth strategy chosen for security and session management."
    }
  ],
  "scoring_mode": "4-signal"
}
```

Auto-Invoke operates through three distinct hook points:
| Hook | System | Trigger | What Runs |
|---|---|---|---|
| Claude Code SessionStart | External shell | New Claude session | `session_init` MCP -> `AutoInvoker.invoke()` |
| Engine pre-hook recall | Internal Python | Every `engine.recall()` | AutoInvoker enhances query context |
| Claude Code PostToolUse | External shell | After every MCP tool | `slm observe` -> auto-capture |
- AutoRecall (v3.1) is preserved and used when `auto_invoke.enabled=false` (default)
- The existing `recall` MCP tool is unchanged
- `session_init` returns the same fields, with optional new fields added
- `smart_recall` is a new, additive tool --- it does not replace `recall`
Part of Qualixar | Created by Varun Pratap Bhardwaj
SuperLocalMemory V3 — Your AI Finally Remembers You. 100% local. 100% private. 100% free.