## Summary

The chat agent currently uses a tool-based approach for memory: the LLM must explicitly call the `add_memory` and `search_memory` tools. This gives users control (the LLM asks permission before storing) but misses the automatic fact extraction that memory-proxy provides.
## Current Behavior
- Memory tools are available to the LLM
- LLM decides when to store (and asks user permission per system prompt)
- No automatic memory retrieval injected into context
- User doesn't see when memories are stored/retrieved
## Proposed Enhancement

Add an `--auto-memory` flag (or similar) that enables the full memory-proxy pipeline:
- Auto-retrieve: Inject relevant memories into LLM context before each turn
- Auto-extract: Extract and store facts from each conversation turn
- Visual feedback: Show user what memories were retrieved/stored
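The per-turn flow described above could be sketched roughly like this. This is a minimal, self-contained illustration with a toy keyword-based store; the real implementation would use the memory-proxy pipeline, vector search, and actual LLM calls, and every name here (`ToyMemoryStore`, `auto_memory_turn`) is hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemoryStore:
    """Stand-in for the real vector-backed store (hypothetical)."""
    facts: list[str] = field(default_factory=list)

    def search(self, query: str, limit: int = 3) -> list[str]:
        # Naive keyword overlap instead of vector similarity.
        words = set(query.lower().split())
        hits = [f for f in self.facts if words & set(f.lower().split())]
        return hits[:limit]

    def add(self, fact: str) -> None:
        self.facts.append(fact)


def auto_memory_turn(store: ToyMemoryStore, user_message: str) -> str:
    # 1. Auto-retrieve: inject relevant memories into the context.
    retrieved = store.search(user_message)
    context = "\n".join(f"[memory] {m}" for m in retrieved)

    # 2. Call the LLM with the augmented context (stubbed out here).
    reply = f"(reply to: {user_message})"

    # 3. Auto-extract: store facts from the turn (real code would run an
    #    extraction step; here we just store the raw message).
    store.add(user_message)

    # 4. Visual feedback: show the user what was retrieved/stored.
    print(f"retrieved {len(retrieved)} memories, stored 1 fact")
    return f"{context}\n{reply}" if context else reply
```

A second turn then sees facts stored by the first, e.g. `auto_memory_turn(store, "my dog is named Rex")` followed by `auto_memory_turn(store, "what is my dog named?")` surfaces the earlier fact in the injected context.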
## Possible Modes
| Mode | Retrieval | Storage | Use Case |
|---|---|---|---|
| `--no-memory` | None | None | Privacy, testing |
| `--memory` (current) | Tool-based | Tool-based | User control |
| `--auto-memory` | Automatic | Automatic | Seamless experience |
| `--auto-memory-retrieve` | Automatic | Tool-based | Hybrid (best of both?) |
## Implementation Notes

- Could leverage `memory_client.chat()`, which already implements the full pipeline
- Or call `augment_chat_request()` + `extract_and_store_facts_and_summaries()` directly
- Need to consider: should auto mode still respect the "ask permission" guideline?
## Related

- PR #183 (feat(chat): integrate vector-backed memory system with LLM tools): current tool-based memory integration
- `agent_cli/memory/engine.py`: `process_chat_request()` has the full pipeline