sarthaksolow/LLM-Memory-Implementation

LLM Memory Practice - Short-term & Long-term Memory

This project implements different memory strategies for LLMs using LangGraph and PostgreSQL with vector embeddings.

Project Structure

llm memory/
├── docker-compose.yml          # PostgreSQL with pgvector extension
├── .env                        # Environment variables (API keys, DB config)
├── .gitignore
├── short-term-memory/
│   ├── trimming/               # ✅ Strategy 1: Message Trimming
│   │   ├── main.py
│   │   ├── config.py
│   │   ├── database.py
│   │   └── requirements.txt
│   ├── summary/                # ✅ Strategy 2: Conversation Summary
│   │   ├── main.py
│   │   ├── config.py
│   │   ├── database.py
│   │   └── requirements.txt
│   └── user-progress/          # 🔜 Strategy 3: User Progress Tracking
│       └── (to be implemented)
└── long-term-memory/           # ✅ Long-term Memory with Semantic Search
    ├── main.py                 # LangGraph implementation
    ├── config.py               # Configuration
    ├── database.py             # PostgreSQL + pgvector operations
    ├── embeddings.py           # Embedding generation (sentence-transformers)
    ├── memory_extractor.py     # LLM-based memory extraction
    ├── memory_manager.py       # Memory lifecycle management
    ├── context_builder.py      # Context window assembly
    └── requirements.txt

Setup Instructions

1. Configure Environment Variables

Edit the .env file and add your OpenRouter API key:

OPENROUTER_API_KEY=your_actual_api_key_here
MODEL_NAME=meta-llama/llama-3.1-8b-instruct

2. Start PostgreSQL with pgvector

IMPORTANT: This project uses the ankane/pgvector image (rather than a plain postgres image) for semantic search support!

# Navigate to project folder
cd "D:\llm memory"

# Stop old container if running
docker-compose down -v

# Start PostgreSQL container with pgvector
docker-compose up -d

# Check if container is running
docker ps

# View logs (optional)
docker logs llm_memory_postgres

3. Install Python Dependencies

Each strategy has its own requirements.txt:

# For long-term memory (recommended to start here)
cd "long-term-memory"
pip install -r requirements.txt

# For trimming strategy
cd "short-term-memory\trimming"
pip install -r requirements.txt

# For summary strategy
cd "short-term-memory\summary"
pip install -r requirements.txt

Note: Long-term memory requires additional packages:

  • sentence-transformers - For embedding generation (~400MB download the first time)
  • pgvector - Python client for the PostgreSQL vector extension
  • numpy - Numerical operations

Recommended: Use a virtual environment:

python -m venv venv
venv\Scripts\activate  # On Windows (use `source venv/bin/activate` on macOS/Linux)
pip install -r requirements.txt

Long-term Memory (LTM) - The Main Implementation 🧠

How it works: A complete LTM system with semantic search, memory extraction, and intelligent retrieval.

Architecture Flow:

User Input
    ↓
1. LTM Search (Semantic) → Retrieve relevant memories
    ↓
2. STM Retrieval → Get recent conversation
    ↓
3. Context Assembly → Combine LTM + STM + System Prompt
    ↓
4. LLM Generation → Generate response
    ↓
5. Memory Extraction → Store new important info in LTM

The 4 Steps of LTM:

1. CREATE - Memory Extraction

  • Uses LLM to analyze conversations
  • Extracts only important information worth remembering
  • Categorizes: personal_info, preference, fact, decision, goal
  • Assigns importance score (1-10)

Example:

User: "My name is Sarah and I'm building a weather app"
↓
Extracted Memory:
{
  "content": "User's name is Sarah. Working on weather app project.",
  "memory_type": "personal_info",
  "importance": 8
}
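The extraction step above can be sketched as follows. This is a minimal illustration, not the repo's actual `memory_extractor.py`: the prompt wording, the `parse_extraction` helper, and the canned LLM reply are all assumptions, and the real implementation calls the OpenRouter API instead of using a hard-coded string.

```python
import json

# Illustrative extraction prompt (the repo's real prompt may differ)
EXTRACTION_PROMPT = (
    "Analyze the conversation below. If it contains information worth "
    "remembering long-term, reply with a JSON object with keys 'content', "
    "'memory_type' (personal_info/preference/fact/decision/goal) and "
    "'importance' (1-10). Reply with null if nothing is worth storing."
)

def parse_extraction(llm_reply: str):
    """Parse the LLM's JSON reply into a memory dict, or None if nothing to store."""
    memory = json.loads(llm_reply)
    if memory is None:
        return None
    # Clamp importance into the documented 1-10 range
    memory["importance"] = max(1, min(10, int(memory["importance"])))
    return memory

# Example with a canned LLM reply (no API call made here):
reply = ('{"content": "User\'s name is Sarah. Working on weather app project.", '
         '"memory_type": "personal_info", "importance": 8}')
print(parse_extraction(reply)["memory_type"])  # personal_info
```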

2. STORE - Save with Embeddings

  • Converts memory text to 384-dimensional vector embedding
  • Stores in PostgreSQL with pgvector extension
  • Enables semantic search (not just keyword matching)

Database Schema:

ltm_memories
├── id (primary key)
├── user_id (indexed)
├── content (text)
├── memory_type (personal_info/preference/fact/decision/goal)
├── importance (1-10)
├── embedding (VECTOR(384)) -- Semantic embedding
├── created_at
├── last_accessed
└── access_count
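The store step can be sketched like this. pgvector accepts vector literals as bracketed strings (e.g. `'[0.1,0.2,0.3]'`), so a memory row can be inserted with ordinary parameterized SQL. The `to_pgvector` helper and the three-element example vector are illustrative; the real code uses 384-dimensional embeddings from sentence-transformers and a live database connection.

```python
def to_pgvector(vec):
    """Format a Python list of floats as a pgvector literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(f"{x:.6f}" for x in vec) + "]"

# Parameterized insert matching the ltm_memories schema above
INSERT_SQL = """
INSERT INTO ltm_memories (user_id, content, memory_type, importance, embedding)
VALUES (%s, %s, %s, %s, %s::vector)
"""

embedding = [0.23, -0.45, 0.67]  # normally 384 dims from sentence-transformers
params = ("user_1", "User's name is Sarah.", "personal_info", 8,
          to_pgvector(embedding))
print(params[-1])  # [0.230000,-0.450000,0.670000]
```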

3. SEARCH - Semantic Retrieval

  • Generates embedding for user's current query
  • Uses cosine similarity to find relevant memories
  • Returns top K memories above similarity threshold

Example:

Query: "What was I working on?"
↓
Embedding: [0.23, -0.45, 0.67, ...]
↓
Search LTM using cosine similarity
↓
Found: "User's name is Sarah. Working on weather app project."
Similarity: 0.89
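The similarity measure behind this search is plain cosine similarity. In production pgvector computes it inside PostgreSQL (its `<=>` operator returns cosine *distance*, i.e. 1 − similarity), but the math is easy to show in pure Python. The example vectors here are made up for illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query_emb = [0.2, 0.9, 0.1]     # embedding of the user's query
memory_emb = [0.25, 0.85, 0.05] # embedding of a stored memory
print(round(cosine_similarity(query_emb, memory_emb), 2))
```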

4. RETRIEVE - Relevance Weighting

  • Combines similarity score + importance
  • Formula: relevance = (similarity × 0.7) + (importance/10 × 0.3)
  • Updates access tracking (count, last_accessed)
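The weighting formula translates directly into a small scoring function (the weights match the `config.py` defaults shown later):

```python
SIMILARITY_WEIGHT = 0.7   # weight for semantic similarity
IMPORTANCE_WEIGHT = 0.3   # weight for stored importance (1-10)

def relevance(similarity: float, importance: int) -> float:
    """relevance = (similarity x 0.7) + (importance/10 x 0.3)"""
    return similarity * SIMILARITY_WEIGHT + (importance / 10) * IMPORTANCE_WEIGHT

print(round(relevance(0.94, 8), 3))  # 0.898
```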

Run Long-term Memory:

cd "long-term-memory"
python main.py

Commands:

  • Type your message to chat
  • memories - View all stored memories
  • stats - View memory statistics
  • clear - Clear all data (STM + LTM)
  • quit - Exit

Example Session:

You: Hi, my name is Alex and I love Python programming

🔍 Searching long-term memories...
   No relevant memories found

🧠 Extracting memories...
   ✅ Memory created: [personal_info] User's name is Alex. Enjoys Python programming.
   Importance: 8/10

🤖 Assistant: Hello Alex! It's great to meet you! Python is an excellent language...

---

(Later in conversation...)

You: What programming languages do I like?

🔍 Searching long-term memories...
   ✅ Found 1 relevant memory:
   1. [personal_info] User's name is Alex. Enjoys Python programming.
      Relevance: 0.90 (similarity: 0.94, importance: 8/10)

🤖 Assistant: Based on what you've told me, you love Python programming!

Configuration Options:

# config.py - Customize these settings

STM_LIMIT = 10                    # Recent messages to keep
TOP_K_MEMORIES = 5                # Max memories to retrieve
MIN_SIMILARITY = 0.7              # Minimum similarity threshold

SIMILARITY_WEIGHT = 0.7           # Weight for semantic similarity
IMPORTANCE_WEIGHT = 0.3           # Weight for memory importance

HIGH_RELEVANCE_THRESHOLD = 0.85   # High emphasis threshold
MEDIUM_RELEVANCE_THRESHOLD = 0.70 # Medium emphasis threshold

Key Features:

  • ✅ Semantic search - Finds relevant memories by meaning, not keywords
  • ✅ Smart extraction - LLM decides what's worth remembering
  • ✅ Relevance weighting - Balances similarity and importance
  • ✅ Access tracking - Tracks which memories are most useful
  • ✅ Multi-user support - Each user has their own memories
  • ✅ Persistent storage - Memories survive across sessions

Short-term Memory Strategies

Strategy 1: Trimming ✂️

How it works:

  • Keeps only the last N messages (default: 10)
  • Automatically deletes older messages from PostgreSQL
  • Simple sliding window approach
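The sliding window is easy to picture on a plain list. In the repo the same effect is achieved with DELETE queries against the `messages_trimming` table; this simulation is only illustrative:

```python
STM_LIMIT = 10  # matches the default window size

def trim(messages, limit=STM_LIMIT):
    """Keep only the most recent `limit` messages (sliding window)."""
    return messages[-limit:]

history = [f"msg {i}" for i in range(1, 14)]  # 13 messages
print(trim(history))  # keeps 'msg 4' through 'msg 13'
```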

Database Schema:

messages_trimming
├── id (primary key)
├── session_id (indexed)
├── role (user/assistant)
├── content (text)
└── timestamp

Run:

cd "short-term-memory\trimming"
python main.py

Best for: Short, casual conversations where old context isn't needed


Strategy 2: Summary 🧠

How it works:

  • Keeps only recent N messages (default: 10) in raw form
  • When limit exceeded, summarizes oldest K messages (default: 5)
  • Stores summary separately and deletes summarized messages
  • Summary evolves as conversation grows
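The fold-into-summary logic described above can be sketched as a pure function. `fold_into_summary` and the stand-in summarizer are illustrative names, not the repo's actual code; in the real strategy the summarizer is an LLM call and the results are persisted to PostgreSQL:

```python
RECENT_LIMIT = 10    # raw messages to keep (default N)
SUMMARIZE_BATCH = 5  # oldest messages to fold into the summary (default K)

def fold_into_summary(messages, summary, summarize_fn,
                      limit=RECENT_LIMIT, batch=SUMMARIZE_BATCH):
    """When raw history exceeds `limit`, summarize the oldest `batch`
    messages and drop them from the raw history."""
    if len(messages) <= limit:
        return messages, summary
    oldest, rest = messages[:batch], messages[batch:]
    return rest, summarize_fn(summary, oldest)

# Stand-in for the LLM summarizer:
def fake_summarize(prev, msgs):
    return (prev + " | " if prev else "") + f"{len(msgs)} msgs summarized"

msgs = [f"m{i}" for i in range(12)]  # 12 > 10, so a fold is triggered
msgs, summary = fold_into_summary(msgs, "", fake_summarize)
print(len(msgs))  # 7
print(summary)    # 5 msgs summarized
```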

Database Schema:

messages_summary
├── id, session_id, role, content, timestamp

conversation_summary
├── session_id (primary key)
├── summary (text)
└── updated_at

Run:

cd "short-term-memory\summary"
python main.py

Best for: Long conversations where historical context matters


Comparison: All Memory Strategies

| Aspect               | Trimming ✂️   | Summary 🧠         | Long-term Memory 💾      |
|----------------------|--------------|--------------------|--------------------------|
| Storage              | PostgreSQL   | PostgreSQL         | PostgreSQL + Vectors     |
| Old messages         | Deleted      | Summarized         | Extracted as memories    |
| Context retention    | Lost         | Condensed          | Semantically searchable  |
| Token usage          | Low          | Medium             | Medium-High              |
| Information loss     | High         | Low                | Very Low                 |
| Setup complexity     | Simple       | Moderate           | Complex                  |
| Cross-session memory | No           | No                 | Yes                      |
| Semantic search      | No           | No                 | Yes                      |
| Best for             | Casual chats | Long conversations | Personal assistants      |

Docker Management

Start PostgreSQL:

docker-compose up -d

Stop PostgreSQL:

docker-compose down

Stop and remove all data:

docker-compose down -v

View logs:

docker logs llm_memory_postgres

Access PostgreSQL CLI:

docker exec -it llm_memory_postgres psql -U llm_user -d llm_memory_db

Useful SQL queries:

-- Check pgvector extension
SELECT * FROM pg_extension WHERE extname = 'vector';

-- View long-term memories
SELECT id, user_id, memory_type, importance, content, access_count 
FROM ltm_memories 
ORDER BY created_at DESC;

-- View STM messages
SELECT * FROM stm_messages ORDER BY timestamp DESC LIMIT 20;

-- Count memories by type
SELECT memory_type, COUNT(*) 
FROM ltm_memories 
GROUP BY memory_type;

-- Most accessed memories
SELECT content, access_count, last_accessed 
FROM ltm_memories 
ORDER BY access_count DESC 
LIMIT 10;

How It All Works Together

Connection Flow (Python β†’ Docker PostgreSQL):

Python Script (main.py)
    ↓
Connects to: localhost:5432
    ↓
Docker Port Mapping (5432:5432)
    ↓
Container Port 5432
    ↓
PostgreSQL with pgvector

Your Python code connects to localhost:5432, and Docker transparently forwards it to the container!
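A connection string for this setup can be assembled from `.env`-style settings like so. The environment variable names (`DB_HOST`, `DB_USER`, etc.) and the password default are assumptions for illustration; the real names and credentials live in the project's `.env` and `config.py`:

```python
import os

def build_dsn():
    """Assemble a PostgreSQL DSN from environment variables.

    Variable names and the password default are placeholders; the user
    and database names match the docker exec command shown below."""
    host = os.getenv("DB_HOST", "localhost")
    port = os.getenv("DB_PORT", "5432")
    user = os.getenv("DB_USER", "llm_user")
    password = os.getenv("DB_PASSWORD", "change_me")  # placeholder
    dbname = os.getenv("DB_NAME", "llm_memory_db")
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"

print(build_dsn())
```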

Memory Lifecycle:

1. User sends message
    ↓
2. Search LTM for relevant memories (semantic search)
    ↓
3. Retrieve recent STM messages
    ↓
4. Build context: System + LTM + STM
    ↓
5. Send to LLM
    ↓
6. Get response
    ↓
7. Extract new memories (if important info)
    ↓
8. Store in LTM with embeddings
    ↓
9. Store message in STM
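The nine steps above amount to one function per conversational turn. This sketch stubs out the search, LLM, and extraction steps with trivial callables so the control flow is visible; the real implementation wires these through LangGraph and PostgreSQL:

```python
def handle_turn(user_msg, ltm, stm, search_ltm, call_llm, extract_memory):
    """One conversational turn through the memory lifecycle (steps 1-9)."""
    relevant = search_ltm(user_msg, ltm)           # 2. search LTM semantically
    context = {"ltm": relevant, "stm": stm[-10:]}  # 3-4. build context
    reply = call_llm(context, user_msg)            # 5-6. LLM call
    memory = extract_memory(user_msg, reply)       # 7. extract new memory
    if memory:
        ltm.append(memory)                         # 8. store in LTM
    stm.extend([("user", user_msg),
                ("assistant", reply)])             # 9. store in STM
    return reply

# Wire it up with trivial stubs (no database, no API):
ltm, stm = [], []
reply = handle_turn(
    "My name is Alex", ltm, stm,
    search_ltm=lambda q, mems: [m for m in mems if "Alex" in m],
    call_llm=lambda ctx, q: "Hi Alex!",
    extract_memory=lambda q, r: "User's name is Alex" if "name" in q else None,
)
print(reply)  # Hi Alex!
```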

Troubleshooting

PostgreSQL connection error:

  • Check Docker: docker ps
  • Verify pgvector image: Should show ankane/pgvector
  • Check port 5432: netstat -ano | findstr :5432
  • Restart: docker-compose down && docker-compose up -d

Embedding model download issues:

  • First run downloads all-MiniLM-L6-v2 (~400MB)
  • Requires internet connection
  • Downloads to ~/.cache/torch/sentence_transformers/

pgvector extension error:

  • Make sure you're using ankane/pgvector image (not postgres:15-alpine)
  • Run: docker-compose down -v && docker-compose up -d
  • Check extension: docker exec -it llm_memory_postgres psql -U llm_user -d llm_memory_db -c "SELECT * FROM pg_extension WHERE extname = 'vector';"

Import errors:

  • Install all requirements: pip install -r requirements.txt
  • Activate virtual environment if using one
  • For pgvector package issues, try: pip install pgvector --upgrade

Learning Outcomes

By completing this project, you'll understand:

  • ✅ Short-term vs Long-term memory in LLMs
  • ✅ Semantic search with vector embeddings
  • ✅ PostgreSQL pgvector extension
  • ✅ LLM-based information extraction
  • ✅ Context window management
  • ✅ LangGraph for stateful AI applications
  • ✅ Docker for development databases
  • ✅ Memory relevance weighting and retrieval

Next Steps

  1. ✅ Test trimming strategy
  2. ✅ Test summary strategy
  3. ✅ Test long-term memory (START HERE!)
  4. 🔜 Implement user-progress tracking
  5. 🔜 Add memory consolidation (merge similar memories)
  6. 🔜 Implement memory decay/forgetting
  7. 🔜 Add importance auto-adjustment based on access patterns

License

This is a practice project for learning purposes. Feel free to modify and experiment!

Happy Learning! 🚀🧠
