A reinforcement learning system that uses vector similarity search and memory-based decision making. The project implements an "EngramBrain" that stores experiences in a vector database and uses nearest-neighbor search to make decisions based on similar past experiences.
This project tests the idea that something like Rupert Sheldrake's "morphic resonance", a proposed memory of nature, could explain the emergence of complex instinctive behaviours that are not easily explained by genetics alone. If animal behaviour is influenced by the previous behaviour of similar animals, then successful behaviours will tend to accumulate more historical instances, making it easier for an animal in the present moment to "resonate" with them.
Resonant Vectors explores a novel approach to reinforcement learning where:
- Engrams (memory traces) store state-action-outcome triplets
- Vector similarity search finds relevant past experiences
- Resonator vectors encode state information for similarity matching
- Milvus provides fast approximate nearest neighbor search
The system is trained on the LunarLander-v2 environment from Gymnasium, learning to land a spacecraft by recalling and learning from similar past situations.
- EngramBrain: The decision-making engine that queries similar experiences and generates actions
- EngramStore: Manages storage and retrieval of engrams in Milvus
- Trainer: Orchestrates training on Gymnasium environments
- WeightedResonatorFactory: Converts input states to resonator vectors
1. Observation: The agent receives a state observation from the environment
2. Resonator Creation: The state is converted to a resonator vector
3. Similarity Search: Milvus finds the nearest engrams to the resonator vector
4. Action Selection: Actions are scored based on similar past experiences and their outcomes
5. Learning: After each trial, new engrams are created and stored with their outcomes
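The decision loop above can be sketched in plain NumPy. This is an illustrative stand-in, not the project's actual `EngramBrain` code: the function name, the brute-force nearest-neighbor search (Milvus handles this in the real system), and the scoring rule are all assumptions for clarity.

```python
import numpy as np

def choose_action(resonator, engram_vectors, engram_actions, engram_outcomes,
                  n_actions=4, k=5, noise=0.1, rng=None):
    """Score each action by the outcomes of the k nearest stored engrams.

    Illustrative sketch: the real system delegates the nearest-neighbor
    search to Milvus rather than computing distances in NumPy.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    # L2 (Euclidean) distance from the resonator to every stored engram vector
    dists = np.linalg.norm(engram_vectors - resonator, axis=1)
    nearest = np.argsort(dists)[:k]
    # Each nearby engram votes for its action, weighted by its outcome
    scores = np.zeros(n_actions)
    for i in nearest:
        scores[engram_actions[i]] += engram_outcomes[i]
    # A little random noise encourages exploration (cf. the NOISE setting)
    scores += rng.normal(0.0, noise, n_actions)
    return int(np.argmax(scores))
```

With `noise=0` the choice is fully determined by the outcomes of the nearest engrams; the noise term keeps rarely-tried actions in play during training.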
- Python 3.8+
- Docker and Docker Compose
- Virtual environment (recommended)
- Create a virtual environment:
```bash
python -m venv venv
```

- Activate the virtual environment:

```bash
source venv/bin/activate   # On macOS/Linux
# or
venv\Scripts\activate      # On Windows
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

Milvus is required as the vector database backend:

```bash
wget https://github.com/milvus-io/milvus/releases/download/v2.3.4/milvus-standalone-docker-compose.yml -O docker-compose.yml
docker compose up -d
```

Verify Milvus is running:

```bash
docker compose ps
```

Milvus will be available at localhost:19530.
Train an agent with default settings:
```python
from Trainer import Trainer

trainer = Trainer("lander3", clear_collection=True)
trainer.train(1000)  # Run 1000 trials
```

Parameters:

- instance_name: Unique name for the Milvus collection
- clear_collection: Whether to reset the collection before training

The main.py file contains example usage:

```bash
python main.py
```

It is currently configured to run the trainer.
Edit settings.py to customize behavior:
- STATE_VECTOR_SIZE: Dimension of state vectors (default: 9)
- OUTPUT_VECTOR_SIZE: Number of possible actions (default: 4)
- NOISE: Random noise added to action selection (default: 0.1)
- MIN_RESULTS: Minimum number of similar engrams to retrieve (default: 300)
- MAX_TRIAL_LENGTH: Maximum steps per trial (default: 400)
- METABOLIC_COST: Energy cost per step when using hit points (default: 0.2)
- DISPLAY: Show the environment visualization (default: True)
- READ_ONLY: Disable learning/engram storage (default: False)
- DROP_COLLECTION: Drop collection on initialization (default: True)
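As a reference, a settings.py matching the defaults listed above might look like this (the constant names and values come from the list; the comments are paraphrases, and the actual file may contain additional settings):

```python
# settings.py -- illustrative sketch mirroring the documented defaults
STATE_VECTOR_SIZE = 9     # dimension of state vectors
OUTPUT_VECTOR_SIZE = 4    # number of possible actions
NOISE = 0.1               # random noise added to action selection
MIN_RESULTS = 300         # minimum number of similar engrams to retrieve
MAX_TRIAL_LENGTH = 400    # maximum steps per trial
METABOLIC_COST = 0.2      # energy cost per step when using hit points
DISPLAY = True            # show the environment visualization
READ_ONLY = False         # disable learning/engram storage
DROP_COLLECTION = True    # drop collection on initialization
```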
```
.
├── main.py                       # Entry point with example usage
├── EngramBrain.py                # Core decision-making system
├── engram.py                     # Engram data structure and Milvus store
├── Trainer.py                    # Training orchestration
├── WeightedResonatorFactory.py   # State-to-resonator conversion
├── IResonatorFactory.py          # Interface for resonator factories
├── settings.py                   # Configuration parameters
├── hello_milvus.py               # Milvus connection test script
├── gym/                          # Custom gym environments (if any)
│   ├── lander_environment.py
│   └── lander.py
├── requirements.txt              # Python dependencies
└── docker-compose.yml            # Milvus configuration
```
An engram represents a memory trace containing:
- vector: The state/resonator vector at the time of the experience
- action: The action taken
- outcome: The reward/outcome of that action (normalized to -1 to 1)
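A minimal engram could be modeled as a dataclass like the one below. This is a sketch of the structure described above, not the project's actual engram.py; the class name matches the concept, but the clamping behavior in `__post_init__` is an assumption based on the documented [-1, 1] range.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Engram:
    """A single memory trace: what was seen, what was done, how it went."""
    vector: List[float]  # resonator/state vector at the time of the experience
    action: int          # index of the action taken
    outcome: float       # reward for that action, normalized to [-1, 1]

    def __post_init__(self):
        # Keep the outcome within the documented [-1, 1] range (assumed behavior)
        self.outcome = max(-1.0, min(1.0, float(self.outcome)))
```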
A resonator is a transformed version of the input state, optimized for similarity matching. The WeightedResonatorFactory appends success metrics to the state vector.
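A minimal sketch of that transformation, assuming the success metric is simply appended as one extra component (consistent with an 8-dimensional LunarLander observation and the documented STATE_VECTOR_SIZE of 9); the real WeightedResonatorFactory may weight or scale components differently:

```python
import numpy as np

def make_resonator(state, success_metric):
    """Form a resonator vector by appending a success metric to the raw state.

    Illustrative sketch: the actual factory may apply per-component weights.
    """
    return np.concatenate([np.asarray(state, dtype=float),
                           [float(success_metric)]])
```

Appending the metric means that engrams recorded during successful episodes cluster apart from failures, so similarity search can prefer neighbors with good track records.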
The system uses L2 (Euclidean) distance in Milvus to find the most similar past experiences. The IVF_FLAT index provides a balance between search speed and accuracy.
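The index and search configuration described above can be expressed with pymilvus parameter dictionaries like these. The `nlist`/`nprobe` values are illustrative tuning choices, not values taken from the project, and the connection/search calls are shown in comments because they require a live Milvus server:

```python
# IVF_FLAT index with L2 (Euclidean) distance, as described in the README.
IVF_FLAT_INDEX = {
    "index_type": "IVF_FLAT",   # exact search within the probed clusters
    "metric_type": "L2",        # Euclidean distance between resonator vectors
    "params": {"nlist": 128},   # number of coarse clusters (illustrative)
}

SEARCH_PARAMS = {
    "metric_type": "L2",
    "params": {"nprobe": 10},   # clusters probed per query: speed vs. recall
}

# Against a running Milvus instance, usage would look roughly like:
#   from pymilvus import connections, Collection
#   connections.connect(host="localhost", port="19530")
#   coll = Collection("lander3")
#   coll.create_index("vector", IVF_FLAT_INDEX)
#   coll.load()
#   hits = coll.search([resonator], "vector", SEARCH_PARAMS, limit=300)
```

Raising `nprobe` improves recall at the cost of latency; IVF_FLAT stores vectors uncompressed, so accuracy within probed clusters is exact.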
Key dependencies include:
- pymilvus: Milvus Python client
- gymnasium: Reinforcement learning environments
- numpy: Numerical computations
- box2d-py: Physics engine for LunarLander
See requirements.txt for the complete list.
- The system learns online during training, storing engrams after each trial
- Success metrics are normalized and used to weight engram outcomes
- The feedback queue batches updates for efficiency
- Multiple trainers can use separate Milvus collections for parallel training
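One way the outcome normalization mentioned above could work is a linear rescale of the raw episode reward into [-1, 1], clipped at the ends. The reward bounds here are hypothetical placeholders, not values from the project:

```python
def normalize_outcome(reward, min_reward=-200.0, max_reward=200.0):
    """Linearly map a raw reward into [-1, 1], clipping out-of-range values.

    Sketch only: the bounds are illustrative, and the project's actual
    normalization scheme may differ.
    """
    span = max_reward - min_reward
    x = 2.0 * (reward - min_reward) / span - 1.0
    return max(-1.0, min(1.0, x))
```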
[Add your license here]