中文 | English
MemFactory is a modular framework for memory processing that integrates full memory lifecycle management with reinforcement learning training. It provides core components for memory extraction, update decisions, storage management, organization, retrieval, and response generation, while also supporting RL-based optimization of memory handling capabilities.
This project provides a standardized and cohesive training and evaluation pipeline built around MemTrainer. Users only need to specify a target Agent and an interaction Env to launch the full training workflow:
- Environment Layer (Envs): Defines the unified data format for task datasets and provides environment feedback, i.e. the reward mechanism in reinforcement learning.
- Execution Layer (Agents): Performs rollouts inside the specified environment and continuously interacts with it to produce trajectories.
- Optimization Layer (Trainer): Receives trajectories generated by the agent and includes policy optimization algorithms such as GRPO (Group Relative Policy Optimization) to iteratively improve model capability.
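To make the group-relative idea concrete, the core of GRPO-style advantage estimation can be sketched as follows (illustrative only; `group_relative_advantages` is not part of the framework's API):

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: normalize each reward against its group's
    mean and standard deviation, A_i = (r_i - mean(r)) / (std(r) + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four rollouts sampled for the same prompt form one "group"
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Because each rollout is scored relative to its own group, no separate value network is needed to estimate a baseline.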
This framework follows the paradigm of "assembling agents from modules". A complete agent is built by combining core modules with single responsibilities as needed. Supported module types include:
- Extractors: Extract high-value memory snippets from conversations or observations (e.g. Naive Extractor).
- Updaters: Decide how newly extracted information should be merged into historical memory or overwrite it.
- Retrievers: Recall relevant information from the memory state according to the current task context (e.g. Naive Retriever, LRM-Retriever).
- Agents: Composite components that combine the capabilities of multiple modules.
Once assembled, an agent can complete the full memory-processing lifecycle.
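This assembly pattern can be pictured with a toy registry (the `Registry` internals here are illustrative; only the `@MODULE_REGISTRY.register(...)` decorator shape mirrors the framework's actual API shown in the Developer Guide):

```python
class Registry:
    """Toy registry: maps string names to classes so that a config
    can refer to modules by name and have them built on demand."""
    def __init__(self):
        self._modules = {}

    def register(self, name):
        def decorator(cls):
            self._modules[name] = cls
            return cls
        return decorator

    def build(self, name, **kwargs):
        return self._modules[name](**kwargs)

MODULE_REGISTRY = Registry()

@MODULE_REGISTRY.register("naive_extractor")
class NaiveExtractor:
    def __init__(self, max_snippets=5):
        self.max_snippets = max_snippets

# An agent is assembled by looking its modules up by name
extractor = MODULE_REGISTRY.build("naive_extractor", max_snippets=3)
```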
- Memory Processing Pipeline: Environment Input -> Extractor (memory extraction) -> Updater (state update) -> Retriever (context retrieval) -> Environment Output.
- End-to-End Optimization: With the GRPO training framework, the agent not only performs memory reading and writing, but can also jointly optimize its internal extraction, update, and retrieval policies through sparse or dense reward signals provided by the environment.
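The pipeline above can be sketched as a minimal dataflow (all classes and heuristics here are illustrative stand-ins, not the framework's real modules):

```python
class Extractor:
    """Toy extractor: keeps sentences mentioning the tracked entity."""
    def extract(self, observation):
        return [s for s in observation.split(". ") if "Alice" in s]

class Updater:
    """Toy updater: append-only merge; a real updater may overwrite or deduplicate."""
    def update(self, memory, snippets):
        return memory + snippets

class Retriever:
    """Toy retriever: naive keyword-overlap recall."""
    def retrieve(self, memory, query):
        return [m for m in memory if any(word in m for word in query.split())]

memory = []
observation = "Alice moved to Berlin. The weather was cold"
memory = Updater().update(memory, Extractor().extract(observation))
context = Retriever().retrieve(memory, "Where does Alice live")
```

In the real framework each stage is a swappable module, and the reward signal from the environment can flow back through all three.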
From modules to agents to environments, the framework offers substantial flexibility for research and follow-up innovation:
- Multiple Implementations Under Shared Interfaces: Each low-level module can have multiple implementations, ranging from simple rule-based modules to more advanced learnable ones, and can be swapped in and out.
- Extensibility at Every Level: From replacing the logic of low-level Modules, to redesigning mid-level Agents, to customizing top-level Envs and reward rules, the framework exposes clear extension points so researchers can focus on the core algorithm rather than rebuilding infrastructure.
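For example, two interchangeable retrievers behind one interface might look like this (a hypothetical sketch; the framework's actual `Retrievers` interface may differ):

```python
from abc import ABC, abstractmethod

class BaseRetriever(ABC):
    """Shared interface: any retriever maps (memory, query) -> snippets."""
    @abstractmethod
    def retrieve(self, memory, query): ...

class KeywordRetriever(BaseRetriever):
    """Simple rule-based implementation: substring match."""
    def retrieve(self, memory, query):
        return [m for m in memory if query.lower() in m.lower()]

class TopKRetriever(BaseRetriever):
    """Scoring-based implementation; a learnable retriever would score with a model."""
    def __init__(self, k=1):
        self.k = k

    def retrieve(self, memory, query):
        def score(m):
            return len(set(m.lower().split()) & set(query.lower().split()))
        return sorted(memory, key=score, reverse=True)[:self.k]

memory = ["Bob likes tea", "Bob moved to Paris", "It rained"]
for retriever in (KeywordRetriever(), TopKRetriever(k=1)):
    hits = retriever.retrieve(memory, "Paris")
```

Because both classes satisfy the same interface, an agent can swap one for the other without changing any surrounding code.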
Environment and Hardware Requirements:
- Python: 3.10 recommended
- GPU: An A800 80G or better is recommended
```bash
# Clone the repository
git clone git@github.com:MemTensor/MemFactory.git
cd MemFactory

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env to configure API keys and database connections

# Download training data (optional)
```

We provide reinforcement learning examples under examples/. Currently supported:
(1) Inspired by Memory-R1, optimizing the agent's memory extraction and update policies.
(2) Inspired by MemoryAgent, optimizing the agent's long-context handling capability.
(3) Inspired by RMM, optimizing the agent's memory retrieval policy.
In the empirical study reported in the paper, we focus on MemoryAgent, because the original MemAgent work releases a comprehensive public dataset for both training and evaluation, making it an ideal benchmark for reproducible experiments.
examples/RunMemoryAgent1.7B.sh and examples/RunMemoryAgent4B.sh are the scripts used in the paper to train MemoryAgent on the open-source data released by MemAgent. You can reproduce the reported results with the same scripts:
```bash
bash examples/RunMemoryAgent1.7B.sh
bash examples/RunMemoryAgent4B.sh
```

The training data can be downloaded from datas. After downloading, update `DATA_PATH` and `MODEL_PATH` in the scripts to match your local environment. Once training is finished, you can use `evaluation/evaluate_orchestrator.py` to evaluate checkpoints on the three benchmark test sets used in the paper.
Reported results in the paper (avg@4, averaged over 4 independent runs):
| Model | Setting | eval_50 | eval_100 | eval_fwe_16384 | Average |
|---|---|---|---|---|---|
| Qwen3-1.7B | Base checkpoint | 0.4727 | 0.4297 | 0.0332 | 0.3118 |
| Qwen3-1.7B | + MemFactory RL | 0.5684 | 0.4863 | 0.0195 | 0.3581 |
| Qwen3-4B-Instruct | Base checkpoint | 0.6523 | 0.5645 | 0.6270 | 0.6146 |
| Qwen3-4B-Instruct | + MemFactory RL | 0.7051 | 0.6309 | 0.6426 | 0.6595 |
See "🛠️ Developer Guide".
```bash
# LLM service configuration
OPENAI_API_KEY=your-openai-key
OPENAI_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4.1-nano

# SwanLab configuration (optional)
SWANLAB_API_KEY=your-swanlab-key

# Embedding service configuration
EMBEDDING_API_KEY=your-embedding-key
EMBEDDING_BASE_URL=your-embedding-api-url
EMBEDDING_MODEL=bge-m3
EMBEDDING_DIM=1024

# Neo4j graph database configuration
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password

# Milvus vector database configuration
MILVUS_URI=http://localhost:19530
MILVUS_USER=root
MILVUS_PASSWORD=your-password
```

```python
from typing import Any, Dict, List

import torch

# MODULE_REGISTRY, ENV_REGISTRY, BaseModule, and MemoryBankEnv are provided by the framework

# Custom module
@MODULE_REGISTRY.register("naive_extractor")
class NaiveExtractor(BaseModule):
    def __init__(self, tokenizer, device="cuda", **kwargs):
        ...

@MODULE_REGISTRY.register("your_extractor")
class YourExtractor(BaseModule):
    def __init__(self, tokenizer, device="cuda", **kwargs):
        ...

# Custom reward function
@ENV_REGISTRY.register("your_new_env")
class YourNewEnv(MemoryBankEnv):
    def compute_reward(
        self,
        predictions: Dict[str, List[str]],
        ground_truths: Dict[str, Any],
        num_generations: int,
        **kwargs,
    ) -> Dict[str, torch.Tensor]:
        # Implement custom reward calculation
        pass

# Custom training configuration
# Customize the environment, agent, and training parameters during training.
```

Contributions to MemFactory are welcome.
This project is licensed under the Apache License 2.0.
If you find MemFactory useful for your research, please cite our paper:
```bibtex
@misc{guo2026memfactoryunifiedinference,
  title={MemFactory: Unified Inference & Training Framework for Agent Memory},
  author={Ziliang Guo and Ziheng Li and Bo Tang and Feiyu Xiong and Zhiyu Li},
  year={2026},
  eprint={2603.29493},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.29493},
}
```

We thank the open-source community for the excellent tools and libraries, as well as all contributors who have helped improve this project.
