MemFactory - A Framework for Memory Processing and Reinforcement Learning


Paper: arXiv | PDF

Overview

MemFactory is a modular framework for memory processing that integrates full memory lifecycle management with reinforcement learning training. It provides core components for memory extraction, update decisions, storage management, organization, retrieval, and response generation, while also supporting RL-based optimization of memory handling capabilities.

πŸ—οΈ Framework Architecture

Core Components

[Framework architecture diagram]

πŸ”§ Key Features

1. Unified RL Training and Evaluation Framework

This project provides a standardized and cohesive training and evaluation pipeline built around MemTrainer. Users only need to specify a target Agent and an interaction Env to launch the full training workflow:

  • Environment Layer (Envs): Defines the unified data format for task datasets and provides environment feedback, i.e. the reward mechanism in reinforcement learning.
  • Execution Layer (Agents): Performs rollouts inside the specified environment and continuously interacts with it to produce trajectories.
  • Optimization Layer (Trainer): Receives trajectories generated by the agent and includes policy optimization algorithms such as GRPO (Group Relative Policy Optimization) to iteratively improve model capability.
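The three-layer loop can be sketched in a few lines of illustrative Python. All names here (`ToyEnv`, `ToyAgent`, `collect_trajectory`) are hypothetical stand-ins, not the actual MemTrainer API:

```python
# Illustrative sketch of the Env -> Agent -> Trainer data flow.
# These names are hypothetical; see the repository for the real MemTrainer API.

class ToyEnv:
    """Environment layer: supplies tasks and scores answers (the reward)."""
    def reset(self):
        return "question: what is 2 + 2?"

    def reward(self, answer):
        return 1.0 if answer.strip() == "4" else 0.0

class ToyAgent:
    """Execution layer: rolls out in the environment to produce trajectories."""
    def rollout(self, observation):
        return "4"  # a real agent would query its policy model here

def collect_trajectory(agent, env):
    """Optimization-layer input: one (observation, action, reward) record."""
    obs = env.reset()
    action = agent.rollout(obs)
    return {"obs": obs, "action": action, "reward": env.reward(action)}
```

In the real framework the trainer consumes batches of such trajectories and applies GRPO updates to the policy.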

2. Lego-Style Modular Agent Architecture

This framework follows the paradigm of "assembling agents from modules". A complete agent is built by combining core modules with single responsibilities as needed. Supported module types include:

  • Extractors: Extract high-value memory snippets from conversations or observations (e.g. Naive Extractor).
  • Updaters: Decide how newly extracted information should be merged into or overwrite historical memory.
  • Retrievers: Recall relevant information from the memory state according to the current task context (e.g. Naive Retriever, LRM-Retriever).
  • Agents: Composite components that combine the capabilities of multiple modules.
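The module types above compose as follows. This is a Lego-style assembly sketch with illustrative class and method names, not the framework's real module interfaces:

```python
# Hypothetical single-responsibility modules combined into a composite agent.

class NaiveExtractor:
    def extract(self, turn):
        return [turn]  # keep the whole turn as a memory snippet

class NaiveUpdater:
    def update(self, memory, snippets):
        return memory + snippets  # append-only merge into historical memory

class NaiveRetriever:
    def retrieve(self, memory, query, k=2):
        return [m for m in memory if query.lower() in m.lower()][:k]

class ComposedAgent:
    """Composite component wiring single-responsibility modules together."""
    def __init__(self, extractor, updater, retriever):
        self.extractor = extractor
        self.updater = updater
        self.retriever = retriever

agent = ComposedAgent(NaiveExtractor(), NaiveUpdater(), NaiveRetriever())
```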

3. End-to-End Memory Processing and Policy Evolution

Once assembled, an agent can complete the full memory-processing lifecycle.

  • Memory Processing Pipeline: Environment Input -> Extractor (memory extraction) -> Updater (state update) -> Retriever (context retrieval) -> Environment Output.
  • End-to-End Optimization: With the GRPO training framework, the agent not only performs memory reading and writing, but can also jointly optimize its internal extraction, update, and retrieval policies through sparse or dense reward signals provided by the environment.
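A single pass through the memory-processing pipeline can be sketched as a chain of stages. The helper functions below are hypothetical simplifications, not the framework's actual API:

```python
# One pipeline step: Environment Input -> extract -> update -> retrieve -> Output.
# All functions are illustrative stand-ins for the real modules.

def extract(turn):
    return [turn]

def update(memory, snippets):
    return memory + snippets

def retrieve(memory, query):
    return [m for m in memory if query.lower() in m.lower()]

def memory_step(memory, turn, query):
    """Process one environment input and return (new memory, retrieved context)."""
    memory = update(memory, extract(turn))
    context = retrieve(memory, query)
    return memory, context
```

During RL training, the reward from the environment output is propagated back so that the extraction, update, and retrieval policies improve jointly.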

4. High Customizability for Further Research

From modules to agents to environments, the framework offers substantial flexibility for research and follow-up innovation:

  • Multiple Implementations Under Shared Interfaces: Each low-level module can have multiple implementations, ranging from simple rule-based modules to more advanced learnable ones, and can be swapped in and out.
  • Extensibility at Every Level: From replacing the logic of low-level Modules, to redesigning mid-level Agents, to customizing top-level Envs and reward rules, the framework exposes clear extension points so researchers can focus on the core algorithm rather than rebuilding infrastructure.
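Swapping implementations under a shared interface typically works through a registry. The sketch below uses a plain-dict registry as a hypothetical stand-in for the framework's `MODULE_REGISTRY`:

```python
# Hypothetical registry sketch: multiple implementations behind one interface,
# swappable by name. Not the framework's actual MODULE_REGISTRY implementation.

MODULE_REGISTRY = {}

def register(name):
    def decorator(cls):
        MODULE_REGISTRY[name] = cls
        return cls
    return decorator

@register("naive_retriever")
class NaiveRetriever:
    def retrieve(self, memory, query):
        return [m for m in memory if query in m]

@register("lrm_retriever")
class LRMRetriever:
    def retrieve(self, memory, query):
        # a learnable retriever would rank candidates with a model here
        return sorted((m for m in memory if query in m), key=len)

# Swapping is a one-line change: select a different registered name.
retriever = MODULE_REGISTRY["naive_retriever"]()
```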

πŸš€ Quick Start

1. Installation and Environment Setup

πŸ“Œ Environment and Hardware Requirements:

  • Python: 3.10 recommended
  • GPU: An A800 80G or better is recommended
```bash
# Clone the repository
git clone git@github.com:MemTensor/MemFactory.git
cd MemFactory

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env to configure API keys and database connections

# Download training data (optional)
```

2. Launch RL Training

We provide reinforcement learning examples under examples/. Currently supported:

(1) Inspired by Memory-R1, optimizing the agent's memory extraction and update policies.

(2) Inspired by MemoryAgent, optimizing the agent's long-context handling capability.

(3) Inspired by RMM, optimizing the agent's memory retrieval policy.

Reproducing the Paper Results with MemoryAgent

In the empirical study reported in the paper, we focus on MemoryAgent, because the original MemAgent work releases a comprehensive public dataset for both training and evaluation, making it an ideal benchmark for reproducible experiments.

examples/RunMemoryAgent1.7B.sh and examples/RunMemoryAgent4B.sh are the scripts used in the paper to train MemoryAgent on the open-source data released by MemAgent. You can reproduce the reported results with the same scripts:

```bash
bash examples/RunMemoryAgent1.7B.sh
bash examples/RunMemoryAgent4B.sh
```

The training data can be downloaded from datas. After downloading, update DATA_PATH and MODEL_PATH in the scripts to match your local environment. Once training is finished, you can use evaluation/evaluate_orchestrator.py to evaluate checkpoints on the three benchmark test sets used in the paper.

Reported results in the paper (avg@4, averaged over 4 independent runs):

| Model | Setting | eval_50 | eval_100 | eval_fwe_16384 | Average |
|---|---|---|---|---|---|
| Qwen3-1.7B | Base checkpoint | 0.4727 | 0.4297 | 0.0332 | 0.3118 |
| Qwen3-1.7B | + MemFactory RL | 0.5684 | 0.4863 | 0.0195 | 0.3581 |
| Qwen3-4B-Instruct | Base checkpoint | 0.6523 | 0.5645 | 0.6270 | 0.6146 |
| Qwen3-4B-Instruct | + MemFactory RL | 0.7051 | 0.6309 | 0.6426 | 0.6595 |

3. Further Extension

See "πŸ› οΈ Developer Guide"

βš™οΈ Configuration

Environment Variables (.env)

```bash
# LLM service configuration
OPENAI_API_KEY=your-openai-key
OPENAI_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4.1-nano

# SwanLab configuration (optional)
SWANLAB_API_KEY=your-swanlab-key

# Embedding service configuration
EMBEDDING_API_KEY=your-embedding-key
EMBEDDING_BASE_URL=your-embedding-api-url
EMBEDDING_MODEL=bge-m3
EMBEDDING_DIM=1024

# Neo4j graph database configuration
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password

# Milvus vector database configuration
MILVUS_URI=http://localhost:19530
MILVUS_USER=root
MILVUS_PASSWORD=your-password
```

πŸ› οΈ Developer Guide

Extending the Framework

```python
# Custom module: register an implementation under a new name so it can be
# selected by configuration. MODULE_REGISTRY and BaseModule come from the
# framework.
@MODULE_REGISTRY.register("naive_extractor")
class NaiveExtractor(BaseModule):
    def __init__(self, tokenizer, device="cuda", **kwargs):
        ...

@MODULE_REGISTRY.register("your_extractor")
class YourExtractor(BaseModule):
    def __init__(self, tokenizer, device="cuda", **kwargs):
        # Initialize your custom extractor here
        ...
```
Training Customization

```python
# Custom reward function: subclass an environment and override compute_reward.
# ENV_REGISTRY and MemoryBankEnv come from the framework.
@ENV_REGISTRY.register("your_new_env")
class YourNewEnv(MemoryBankEnv):
    def compute_reward(
        self,
        predictions: Dict[str, List[str]],
        ground_truths: Dict[str, Any],
        num_generations: int,
        **kwargs,
    ) -> Dict[str, torch.Tensor]:
        # Implement custom reward calculation
        pass
```
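As a concrete illustration, a reward could be as simple as exact-match scoring. The standalone function below is a hypothetical simplification of such a `compute_reward` (no batching or tensors):

```python
def exact_match_reward(predictions, ground_truths):
    """Return 1.0 per prediction that matches an accepted answer, else 0.0.

    Hypothetical sketch: real environments may use dense or shaped rewards.
    """
    rewards = []
    for pred, answers in zip(predictions, ground_truths):
        normalized = pred.strip().lower()
        hit = any(normalized == a.strip().lower() for a in answers)
        rewards.append(1.0 if hit else 0.0)
    return rewards
```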


You can also customize the environment, agent, and training parameters used during training through the training configuration.

🀝 Contributing and Community

Contributions to MemFactory are welcome.

πŸ“„ License

This project is licensed under the Apache License 2.0.

πŸ“– Citation

If you find MemFactory useful for your research, please cite our paper:

```bibtex
@misc{guo2026memfactoryunifiedinference,
      title={MemFactory: Unified Inference & Training Framework for Agent Memory},
      author={Ziliang Guo and Ziheng Li and Bo Tang and Feiyu Xiong and Zhiyu Li},
      year={2026},
      eprint={2603.29493},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.29493},
}
```

πŸ™ Acknowledgements

We thank the open-source community for the excellent tools and libraries, as well as all contributors who have helped improve this project.
