Welcome to the TyphoonMLA Community Repository!
For more details on the TyphoonMLA method, see our preprint.
Run TyphoonMLA on your own machine or on a server.
Results are collected from community contributors. Feel free to open a pull request with your results from other platforms. We will show all benchmark results on the main page.
Start a Docker container from our pre-built image:
docker run -it --rm --gpus=all --runtime=nvidia --name tree-mla acyuzuguler/tree-mla:latest /bin/bash
Run experiments and plot results:
bash run.sh
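The local steps above assume a working Docker installation with NVIDIA GPU support. As a minimal pre-flight sketch, the snippet below checks that the `docker` and `nvidia-smi` commands are on the `PATH` before you start the container. The helper names `have_tool` and `preflight` are hypothetical and not part of this repository, and treating `nvidia-smi` as a sign of a usable NVIDIA driver is an assumption; it does not verify that the NVIDIA container runtime itself is configured.

```shell
# Pre-flight check (illustrative sketch, not part of the repository).

# have_tool NAME: succeeds if NAME is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# preflight NAME...: prints one status line per tool and
# returns the number of tools that were not found.
preflight() {
    missing=0
    for tool in "$@"; do
        if have_tool "$tool"; then
            echo "ok:      $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# docker is required by the commands above; nvidia-smi is assumed to
# indicate that an NVIDIA driver is installed on the host.
preflight docker nvidia-smi || echo "Install the missing tools before starting the container."
```

If both tools are reported as `ok`, the `docker run` command above should at least be able to find the Docker CLI and the GPU driver; runtime configuration problems would still only surface when the container starts.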
Start a VM with the Docker image acyuzuguler/tree-mla:latest
Run experiments and plot results:
bash run.sh
Alternatively, build the Docker image locally from the repository root:
docker build -f dockerfiles/Dockerfile -t tree-mla:latest .
We welcome results and feedback from the community. If you run this code on a new architecture, please add your results below and open a pull request.
@misc{yuzuguler2025typhoonmla,
title={TyphoonMLA: A Mixed Naive-Absorb MLA Kernel For Shared Prefix},
author={Ahmet Caner Yüzügüler and Ahmet Çelik and Jiawei Zhuang and Lukas Cavigelli},
year={2025},
eprint={2509.21081},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2509.21081},
}
