Hi DeepSpeed team,
I’d like to suggest adding support for EXAONE 4.0 (LGAI‑EXAONE/EXAONE‑4.0‑32B and 1.2B) in DeepSpeed.
Currently, this model is officially supported by vLLM (via PR #21060), but DeepSpeed doesn’t yet recognize or integrate it.
This means we can’t fully take advantage of ZeRO, offload, and tensor parallelism with EXAONE 4.0.
EXAONE 4.0 is a hybrid LLM (combining reasoning and non‑reasoning modes) and provides strong performance on reasoning tasks and multilingual content (Korean, English, Spanish). Benchmarks like MMLU‑Redux and MATH show impressive numbers (MMLU‑Redux 92.3, MATH/AIME ~85%).
What we’d like to see:
- Register `Exaone4ForCausalLM` in DeepSpeed’s supported model registry.
- Add support for initialization, parameter loading, and optimizer state management for the EXAONE 4.0 architecture.
- Ensure compatibility with ZeRO-offload (CPU/NVMe) and tensor parallel inference.
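For concreteness, here is a minimal sketch of the ZeRO-3 + CPU-offload setup we have in mind once the architecture is recognized. The config keys are standard DeepSpeed ZeRO options, but the specific values (batch size, offload targets) are illustrative rather than tuned, and the initialization flow shown in the comments is hypothetical for EXAONE 4.0 today:

```python
# Sketch of a ZeRO-3 config with CPU offload for parameters and
# optimizer state -- the features we'd like to use with EXAONE 4.0.
# Keys are standard DeepSpeed config options; values are illustrative.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
}

# With a registered architecture, initialization would follow the usual
# DeepSpeed flow (hypothetical for EXAONE 4.0 at the moment):
#
#   import deepspeed
#   from transformers import AutoModelForCausalLM
#
#   model = AutoModelForCausalLM.from_pretrained("LGAI-EXAONE/EXAONE-4.0-32B")
#   engine, _, _, _ = deepspeed.initialize(model=model, config=ds_config)
```

Swapping `"device": "cpu"` for `"device": "nvme"` (plus an `nvme_path`) would cover the NVMe-offload case mentioned above.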
I’d be happy to share test logs or work on a draft PR if needed.
Adding EXAONE 4.0 support would greatly benefit the growing Korean and multilingual AI community.
Thanks for your time and consideration!