author: Jtss-ux
github_id: Jtss-ux
BPB: 1.1301
Technique: LeakyReLU(0.5)^2 + Legal TTT (PR #461) + Parallel Muon
Code: https://github.com/Jtss-ux/parameter-golf/blob/main/train_gpt.py
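
Note on the Technique field: LeakyReLU(0.5)^2 is read here as the square of a leaky ReLU with negative slope 0.5. The sketch below is only an illustration of that reading in plain Python; the function name `leaky_relu_squared` and its parameter are hypothetical and are not taken from train_gpt.py, which may implement the activation differently (for example on tensors).

```python
def leaky_relu_squared(x: float, negative_slope: float = 0.5) -> float:
    """Square of a leaky ReLU (assumed reading of LeakyReLU(0.5)^2).

    For x >= 0 the leaky ReLU returns x; for x < 0 it returns
    negative_slope * x. The result is then squared, so the output
    is always non-negative.
    """
    y = x if x >= 0 else negative_slope * x
    return y * y


if __name__ == "__main__":
    # Print the activation at a few sample points.
    for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(value, leaky_relu_squared(value))
```

Under this reading, negative inputs still contribute a nonzero output (scaled by 0.25 relative to the positive side when the slope is 0.5), unlike a plain ReLU squared, which maps all negative inputs to zero.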