
feat: support post-norm architecture #97

Merged

zhyncs merged 2 commits into lightseekorg:main from Dogacel:attention-drift on May 13, 2026

Conversation

@Dogacel (Contributor) commented on May 12, 2026

The post-norm architecture outperforms pre-norm for speculative decoding models. Two changes are required:

  1. Feed the model its own hidden states after the norm.
  2. Normalize each target model hidden state before feeding it into the FC layer.
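The two steps above can be sketched roughly as follows. This is a minimal illustration, not the actual torchspec implementation: the class and attribute names (`PostNormDraftStep`, `fc`, `target_norm`, `final_norm`) are hypothetical, and a small MLP stands in for the real draft transformer layer.

```python
import torch
import torch.nn as nn


class PostNormDraftStep(nn.Module):
    """Hypothetical sketch of a post-norm draft step for speculative decoding."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Change 2: normalize the target model's hidden state before the FC.
        self.target_norm = nn.LayerNorm(hidden_size)
        # FC fuses the token embedding with a (normalized) hidden state.
        self.fc = nn.Linear(2 * hidden_size, hidden_size)
        # Stand-in for the real draft transformer layer(s).
        self.layer = nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.SiLU())
        # Final norm applied to the draft hidden state before the LM head.
        self.final_norm = nn.LayerNorm(hidden_size)

    def forward(self, embed: torch.Tensor, target_hidden: torch.Tensor,
                steps: int = 3) -> list[torch.Tensor]:
        # First step: consume the target model's hidden state, normed first.
        hidden = self.layer(
            self.fc(torch.cat([embed, self.target_norm(target_hidden)], dim=-1))
        )
        outputs = []
        for _ in range(steps):
            normed = self.final_norm(hidden)  # post-norm output for the LM head
            outputs.append(normed)
            # Change 1: feed back the *post-norm* hidden state, not the raw one.
            hidden = self.layer(self.fc(torch.cat([embed, normed], dim=-1)))
        return outputs
```

The key difference from a pre-norm drafter is in the feedback loop: each subsequent draft step consumes the normalized hidden state rather than the raw residual-stream output.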

Short blog: https://x.com/dogacel0/status/2054200111043949012?s=20

Paper: https://arxiv.org/abs/2605.09992

Comparison of acceptance lengths and throughput. This method shines especially on long-context inputs.

(image: acceptance length and throughput comparison)

I am interested to see whether this new architecture will benefit Kimi drafters.

Disclaimer: I only had one GPU and couldn't test the training pipeline end-to-end; only the unit tests were run. Let me know if there is a way to launch training on a single GPU.


@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 226188362f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread on torchspec/models/draft/deepseek_eagle.py (outdated)
@zhyncs merged commit 068f253 into lightseekorg:main on May 13, 2026
1 of 2 checks passed
@yubofredwang (Collaborator) commented:

Nice work! I will kick off training runs to verify.



3 participants