
Coding mistake #54

Open

hlxs-c opened this issue Jan 11, 2025 · 1 comment

Comments


hlxs-c commented Jan 11, 2025

File: Codes/ch03/01_main-chapter-code/ch03.ipynb

The forward method of the CausalAttention class applies softmax to the attention scores incorrectly:

[screenshot of the softmax line in CausalAttention.forward]

With dim=1, batched inputs are not handled correctly. When the input has shape [batch_size, num_tokens, d_in], attn_scores has shape [batch_size, num_tokens, num_tokens], so the softmax should be taken along dim=2, i.e. it should be dim=-1.
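A minimal sketch of what the fix looks like, for illustration only (the layer names, scaling, and masking details are assumptions and may differ from the notebook). The point is that with a batched attn_scores tensor of shape [batch_size, num_tokens, num_tokens], the softmax must normalize over the last axis so each query's weights sum to 1:

```python
import torch
import torch.nn as nn


class CausalAttention(nn.Module):
    """Sketch of a single-head causal self-attention layer (details assumed)."""

    def __init__(self, d_in, d_out, context_length, dropout=0.0, qkv_bias=False):
        super().__init__()
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_value = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.dropout = nn.Dropout(dropout)
        # Upper-triangular mask that hides future tokens
        self.register_buffer(
            "mask", torch.triu(torch.ones(context_length, context_length), diagonal=1)
        )

    def forward(self, x):
        # x: [batch_size, num_tokens, d_in]
        b, num_tokens, _ = x.shape
        queries = self.W_query(x)   # [b, num_tokens, d_out]
        keys = self.W_key(x)        # [b, num_tokens, d_out]
        values = self.W_value(x)    # [b, num_tokens, d_out]

        # attn_scores: [b, num_tokens, num_tokens]
        attn_scores = queries @ keys.transpose(1, 2)
        attn_scores = attn_scores.masked_fill(
            self.mask.bool()[:num_tokens, :num_tokens], float("-inf")
        )

        # Normalize over the last dimension (the key axis). With dim=1 the
        # softmax would run over the query axis instead, so the rows of the
        # attention-weight matrix would no longer sum to 1 for batched input.
        attn_weights = torch.softmax(attn_scores / keys.shape[-1] ** 0.5, dim=-1)
        attn_weights = self.dropout(attn_weights)

        context_vec = attn_weights @ values  # [b, num_tokens, d_out]
        return context_vec


# Quick shape check with a batched input
x = torch.randn(2, 4, 3)  # [batch_size=2, num_tokens=4, d_in=3]
ca = CausalAttention(d_in=3, d_out=2, context_length=4)
print(ca(x).shape)  # torch.Size([2, 4, 2])
```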


HeiBoWang commented Jan 11, 2025 via email
