
Conversation

@AbhishekAshokDubey commented Jul 13, 2023

Updating the forward function in the Transformer Block class.

The change is simple, but I'll still do my best to explain it below:

As per the original paper, the 'Add & Norm' step of the Transformer applies Layer Norm on top of the sum of the input/residual and the output of self-attention (post-norm). In the current code, Layer Norm is applied to the input first, and the self-attention output is then added back to the input/residual (pre-norm). A sketch of the two formulations follows.
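For concreteness, here is a minimal sketch of the two variants, assuming a Block along the lines of gpt.py with sub-modules named self.sa (self-attention), self.ffwd (feed-forward), and self.ln1 / self.ln2 (LayerNorms); the nn.Identity stubs stand in for the real sub-layers and exist only to make the sketch self-contained:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Transformer block sketch: contrasts pre-norm (current code) with
    the paper's post-norm 'Add & Norm'. Sub-layer stubs are placeholders."""

    def __init__(self, n_embd):
        super().__init__()
        self.sa = nn.Identity()    # placeholder for the self-attention sub-layer
        self.ffwd = nn.Identity()  # placeholder for the feed-forward sub-layer
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd)

    def forward(self, x):
        # Current code (pre-norm): normalize first, then add the residual.
        x = x + self.sa(self.ln1(x))
        x = x + self.ffwd(self.ln2(x))
        return x

    def forward_post_norm(self, x):
        # 'Add & Norm' as described in the original paper (post-norm):
        # add the residual first, then apply LayerNorm to the sum.
        x = self.ln1(x + self.sa(x))
        x = self.ln2(x + self.ffwd(x))
        return x
```

Pre-norm (normalizing before each sub-layer) is the arrangement used in GPT-2-style models and generally trains more stably than the paper's post-norm, which is presumably why the code is written this way.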

@AbhishekAshokDubey changed the title from "Update gpt.py" to "Minor correction in 'Add & Norm' logic in Block Class in gpt.py" on Jul 13, 2023
@reallyigor

See 1:35:33
