Open
Description
Motivation and description
Currently, gelu_tanh is implemented with a sigmoid, which prevents us from pattern matching the GELU and fusing it into GEMM calls for dense layers. See EnzymeAD/Reactant.jl#1420 for details. cc @wsmoses
Possible Implementation
Rename the current gelu_tanh to gelu_sigmoid, then re-implement gelu_tanh to follow the implementation in the original paper.
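For reference, here is a minimal sketch of the two formulations in plain Python. It is only an illustration of the math, not the library's code: the helper names (`_inner`, `sigmoid`) are hypothetical, and it assumes the current sigmoid-based version relies on the identity `(1 + tanh(z)) / 2 == sigmoid(2z)`. The two functions agree numerically; the difference is only in which primitive ops appear, which is what a pattern matcher would see.

```python
import math

# Constants of the tanh approximation of GELU.
SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)
GELU_COEFF = 0.044715


def _inner(x):
    # Shared polynomial argument: sqrt(2/pi) * (x + 0.044715 * x^3).
    # Hypothetical helper, introduced only for this sketch.
    return SQRT_2_OVER_PI * (x + GELU_COEFF * x**3)


def sigmoid(z):
    # Plain logistic function; fine for the moderate inputs used here.
    return 1.0 / (1.0 + math.exp(-z))


def gelu_sigmoid(x):
    # Sigmoid formulation (assumed to match the current behaviour):
    # x * sigmoid(2 * inner(x)).
    return x * sigmoid(2.0 * _inner(x))


def gelu_tanh(x):
    # Explicit tanh formulation: 0.5 * x * (1 + tanh(inner(x))).
    return 0.5 * x * (1.0 + math.tanh(_inner(x)))


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(x, gelu_sigmoid(x), gelu_tanh(x))
```

Keeping both names would let callers choose the sigmoid form explicitly while the tanh form exposes the op sequence that the fusion pass looks for.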