Fix bug in layernorm loop ordering.
njeffrie committed Nov 12, 2024
1 parent c09c876 commit a7c0d33
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions src/cpu/kernels.cc
@@ -489,9 +489,11 @@ namespace ctranslate2 {
       const float variance = std::max(sum_squares / depth - mean * mean, 0.f);
       const float rstd = 1.f / std::sqrt(variance + epsilon);

-      for (dim_t j = 0; j < depth; j += weights_size) {
+      int inner_dim = depth / weights_size;
+      for (dim_t j = 0; j < inner_dim; j ++) {
         for (dim_t k = 0; k < weights_size; k++) {
-          y[j+k] = (x[j+k] - mean) * rstd * gamma[k] + beta[k];
+          int idx = k * inner_dim + j;
+          y[idx] = (x[idx] - mean) * rstd * gamma[k] + beta[k];
         }
       }
     }
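
To make the index change in this hunk easier to check by hand, here is a minimal standalone sketch of the new ordering. It is not the CTranslate2 kernel: `layer_norm_row` is a hypothetical helper, `std::vector` replaces the raw pointers, the mean is assumed to be `sum / depth`, and `depth` is assumed to be an exact multiple of `weights_size`. With `idx = k * inner_dim + j`, `gamma[k]` and `beta[k]` apply to the k-th contiguous block of `inner_dim` elements, whereas the removed `y[j+k]` form cycled through the weights every `weights_size` elements. The two orderings coincide when `weights_size == depth`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not part of CTranslate2).
// Normalizes one row of x.size() values and applies gamma[k] / beta[k]
// to the k-th contiguous block of inner_dim elements.
std::vector<float> layer_norm_row(const std::vector<float>& x,
                                  const std::vector<float>& gamma,
                                  const std::vector<float>& beta,
                                  float epsilon) {
  const std::size_t depth = x.size();
  const std::size_t weights_size = gamma.size();
  // Assumption: depth is an exact multiple of weights_size.
  const std::size_t inner_dim = depth / weights_size;

  // Row statistics; the mean formula is an assumption, the variance and
  // rstd lines mirror the context lines of the diff.
  float sum = 0.f;
  float sum_squares = 0.f;
  for (float v : x) {
    sum += v;
    sum_squares += v * v;
  }
  const float mean = sum / depth;
  const float variance = std::max(sum_squares / depth - mean * mean, 0.f);
  const float rstd = 1.f / std::sqrt(variance + epsilon);

  std::vector<float> y(depth);
  for (std::size_t j = 0; j < inner_dim; ++j) {
    for (std::size_t k = 0; k < weights_size; ++k) {
      // Same index computation as the added lines in the diff.
      const std::size_t idx = k * inner_dim + j;
      y[idx] = (x[idx] - mean) * rstd * gamma[k] + beta[k];
    }
  }
  return y;
}
```

Calling `layer_norm_row({1, 2, 3, 4}, {1, 0}, {0, 5}, 1e-5f)` gives `inner_dim == 2`, so the first two outputs use `gamma[0]`/`beta[0]` and the last two use `gamma[1]`/`beta[1]`; under the removed ordering the weights would instead have alternated element by element.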