[HipDNN] Layernorm bwd frontend and CPU reference#6566

Open
brentmaas wants to merge 7 commits into develop from users/brentmaas/layernorm-bwd-frontend

@brentmaas

Contributor

Motivation

Generate the frontend code and implement the CPU reference for backward layernorm in HipDNN.

Technical Details

  • Generate the backward layernorm frontend code.
  • Fix issues in generated code.
  • Implement a backward layernorm CPU reference.
  • Add tests for frontend and CPU reference.

Test Plan

Build and run the check target.

Test Result

All new and existing tests pass.

Submission Checklist

@brentmaas brentmaas added the "organization: streamhpc" and "project: hipdnn" labels Apr 20, 2026
@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-schema branch from 8452381 to 2497b3d Compare April 22, 2026 12:50
@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch 2 times, most recently from 3df0f50 to 9af89c1 Compare April 29, 2026 15:42
@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-schema branch from ca6e262 to 97be077 Compare April 30, 2026 07:49
@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch from 9af89c1 to af9142d Compare April 30, 2026 07:49
@EwanC EwanC force-pushed the users/brentmaas/layernorm-bwd-schema branch 2 times, most recently from df3c852 to 963e9de Compare May 22, 2026 07:23
Base automatically changed from users/brentmaas/layernorm-bwd-schema to develop May 22, 2026 11:48
@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch 4 times, most recently from 5892360 to 6a69aa7 Compare May 28, 2026 07:42
@codecov-commenter

codecov-commenter commented May 28, 2026

Codecov Report

❌ Patch coverage is 92.09402% with 111 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...s/cpu_graph_executor/detail/LayernormBpropPlan.hpp 83.33% 3 Missing and 22 partials ⚠️
...hipdnn_frontend/detail/LayernormBackwardPacker.hpp 75.00% 4 Missing and 15 partials ⚠️
...pdnn_frontend/detail/LayernormBackwardUnpacker.hpp 81.25% 3 Missing and 15 partials ⚠️
...ude/hipdnn_frontend/node/LayernormBackwardNode.hpp 91.63% 6 Missing and 12 partials ⚠️
...dnn_test_sdk/utilities/CpuFpReferenceLayernorm.hpp 91.72% 9 Missing and 5 partials ⚠️
...scriptors/LayernormBackwardOperationDescriptor.cpp 98.59% 0 Missing and 5 partials ⚠️
.../hipdnn/frontend/include/hipdnn_frontend/Graph.hpp 88.89% 3 Missing and 2 partials ⚠️
...n/data_sdk/include/hipdnn_data_sdk/types/Int32.hpp 0.00% 3 Missing ⚠️
...ects/hipdnn/backend/src/BackendEnumStringUtils.hpp 92.31% 1 Missing and 1 partial ⚠️
...aph_executor/detail/LayernormBpropSignatureKey.hpp 98.40% 0 Missing and 2 partials ⚠️

❌ Your project status has failed because the head coverage (76.92%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6566      +/-   ##
===========================================
+ Coverage    71.33%   71.40%   +0.07%     
===========================================
  Files         2628     2636       +8     
  Lines       413045   414443    +1398     
  Branches     61875    62073     +198     
===========================================
+ Hits        294615   295905    +1290     
- Misses       96656    96686      +30     
- Partials     21774    21852      +78     
Flag Coverage Δ *Carryforward flag
TensileLite 76.65% <ø> (ø) Carriedforward from 0fd8b2c
hipBLAS 90.81% <ø> (ø) Carriedforward from 0fd8b2c
hipBLASLt 41.35% <ø> (ø) Carriedforward from 0fd8b2c
hipCUB 82.68% <ø> (ø) Carriedforward from 0fd8b2c
hipDNN 86.08% <92.09%> (+0.17%) ⬆️
hipFFT 50.17% <ø> (ø) Carriedforward from 0fd8b2c
hipRAND 76.12% <ø> (ø) Carriedforward from 0fd8b2c
hipSOLVER 69.18% <ø> (ø) Carriedforward from 0fd8b2c
hipSPARSE 86.55% <ø> (ø) Carriedforward from 0fd8b2c
rocBLAS 48.06% <ø> (ø) Carriedforward from 0fd8b2c
rocFFT 46.30% <ø> (ø) Carriedforward from 0fd8b2c
rocRAND 57.07% <ø> (ø) Carriedforward from 0fd8b2c
rocSOLVER 76.92% <ø> (ø) Carriedforward from 0fd8b2c
rocSPARSE 72.37% <ø> (ø) Carriedforward from 0fd8b2c
rocThrust 91.36% <ø> (ø) Carriedforward from 0fd8b2c

*This pull request uses carry forward flags.

Files with missing lines Coverage Δ
...pdnn/backend/src/descriptors/DescriptorFactory.cpp 96.30% <100.00%> (+0.08%) ⬆️
...scriptors/LayernormBackwardOperationDescriptor.hpp 100.00% <100.00%> (ø)
...cts/hipdnn/backend/src/descriptors/NodeFactory.cpp 100.00% <100.00%> (ø)
...rontend/attributes/LayernormBackwardAttributes.hpp 100.00% <100.00%> (ø)
...clude/hipdnn_frontend/detail/OperationUnpacker.hpp 100.00% <100.00%> (ø)
...s/cpu_graph_executor/CpuReferenceGraphExecutor.hpp 62.61% <100.00%> (+0.66%) ⬆️
...aph_executor/detail/LayernormFpropSignatureKey.hpp 86.99% <100.00%> (-0.10%) ⬇️
.../cpu_graph_executor/detail/PlanBuilderRegistry.hpp 100.00% <ø> (ø)
...graph_executor/detail/PlanRegistrySignatureKey.hpp 100.00% <ø> (ø)
...ects/hipdnn/backend/src/BackendEnumStringUtils.hpp 98.69% <92.31%> (-0.16%) ⬇️
... and 9 more

... and 3 files with indirect coverage changes

@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch 2 times, most recently from 4a09af0 to c253cda Compare June 1, 2026 15:48
@brentmaas brentmaas marked this pull request as ready for review June 2, 2026 11:55
@brentmaas brentmaas requested a review from a team as a code owner June 2, 2026 11:55
@brentmaas brentmaas requested review from EwanC and sbalint98 June 2, 2026 12:06
Comment thread projects/hipdnn/backend/include/HipdnnBackendDescriptorType.h Outdated
Comment thread projects/hipdnn/frontend/tests/TestGraphLayernormBackward.cpp
Comment thread projects/hipdnn/frontend/tests/TestLayernormBackwardAttributes.cpp
Comment thread projects/hipdnn/frontend/tests/TestLayernormBackwardAttributes.cpp Outdated
Comment thread projects/hipdnn/frontend/tests/TestLayernormBackwardNode.cpp Outdated
Comment thread projects/hipdnn/frontend/tests/TestLayernormBackwardAttributes.cpp
@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch from 2c0b8c5 to 328b560 Compare June 5, 2026 10:09
^ (static_cast<std::size_t>(static_cast<int>(dxDataType)) << 16)
^ (static_cast<std::size_t>(static_cast<int>(scaleBiasDataType)) << 8)
^ (static_cast<std::size_t>(static_cast<int>(meanInvVarianceDataType)) << 12)
^ (static_cast<std::size_t>(static_cast<int>(outputDataType)) << 16)
^ (static_cast<std::size_t>(static_cast<int>(computeDataType)) << 20);

Contributor

Nit: I think here we can drop the cast to int.

Contributor Author

I must've copied that from the forward pass. I've removed it in both the forward and backward pass now.
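
For illustration only, a minimal sketch of the simplification discussed in this thread, assuming the data types are scoped enums with non-negative enumerators. `DataTypeSketch` and `combineKeySketch` are hypothetical names, not identifiers from the PR, and the shift amount is arbitrary.

```cpp
#include <cstddef>

// Hypothetical enum standing in for the project's data type enum.
enum class DataTypeSketch : int
{
    Float    = 0,
    Half     = 1,
    BFloat16 = 2
};

// Combines two enum values into one key. A scoped enum converts to
// std::size_t with a single static_cast, so no intermediate cast to int
// is required.
inline std::size_t combineKeySketch(DataTypeSketch ioType, DataTypeSketch computeType)
{
    return static_cast<std::size_t>(ioType)
           ^ (static_cast<std::size_t>(computeType) << 4);
}
```

With these hypothetical enumerators, `combineKeySketch(DataTypeSketch::Half, DataTypeSketch::BFloat16)` evaluates to `1 ^ (2 << 4)`, which is 33.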

@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch 2 times, most recently from bdb7495 to 8b51606 Compare June 5, 2026 12:20

@EwanC EwanC left a comment

Contributor

Looking good, just a couple more minor comments.

Comment thread projects/hipdnn/backend/src/BackendEnumStringUtils.hpp
Comment thread projects/hipdnn/backend/tests/TestBackendEnumStringUtils.cpp Outdated
@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch from e85af0b to 519d1d2 Compare June 5, 2026 16:42
Comment on lines +188 to +189
CHECK_TENSOR_TYPE(tensorMap, nodeAttributes->dscale_tensor_uid(), OutputDataTypeEnum);
CHECK_TENSOR_TYPE(tensorMap, nodeAttributes->dbias_tensor_uid(), OutputDataTypeEnum);

Contributor

These type checks don't match the execution parameter types in projects/hipdnn/test_sdk/include/hipdnn_test_sdk/utilities/CpuFpReferenceLayernorm.hpp bprop.

I think they should be verified against ScaleBiasDataTypeEnum.

Contributor Author

Minor mistake, should be fixed now.

Comment on lines +129 to +138
utilities::CpuFpReferenceLayernorm::bprop(*shallowDyTensor,
                                          *shallowXTensor,
                                          *shallowScaleTensor,
                                          *shallowDxTensor,
                                          *shallowDscaleTensor,
                                          *shallowDbiasTensor,
                                          epsilon,
                                          shallowMeanTensor.get(),
                                          shallowInvVarianceTensor.get(),
                                          _params.normalizedDimCount);

Contributor

ComputeDataType template param isn't passed explicitly so it will default to float. You will need to explicitly pass the template params.

Contributor Author

I wrongly assumed that ComputeDataType would be deduced from the arguments. Fixed now.
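
A minimal sketch of why the default applies, using a hypothetical function rather than the real bprop signature: a template parameter that no function argument mentions cannot be deduced, so it falls back to its default unless it is spelled out at the call site. `usesDoubleComputeSketch` is an illustrative name only.

```cpp
#include <type_traits>

// Hypothetical stand-in for a reference routine. IoType is deduced from the
// argument; ComputeType appears in no parameter, so it can only come from
// the default (float) or from an explicit template argument list.
template <typename IoType, typename ComputeType = float>
constexpr bool usesDoubleComputeSketch(IoType /*value*/)
{
    return std::is_same_v<ComputeType, double>;
}
```

Calling `usesDoubleComputeSketch(1.0)` deduces `IoType = double` but still computes in `float`; only `usesDoubleComputeSketch<double, double>(1.0)` selects `double` as the compute type.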

Comment on lines +71 to +80
// Mean/inv_variance type: use mean if present, otherwise default to IO type (dy type)
if(nodeAttributes->mean_tensor_uid().has_value())
{
    auto meanTensorAttr = tensorMap.at(nodeAttributes->mean_tensor_uid().value());
    meanInvVarianceDataType = meanTensorAttr->data_type();
}
else
{
    meanInvVarianceDataType = dyDataType;
}

@SamuelReeder SamuelReeder Jun 11, 2026

Contributor

Do we need to set meanInvVarianceDataType if the tensors are omitted?

Contributor Author

It was necessary to set meanInvVarianceDataType to dyDataType when mean and inverse variance were omitted, for signature key lookup. I tried setting it to UNSET instead and adding plan builders for the omitted case, but that led to a bunch of compilation issues everywhere with void being the type, so I think this is the cleaner solution for now.

Contributor

I think more tests should be added here, in line with the similar lowering integration files.

Comment on lines +12 to +13
// Standard LayernormBackward constants for testing get/set of valid operations.
// These represent "any valid layernormbackward" — specific values are not significant.

Contributor

Specific values are insignificant, but we at least try to keep UIDs unique across all these constant files to avoid mixing up constants. Please check that these UID values are unique, and if not, find a unique range.

Contributor Author

I've changed the UIDs to unique values.

Comment on lines +1257 to +1285
/** @brief Output gradient tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_DY = 3600,

/** @brief Input tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_X = 3601,

/** @brief Scale tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_SCALE = 3602,

/** @brief Mean tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_MEAN = 3603,

/** @brief Inverse variance tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_INV_VARIANCE = 3604,

/** @brief Epsilon tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_EPSILON = 3605,

/** @brief Input gradient tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_DX = 3606,

/** @brief Scale gradient tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_DSCALE = 3607,

/** @brief Bias gradient tensor for backward layernorm */
HIPDNN_ATTR_OPERATION_LAYERNORM_BACKWARD_DBIAS = 3608,

/** @brief Number of normalized dimensions for backward layernorm */
HIPDNN_ATTR_LAYERNORM_BACKWARD_NORMALIZED_DIM_COUNT = 3609,

Contributor

Contributor Author

I've fixed it now (including the strings). I don't think some of these documents existed when I did this.

Comment on lines +96 to +104
// Infer output shape and strides if not set
if(attributes.get_dx()->get_dim().empty())
{
    attributes.get_dx()->set_dim(attributes.get_x()->get_dim());
}
if(attributes.get_dx()->get_stride().empty())
{
    attributes.get_dx()->set_stride(attributes.get_x()->get_stride());
}

Contributor

We should probably infer dy too.

Contributor Author

Missed that, probably because I thought that the input vectors would be known. The minimum set of known input shapes should now be x and scale.

return _wrapper->asDescriptor<LayernormBackwardOperationDescriptor>();
}

void setTensors() const

Contributor

Should set normalized_dim_count here too.

Contributor Author

Done, including the relevant tests.

namespace hipdnn_backend
{

void LayernormBackwardOperationDescriptor::finalize()

Contributor

I think this should also check that mean and inv_variance are either both present or both absent.

Contributor Author

Added to finalize and its tests.
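
A minimal sketch of such a presence check, assuming the optional tensors are tracked as optional UIDs. `statsPresenceConsistentSketch` is a hypothetical helper, not the code added to finalize().

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: returns true when mean and inv_variance are either
// both provided or both omitted, and false when only one of them is set.
inline bool statsPresenceConsistentSketch(const std::optional<int64_t>& meanUid,
                                          const std::optional<int64_t>& invVarianceUid)
{
    return meanUid.has_value() == invVarianceUid.has_value();
}
```

A finalize-style routine could reject the descriptor whenever this returns false.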

Contributor

Missing the addition to this function.

Contributor Author

I can't see which function you mean, but I assume it was hipdnnGetOperationTypeString. I've fixed a missing case there.

hipdnn_data_sdk::utilities::iterateAlongDimensions(
    normalizedDims, [&](const std::vector<int64_t>& normIndices) {
        auto fullIndices
            = buildFullIndices(batchIndices, normIndices, ndim, normalizedDimCount);

@SamuelReeder SamuelReeder Jun 11, 2026

Contributor

buildFullIndices is broken for mixed-convention tensors, where only some are one-padded.

Suppose:

scale.shape = [C, H, W]        // reduced-rank
mean.shape  = [N, 1, 1, 1]     // one-padded

And we have:

batchIndices.size() == 4
normIndices.size()  == 3

The helper will assume both are reduced-rank, and will create:

fullIndices = [n, 0, 0, 0, c, h, w]

We could always pass reduced-rank indices here, but I think it's better to modify the helper to compute batchRank = ndim - normalizedDimCount, take batchRank indices from batchIndices, and take the last normalizedDimCount entries from normIndices.

We should also make sure these cases are tested.

Contributor Author

I've rewritten buildFullIndices to just take normalizedDimCount or ndim - normalizedDimCount dimensions and added some extra tests to cover odd cases.
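
A minimal sketch of the approach suggested in this thread, not the PR's actual helper. It assumes the batch coordinates sit at the front of batchIndices and the normalized coordinates at the back of normIndices, which holds for both the reduced-rank and the one-padded convention; `buildFullIndicesSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: builds a full index of length ndim from the first
// (ndim - normalizedDimCount) entries of batchIndices and the last
// normalizedDimCount entries of normIndices.
inline std::vector<int64_t> buildFullIndicesSketch(const std::vector<int64_t>& batchIndices,
                                                   const std::vector<int64_t>& normIndices,
                                                   int64_t ndim,
                                                   int64_t normalizedDimCount)
{
    assert(normalizedDimCount >= 0 && normalizedDimCount <= ndim);
    const auto batchRank = static_cast<std::size_t>(ndim - normalizedDimCount);
    const auto normRank  = static_cast<std::size_t>(normalizedDimCount);
    // Each input must hold at least as many entries as are taken from it.
    assert(batchIndices.size() >= batchRank);
    assert(normIndices.size() >= normRank);

    std::vector<int64_t> fullIndices;
    fullIndices.reserve(batchRank + normRank);
    // Leading batch coordinates.
    fullIndices.insert(fullIndices.end(),
                       batchIndices.begin(),
                       batchIndices.begin() + static_cast<std::ptrdiff_t>(batchRank));
    // Trailing normalized coordinates.
    fullIndices.insert(fullIndices.end(),
                       normIndices.end() - static_cast<std::ptrdiff_t>(normRank),
                       normIndices.end());
    return fullIndices;
}
```

With ndim = 4 and normalizedDimCount = 3, the one-padded batch index [n, 0, 0, 0] combined with [c, h, w] gives [n, c, h, w], the same result as the reduced-rank batch index [n].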

const auto& dims = dy.dims();
auto ndim = static_cast<int64_t>(dims.size());

if(ndim < 1)

Contributor

For optional improved handling, consider also checking:

  • dy.dims() == x.dims()
  • dx.dims() == x.dims()
  • scale shape is compatible with normalizedDimCount
  • mean/rstd shape is compatible with batch dims
  • one-padded and reduced-rank combinations are valid

Some of these may also be appropriate for pre_validate_node.

Contributor Author

I've added a bunch of checks to pre_validate_node and added tests for them.
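
As a rough sketch of one check from the list above, the scale shape can be compared against the normalized dimensions of x. The interpretation of "compatible" here is an assumption: the scale is accepted either in reduced-rank form (only the normalized dims) or in one-padded form (same rank as x, leading dims equal to 1). `scaleShapeCompatibleSketch` is a hypothetical name, not a function from the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: returns true when scaleDims matches the last
// normalizedDimCount dims of xDims, optionally preceded by 1-sized padding
// dims that bring it up to the rank of x.
inline bool scaleShapeCompatibleSketch(const std::vector<int64_t>& xDims,
                                       const std::vector<int64_t>& scaleDims,
                                       int64_t normalizedDimCount)
{
    const auto rank      = static_cast<int64_t>(xDims.size());
    const auto scaleRank = static_cast<int64_t>(scaleDims.size());
    if(normalizedDimCount < 1 || normalizedDimCount > rank)
    {
        return false;
    }
    if(scaleRank != normalizedDimCount && scaleRank != rank)
    {
        return false;
    }
    // Number of leading padding dims: 0 for reduced-rank scale tensors.
    const auto padding = scaleRank - normalizedDimCount;
    for(int64_t i = 0; i < padding; ++i)
    {
        if(scaleDims[static_cast<std::size_t>(i)] != 1)
        {
            return false;
        }
    }
    // Trailing dims must equal the normalized dims of x.
    const auto xOffset = rank - normalizedDimCount;
    for(int64_t i = 0; i < normalizedDimCount; ++i)
    {
        if(scaleDims[static_cast<std::size_t>(padding + i)]
           != xDims[static_cast<std::size_t>(xOffset + i)])
        {
            return false;
        }
    }
    return true;
}
```

For x of shape [2, 3, 4, 5] and normalizedDimCount = 3, this accepts scale shapes [3, 4, 5] and [1, 3, 4, 5] and rejects [2, 3, 4, 5] and [4, 5].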

// Builds a standard LayernormBackward graph, lowers via build_operation_graph(handle),
// lifts back with fromBackendDescriptor(), and performs comprehensive field-by-field
// validation of graph data types, tensor attributes, and operation parameters.
TEST_F(IntegrationLayernormBackwardDescriptorLifting, BasicLayernormBackwardRoundTrip)

Contributor

Verify optional tensors too.

Contributor Author

Done

Comment on lines +199 to +201
// After lifting, verifies tensor objects in the node attributes are the same
// shared_ptr instances as in the tensor map (pointer equality).
TEST_F(IntegrationLayernormBackwardDescriptorLifting, LayernormBackwardTensorSharingPreserved)

Contributor

Probably worth verifying optional tensors again. I think LayernormBackwardLiftWithoutFinalization is the same too.

Contributor Author

Done for all relevant tests in this file.

Contributor

I think we're missing any verification that normalized_dim_count is preserved through lifting.

Contributor Author

I think you're right, so I've added checks to this file where relevant.

@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch from 519d1d2 to 53fb612 Compare July 1, 2026 13:55
@therock-pr-bot

therock-pr-bot Bot commented Jul 1, 2026

❌ PR Check — Action Required

Check Status Details
🌿 Branch Name ✅ Pass
📝 PR Title/Description ❌ Fail Error: Title does not follow Conventional Commits style.
Expected: start with a valid type (feat, fix, docs, …).
Desired format: type(optional-scope): short description
───
Error: PR description must reference a JIRA ID, ISSUE ID, or a GitHub closing keyword.
Expected: include a JIRA ID / ISSUE ID line (separator : or -, or omitted; value may be a JIRA key, a number with/without #, or a link), OR a closing keyword + issue reference. Accepted examples:
JIRA ID : TESTAUTO-6039
JIRA ID - #330
JIRA ID #330
ISSUE ID : TESTUTO-3334
ISSUE ID #3334
ISSUE ID - TESTAUTO-3433
ISSUE ID : https://github.com/<org_name>/<repo_name>/issues/1234
Closes #10
Fixes octo-org/octo-repo#100
Resolves: #123
#123
https://github.com/<org_name>/<repo_name>/issues/123
Current: no valid JIRA/ISSUE/closing-keyword reference found
Forbidden Files ✅ Pass
🧪 Unit Test ❌ Fail Error: Source/code files changed without an accompanying unit test.
Expected: add at least one test file named like test_<name>.py / test_<name>.cpp (or <name>_test.*).
Current: code file(s) changed: projects/hipdnn/backend/include/HipdnnBackendAttributeName.h, projects/hipdnn/backend/include/HipdnnBackendDescriptorType.h, projects/hipdnn/backend/include/HipdnnOperationType.h, projects/hipdnn/backend/src/BackendEnumStringUtils.hpp, projects/hipdnn/backend/src/descriptors/DescriptorFactory.cpp (+38 more); no test file found
🔎 pre-commit ❌ Fail Error: Check concluded with failure.
🚫 Draft PR 🔜 To Be Enabled
🚩 Feature Flag 🔜 To Be Enabled
📊 Code Coverage 🔜 To Be Enabled

⚠️ 3 policy check(s) failed. Please address the issues above before this PR can be Reviewed.

🚫 Please fix the failed policies

  • ❌ PR Title/Description
  • ❌ Unit Test
  • ❌ pre-commit

The Not ready to Review label was added to this PR. Once all policies pass, the label is removed automatically.

📖 Need help? See the Policy FAQ for details on every check and how to fix failures.

@therock-pr-bot

therock-pr-bot Bot commented Jul 1, 2026

🚫 Please fix the failed policies before requesting reviews.

The following policy checks failed:

  • ❌ PR Title/Description
  • ❌ Unit Test

The Not ready to Review label has been added to this PR.
Once all policies pass, the label will be removed automatically.

@brentmaas brentmaas force-pushed the users/brentmaas/layernorm-bwd-frontend branch from a24b304 to 832ec00 Compare July 1, 2026 17:17
5 participants