Aristide021

This commit introduces a vendor-neutral NPU backend that demonstrates architectural patterns common across Neural Processing Units.

The implementation covers key NPU concepts including multi-tier memory hierarchy management, automatic tiling for large tensors, quantization handling, and specialized execution engines. It shows how NPUs manage memory across different tiers (L0/L1/L2/L3), tile operations to fit in on-chip SRAM, and dispatch operations to dedicated compute units.
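To make the tiling idea concrete, here is a minimal, pure-Python sketch of splitting a matmul into tiles that fit a fixed on-chip buffer. It is not the code from this PR: `pick_tile`, `tiled_matmul`, the SRAM budget, and the halving heuristic are all illustrative assumptions.

```python
from itertools import product


def pick_tile(m, n, k, sram_bytes, elem_bytes=4):
    """Pick a square tile edge whose A, B and C tiles fit the SRAM budget.

    Hypothetical heuristic: start from the largest dimension and halve
    the edge until three (t x t) tiles fit. Returns 1 as a last resort.
    """
    t = max(m, n, k)
    while t > 1 and 3 * t * t * elem_bytes > sram_bytes:
        t = (t + 1) // 2
    return t


def tiled_matmul(a, b, tile):
    """Multiply a (m x k) by b (k x n), one tile-sized block at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    # Each (i0, j0, k0) triple is one unit of work that would be staged
    # into on-chip memory on real hardware.
    for i0, j0, k0 in product(range(0, m, tile), range(0, n, tile), range(0, k, tile)):
        for i in range(i0, min(i0 + tile, m)):
            for j in range(j0, min(j0 + tile, n)):
                acc = c[i][j]
                for kk in range(k0, min(k0 + tile, k)):
                    acc += a[i][kk] * b[kk][j]
                c[i][j] = acc
    return c
```

A real backend would also account for data layout, double buffering, and the different capacities of each memory tier; the sketch only shows why the loop nest is blocked.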

This serves as an educational template for developers creating NPU backends, demonstrating BYOC integration while teaching NPU-specific optimization strategies. It uses CPU emulation for testing, so no actual NPU hardware is required.

CC @tqchen - This addresses your feedback from #18201 regarding generic NPU BYOC tutorials.

Member

tqchen commented Aug 28, 2025

cc @mshr-h can you help to take a look

@Aristide021 Aristide021 force-pushed the contrib-npu-generic branch 5 times, most recently from 8fcab1b to f58f54a Compare August 28, 2025 20:03
Contributor

mshr-h commented Aug 29, 2025

@tvm-bot rerun

Contributor

cbalint13 commented Aug 29, 2025

@Aristide021 ,

Very nice work, congratulations!

As a user browsing the contrib section with an interest in this area, could there be a simple README.md companion in python/tvm/relax/backend/contrib/example_npu with a basic description of the folder's contents? It could cover a summary, purpose, technical notes, and a diagram (not necessarily all of these), perhaps even the output of the examples.

E.g. the description given here in the header of the PR is very useful but users don't "immediately" read the originating PR.

Contributor

@mshr-h mshr-h left a comment

@Aristide021
Thank you for the PR! Overall looks good to me. Please fix the CI error.

Contributor

Please put the test under tests/python/contrib/.

Author

@mshr-h Thank you for the review! I've addressed both the CI error and the test file location in the latest commit. The tests should now pass and all lint checks are clean.

…cepts

This commit introduces a vendor-neutral NPU backend that demonstrates
architectural patterns common across Neural Processing Units.

The implementation covers key NPU concepts including multi-tier memory
hierarchy management, automatic tiling for large tensors, quantization
handling, and specialized execution engines. It shows how NPUs manage
memory across different tiers (L0/L1/L2/L3), tile operations to fit
in on-chip SRAM, and dispatch operations to dedicated compute units.

This serves as an educational template for developers creating NPU
backends, demonstrating BYOC integration while teaching NPU-specific
optimization strategies. Uses CPU emulation for testing without
requiring actual NPU hardware.

Addresses feedback from apache#18201 requesting generic NPU BYOC tutorials.
Author

@cbalint13 Thank you for the feedback! I've added a README.md in the latest commit with the context and documentation a user would need when implementing an NPU backend.

@Aristide021 Aristide021 force-pushed the contrib-npu-generic branch 2 times, most recently from 8965649 to fdc0fa3 Compare August 30, 2025 14:35
- Fix pylint broad exception catching warnings by adding specific disable comments
- Add proper exception handling for operators that may not be registered
- Move test file to tests/python/contrib/ directory as requested by reviewer
- Update test to only expect core patterns and check for available activation patterns
- Fix trailing whitespace formatting issue
- Create README with comprehensive documentation of all features

This addresses the CI lint failures and test failures reported in the PR review.
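
The handling of operators that may not be registered can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: `AVAILABLE_OPS`, `OperatorNotRegistered`, `make_pattern`, and `collect_patterns` are hypothetical stand-ins for the framework's operator registry and pattern table. The sketch catches one specific exception type, which is one way to avoid a broad `except`.

```python
class OperatorNotRegistered(LookupError):
    """Hypothetical error raised when an operator is missing from the registry."""


# Hypothetical stand-in for the set of operators the framework has registered.
AVAILABLE_OPS = {"matmul", "conv2d", "relu"}


def make_pattern(op_name):
    """Build a (pattern_name, op_name) pair, or fail if the op is unknown."""
    if op_name not in AVAILABLE_OPS:
        raise OperatorNotRegistered(op_name)
    return ("example_npu." + op_name, op_name)


def collect_patterns(core_ops, optional_ops):
    """Register core patterns strictly and optional patterns leniently."""
    # A missing core operator is a real error, so it is allowed to propagate.
    patterns = [make_pattern(op) for op in core_ops]
    skipped = []
    for op in optional_ops:
        try:
            patterns.append(make_pattern(op))
        except OperatorNotRegistered:
            # Only the expected failure is swallowed; other bugs still surface.
            skipped.append(op)
    return patterns, skipped
```

With this shape, a test can require the core patterns unconditionally and check optional activation patterns only when they were actually registered.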
Contributor

Please include instructions on how to enable the example NPU runtime and codegen.

# This imports the example module used in the tests. Importing the test
# module path directly works when running from the repo root (pytest does
# this automatically).
from tests.python.contrib.test_example_npu import MatmulReLU
Contributor

It's better to copy-and-paste the MatmulReLU definition instead of importing from test code so that users don't have to look around the codebase.

Member

I think it is worth moving this to our docs: https://github.com/apache/tvm/tree/main/docs/how_to/tutorials, not just a single README.md

Contributor

@mshr-h mshr-h Sep 1, 2025

Doesn't this have to be included in the CMake source list?
And I guess we need to define something like USE_EXAMPLE_NPU_RUNTIME and USE_EXAMPLE_NPU_CODEGEN in CMake.

- **Graceful degradation**: Continues operation when optional operators are unavailable
- **Comprehensive testing**: Validates both successful cases and error conditions

## Context
Contributor

@mshr-h mshr-h Sep 1, 2025

It would be better to place this chapter at the beginning of the text so we can understand the motivation behind the NPU integration.
