[Backend][Relax] Add NPU BYOC backend example #18247
cc @mshr-h, could you help take a look?
8fcab1b to f58f54a
@tvm-bot rerun
Very nice work, congratulations! As a simple user who might have such interests, walking through the E.g. the description given here in the header of the PR is very useful but users don't "immediately" read the originating PR.
@Aristide021
Thank you for the PR! Overall looks good to me. Please fix the CI error.
Please put the test under tests/python/contrib/.
@mshr-h Thank you for the review! I've addressed both the CI error and the test file location in the latest commit. The tests should now pass and all lint checks are clean.
…cepts This commit introduces a vendor-neutral NPU backend that demonstrates architectural patterns common across Neural Processing Units. The implementation covers key NPU concepts including multi-tier memory hierarchy management, automatic tiling for large tensors, quantization handling, and specialized execution engines. It shows how NPUs manage memory across different tiers (L0/L1/L2/L3), tile operations to fit in on-chip SRAM, and dispatch operations to dedicated compute units. This serves as an educational template for developers creating NPU backends, demonstrating BYOC integration while teaching NPU-specific optimization strategies. Uses CPU emulation for testing without requiring actual NPU hardware. Addresses feedback from apache#18201 requesting generic NPU BYOC tutorials.
f58f54a to a8ddc87
@cbalint13 Thank you for the feedback! I've added a comprehensive README.md in the latest commit that
8965649 to fdc0fa3
- Fix pylint broad exception catching warnings by adding specific disable comments
- Add proper exception handling for operators that may not be registered
- Move test file to tests/python/contrib/ directory as requested by reviewer
- Update test to only expect core patterns and check for available activation patterns
- Fix trailing whitespace formatting issue
- Create README with comprehensive documentation of all features

This addresses the CI lint failures and test failures reported in the PR review.
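For readers following along, the handling of operators that may not be registered can be sketched roughly as below. This is a plain-Python illustration, not the code in this PR: the registry set, the `example_npu.*` pattern names, `OpNotRegisteredError`, `lookup_op`, and `build_patterns` are all hypothetical. It also swaps in a different technique from the commit, catching one specific exception instead of disabling the pylint broad-except warning.

```python
# Hypothetical sketch: none of these names are TVM APIs.


class OpNotRegisteredError(LookupError):
    """Raised when a pattern refers to an operator the registry lacks."""


# Patterns the backend cannot work without (hypothetical names).
CORE_PATTERNS = {
    "example_npu.matmul": "matmul",
    "example_npu.conv2d": "conv2d",
}

# Patterns that are nice to have but may be missing in some builds.
OPTIONAL_PATTERNS = {
    "example_npu.relu": "relu",
    "example_npu.gelu": "gelu",
}


def lookup_op(op_name, registry):
    """Return op_name if it is registered, else raise a specific exception."""
    if op_name not in registry:
        raise OpNotRegisteredError(op_name)
    return op_name


def build_patterns(registry):
    """Collect (pattern_name, op_name) pairs plus the optional patterns skipped."""
    patterns = []
    skipped = []
    # Core patterns: a missing operator is a real error, so let it propagate.
    for pattern_name, op_name in CORE_PATTERNS.items():
        patterns.append((pattern_name, lookup_op(op_name, registry)))
    # Optional patterns: catch only the specific error and keep going.
    for pattern_name, op_name in OPTIONAL_PATTERNS.items():
        try:
            patterns.append((pattern_name, lookup_op(op_name, registry)))
        except OpNotRegisteredError:
            skipped.append(pattern_name)
    return patterns, skipped


if __name__ == "__main__":
    available = {"matmul", "conv2d", "relu"}  # pretend gelu is not registered
    patterns, skipped = build_patterns(available)
    print([name for name, _ in patterns])
    print(skipped)
```

A test written against this shape can assert that the core patterns are always present and only check optional ones when they appear, which matches the intent of the test update listed above.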
fdc0fa3 to 10825bb
Please include instructions on how to enable the example NPU runtime and codegen.
# This imports the example module used in the tests. Importing the test
# module path directly works when running from the repo root (pytest does
# this automatically).
from tests.python.contrib.test_example_npu import MatmulReLU
It's better to copy-and-paste the MatmulReLU definition instead of importing from test code so that users don't have to look around the codebase.
I think it is worth moving this into our docs (https://github.com/apache/tvm/tree/main/docs/how_to/tutorials) rather than keeping it only in a README.md.
Doesn't this have to be included in the CMake source list?
And I guess we need to define something like USE_EXAMPLE_NPU_RUNTIME and USE_EXAMPLE_NPU_CODEGEN for CMake.
- **Graceful degradation**: Continues operation when optional operators are unavailable
- **Comprehensive testing**: Validates both successful cases and error conditions

## Context
It would be better to place this chapter at the beginning of the text so we can understand the motivation behind the NPU integration.
This commit introduces a vendor-neutral NPU backend that demonstrates architectural patterns common across Neural Processing Units.
The implementation covers key NPU concepts including multi-tier memory hierarchy management, automatic tiling for large tensors, quantization handling, and specialized execution engines. It shows how NPUs manage memory across different tiers (L0/L1/L2/L3), tile operations to fit in on-chip SRAM, and dispatch operations to dedicated compute units.
This serves as an educational template for developers creating NPU backends, demonstrating BYOC integration while teaching NPU-specific optimization strategies. It uses CPU emulation for testing, so no actual NPU hardware is required.
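To give a feel for the tiling idea mentioned above, here is a minimal plain-Python sketch of splitting a 2D tensor into tiles that fit an on-chip SRAM budget. It is an illustration under stated assumptions, not the implementation in this PR: the function names `tile_shape` and `iter_tiles`, the 64 KiB budget, and the near-square tile heuristic are all hypothetical choices.

```python
from math import isqrt


def tile_shape(rows, cols, bytes_per_elem, sram_bytes):
    """Pick a (tile_rows, tile_cols) shape whose elements fit in sram_bytes.

    Heuristic (an assumption for this sketch): start from a square tile and
    widen it when the tensor has fewer rows than the square edge.
    """
    max_elems = sram_bytes // bytes_per_elem
    if max_elems < 1:
        raise ValueError("SRAM budget is smaller than one element")
    edge = isqrt(max_elems)
    tile_rows = min(rows, edge)
    tile_cols = min(cols, max_elems // tile_rows)
    return tile_rows, tile_cols


def iter_tiles(rows, cols, tile_rows, tile_cols):
    """Yield (row_start, row_end, col_start, col_end) covering the tensor once.

    Edge tiles are clipped, so they may be smaller than the nominal shape.
    """
    for r in range(0, rows, tile_rows):
        for c in range(0, cols, tile_cols):
            yield r, min(r + tile_rows, rows), c, min(c + tile_cols, cols)


if __name__ == "__main__":
    # Hypothetical numbers: a 1024x1024 float32 tensor and a 64 KiB SRAM tier.
    rows, cols, elem_bytes, sram = 1024, 1024, 4, 64 * 1024
    t_rows, t_cols = tile_shape(rows, cols, elem_bytes, sram)
    tiles = list(iter_tiles(rows, cols, t_rows, t_cols))
    print(f"tile shape: {t_rows}x{t_cols}, number of tiles: {len(tiles)}")
```

A real backend would also account for the operand and output buffers of each operation and for the sizes of the different memory tiers; the sketch only shows the basic "does one tile fit" calculation.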
CC @tqchen - This addresses your feedback from #18201 regarding generic NPU BYOC tutorials.