Add QNN-compatible ONNX export for non-streaming zipformer transducer. #2088
Walkthrough
This PR extends ONNX export configuration for the Zipformer ASR model by adding CLI flags for max sequence length, x_lens retention, input integer width, dynamic axes support, and int8 quantization control. The encoder gains a simplified single-input forward2 path, and export functions conditionally apply new parameters and batch size settings. Main wiring and logging configuration complete the changes.
Code Review
This pull request introduces several command-line arguments to export-onnx.py to support exporting Zipformer models with static shapes, int32 inputs, optional dynamic axes, and optional int8 quantization. It also adds a forward2 helper method to handle exports without x_lens and updates the export functions for the encoder, decoder, and joiner. Feedback on these changes points out a typo in a help message, an incorrect docstring for forward2, a global side-effect risk from modifying class-level methods instead of instance-level methods, and a missing type annotation.
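For readers who want to see how an integer flag such as `--dynamic-axes` could switch between dynamic and static shapes, here is a minimal sketch. Only the flag name, its type and default, and the `"encoder_out_lens": {0: "N"}` entry come from the diff in this PR. The helper `build_dynamic_axes` and the `"x"` and `"encoder_out"` entries are assumptions made for illustration, not the actual code in export-onnx.py. The returned value is what one would typically pass as the `dynamic_axes` argument of `torch.onnx.export`, where `None` leaves every dimension fixed at the shape of the example input.

```python
import argparse
from typing import Dict, Optional


def get_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--dynamic-axes",
        type=int,
        default=1,
        help="1 to support dynamic axes. 0 to disable dynamic axes.",
    )
    return parser


def build_dynamic_axes(enabled: int) -> Optional[Dict[str, Dict[int, str]]]:
    """Hypothetical helper: return a dynamic_axes mapping, or None for static shapes."""
    if not enabled:
        # Static shapes: every dimension is fixed by the example input.
        return None
    return {
        "x": {0: "N", 1: "T"},  # assumed input name and axes
        "encoder_out": {0: "N", 1: "T"},  # assumed output name and axes
        "encoder_out_lens": {0: "N"},  # taken from the diff
    }


# Static export, as one would want for a fixed-shape NPU target.
static_args = get_parser().parse_args(["--dynamic-axes", "0"])
static_axes = build_dynamic_axes(static_args.dynamic_axes)

# Default behaviour keeps dynamic axes enabled.
default_args = get_parser().parse_args([])
default_axes = build_dynamic_axes(default_args.dynamic_axes)
```

Using an int flag rather than a boolean switch matches the style of the snippet under review; the sketch simply treats 0 as "off" and any other value as "on".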
"--dynamic-axes",
type=int,
default=1,
help="1 to support dynamic axes. 0 to diable dynamic axes",
"""Please see the help information of Zipformer.forward

Args:
  x:
    A 3-D tensor of shape (N, T, C)
  x_lens:
    A 1-D tensor of shape (N,). Its dtype is torch.int64
Returns:
  Return a tuple containing:
    - encoder_out, A 3-D tensor of shape (N, T', joiner_dim)
    - encoder_out_lens, A 1-D tensor of shape (N,)
"""
The docstring for forward2 is incorrect. It lists x_lens as an argument and claims to return a tuple of (encoder_out, encoder_out_lens), but the function only accepts x and returns encoder_out directly.
"""Please see the help information of Zipformer.forward
Args:
x:
A 3-D tensor of shape (N, T, C)
Returns:
Return encoder_out, A 3-D tensor of shape (N, T', joiner_dim)
"""
    "encoder_out_lens": {0: "N"},
}
else:
    encoder_model.__class__.forward = encoder_model.__class__.forward2
Modifying the class-level forward method (encoder_model.__class__.forward = ...) is a global side effect that affects all instances of OnnxEncoder (and any future uses of the class). Since encoder_model.forward2 is already a bound method on the instance, you can safely assign it directly to the instance's forward attribute instead.
- encoder_model.__class__.forward = encoder_model.__class__.forward2
+ encoder_model.forward = encoder_model.forward2
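To make the difference between the two assignments concrete, here is a small sketch in plain Python. `ToyEncoder` is a hypothetical stand-in for `OnnxEncoder` and is not the real class. It only mimics the relevant behaviour, namely that calling the object dispatches to `self.forward`.

```python
class ToyEncoder:
    """Hypothetical stand-in for OnnxEncoder; not the real class."""

    def forward(self, x, x_lens):
        return x, x_lens

    def forward2(self, x):
        return x

    def __call__(self, *args):
        # Like torch.nn.Module, dispatch through self.forward.
        return self.forward(*args)


a = ToyEncoder()
b = ToyEncoder()

# Instance-level patch: only `a` switches to the single-input path.
a.forward = a.forward2
instance_patch_a = a([1, 2, 3])
instance_patch_b = b([1, 2, 3], [3])  # b still takes (x, x_lens)

# Class-level patch: every instance without its own override is affected.
ToyEncoder.forward = ToyEncoder.forward2
c = ToyEncoder()
class_patch_c = c([1, 2, 3])

try:
    b([1, 2, 3], [3])
    b_accepts_two_inputs = True
except TypeError:
    # b now resolves forward to forward2, which takes a single input.
    b_accepts_two_inputs = False
```

In a one-shot export script the class-level patch may never cause visible trouble, but the instance-level assignment has the same effect on the exported model without changing the class for anything else in the process.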
use_int32_inputs,
dynamic_axes: int,
Use https://huggingface.co/reazon-research/reazonspeech-k2-v2 as an example.
You can find its PyTorch checkpoint, created from its ONNX model files, at https://huggingface.co/csukuangfj/reazonspeech-k2-v2/tree/main/checkpoint
The export produces 3 ONNX files that are suitable for conversion to QNN.
The exported files may also work with other types of NPU, but only Qualcomm NPUs have been verified.