Add QNN-compatible ONNX export for non-streaming zipformer transducer. by csukuangfj · Pull Request #2088 · k2-fsa/icefall

csukuangfj · 2026-06-11T05:36:48Z

The following uses https://huggingface.co/reazon-research/reazonspeech-k2-v2 as an example.

Its PyTorch checkpoint, created from its ONNX model files, is available at https://huggingface.co/csukuangfj/reazonspeech-k2-v2/tree/main/checkpoint

  ./zipformer/export-onnx.py \
    --enable-int8-quantization 0 \
    --max-len 1000 \
    --keep-x-lens 0 \
    --use-int32-inputs 1 \
    --dynamic-axes 0 \
    --epoch 99 \
    --avg 1 \
    --use-averaged-model 0 \
    --exp-dir ./reazonspeech-k2-v2/checkpoint \
    --tokens ./reazonspeech-k2-v2/tokens.txt \
    \
    --num-encoder-layers 2,2,4,5,4,2 \
    --feedforward-dim 512,768,1536,2048,1536,768 \
    --encoder-dim 192,256,512,768,512,256 \
    --encoder-unmasked-dim 192,192,256,320,256,192

This produces three ONNX files (encoder, decoder, and joiner) that are suitable for conversion to QNN.
They may also work with other types of NPU, but only Qualcomm NPUs have been verified.
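With `--dynamic-axes 0` and `--max-len 1000`, the export uses static shapes, so a caller presumably has to feed the encoder a fixed number of frames. The sketch below illustrates that idea in plain Python. The helper name `pad_or_truncate`, the zero-padding strategy, and the 80-dimensional features are assumptions for illustration and are not taken from the PR.

```python
from typing import List

MAX_LEN = 1000  # mirrors --max-len in the command above


def pad_or_truncate(
    features: List[List[float]], max_len: int = MAX_LEN
) -> List[List[float]]:
    """Return exactly max_len frames (hypothetical helper).

    Long inputs are truncated; short inputs are padded with zero frames.
    Zero padding is an assumption, not something the PR specifies.
    """
    if not features:
        raise ValueError("features must contain at least one frame")
    dim = len(features[0])
    fixed = [list(frame) for frame in features[:max_len]]
    # range() is empty when the input already has max_len frames.
    fixed.extend([0.0] * dim for _ in range(max_len - len(fixed)))
    return fixed


# 250 frames of 80-dim features (the feature dimension is an assumption).
frames = [[0.1] * 80 for _ in range(250)]
fixed = pad_or_truncate(frames)
print(len(fixed), len(fixed[0]))
```

Whether the real pipeline pads, truncates, or chunks long utterances depends on the runtime that consumes the exported files.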

Summary by CodeRabbit

Release Notes

New Features
- Enhanced ONNX export configuration with new options for sequence length, input data types, and quantization control
- Added flexible encoder export variants with configurable input/output signatures
Chores
- Adjusted ONNX Runtime logging verbosity for improved runtime control

coderabbitai · 2026-06-11T05:37:01Z

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between 27e14b3 and 535b1d6.

📒 Files selected for processing (2)

egs/librispeech/ASR/zipformer/export-onnx.py
egs/librispeech/ASR/zipformer/onnx_pretrained.py

📝 Walkthrough

This PR extends ONNX export configuration for the Zipformer ASR model by adding CLI flags for max sequence length, x_lens retention, input integer width, dynamic axes support, and int8 quantization control. The encoder gains a simplified single-input forward2 path, and export functions conditionally apply new parameters and batch size settings. Main wiring and logging configuration complete the changes.
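As a rough illustration of the flags listed above, here is a minimal `argparse` sketch. Only the `--dynamic-axes` default of 1 and its `type=int` appear in the reviewed diff; the other defaults, the integer types, and the help strings are placeholders chosen for this example, and `get_parser` here is not the script's actual function.

```python
import argparse


def get_parser() -> argparse.ArgumentParser:
    """Sketch of the new export flags; defaults other than --dynamic-axes are guesses."""
    parser = argparse.ArgumentParser(description="Sketch of the new export flags")
    parser.add_argument("--max-len", type=int, default=1000,
                        help="Maximum number of input frames (default is assumed)")
    parser.add_argument("--keep-x-lens", type=int, default=1,
                        help="1 to keep x_lens as an encoder input (default is assumed)")
    parser.add_argument("--use-int32-inputs", type=int, default=0,
                        help="1 to use int32 integer inputs (default is assumed)")
    parser.add_argument("--dynamic-axes", type=int, default=1,
                        help="1 to support dynamic axes. 0 to disable dynamic axes")
    parser.add_argument("--enable-int8-quantization", type=int, default=1,
                        help="1 to also export int8 models (default is assumed)")
    return parser


# The flag values used in the PR description's example command.
args = get_parser().parse_args([
    "--enable-int8-quantization", "0",
    "--max-len", "1000",
    "--keep-x-lens", "0",
    "--use-int32-inputs", "1",
    "--dynamic-axes", "0",
])
print(args)
```

`argparse` maps `--max-len` to `args.max_len`, which matches the `max_len` parameter name mentioned in the walkthrough.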

Changes

ONNX Export Configuration and Runtime Logging

| Layer / File(s) | Summary |
| --- | --- |
| CLI arguments and encoder forward2 variant `egs/librispeech/ASR/zipformer/export-onnx.py` | New CLI arguments (`--max-len`, `--keep-x-lens`, `--use-int32-inputs`, `--dynamic-axes`, `--enable-int8-quantization`) are added to the argument parser. OnnxEncoder gains a forward2 method that provides a simplified single-input, single-output inference path for batch-size-1 traces by deriving x_lens internally and returning only encoder_out. |
| Encoder export with conditional forward/forward2 selection `egs/librispeech/ASR/zipformer/export-onnx.py` | export_encoder_model_onnx signature expands to accept max_len, dynamic_axes, use_int32_inputs, and keep_x_lens. The implementation conditionally selects between the original forward signature (with x_lens and encoder_out_lens outputs) and the new forward2 signature (single encoder_out output) based on keep_x_lens. x_lens dtype and dynamic_axes are controlled by the new parameters. |
| Decoder and Joiner export configuration updates `egs/librispeech/ASR/zipformer/export-onnx.py` | export_decoder_model_onnx and export_joiner_model_onnx signatures accept use_int32_inputs and dynamic_axes. Decoder dummy input y is conditionally created as torch.int32 or torch.int64. Both functions conditionally pass dynamic_axes to ONNX export. Joiner export batch size for dummy inputs is reduced from 11 to 1. |
| Main function parameter wiring and int8 quantization guard `egs/librispeech/ASR/zipformer/export-onnx.py` | main() wires new CLI parameters into the three export functions (encoder, decoder, joiner). An early return guard skips int8 quantization export logic when enable_int8_quantization is false. |
| ONNX Runtime logging severity configuration `egs/librispeech/ASR/zipformer/onnx_pretrained.py` | `OnnxModel.__init__` now explicitly sets session_opts.log_severity_level = 3 to control ONNX Runtime logging verbosity. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

k2-fsa/icefall#2084: Adds similar --use-int32-inputs export control and int32/int64 casting logic for encoder and decoder integer inputs in a different ASR variant's export script.
k2-fsa/icefall#2086: Adjusts Zipformer ONNX export masking and input typing to use torch.int32 pathways, overlapping directly with this PR's int32 input handling.

Poem

🐰 Hops through exports with flags held high,
forward2 simplifies the path nearby,
Dynamic axes dance, int32 takes flight,
While logging whispers at proper height.
Batches shrink small, the config grows bright! ✨

gemini-code-assist

Code Review

This pull request introduces several command-line arguments to export-onnx.py to support exporting Zipformer models with static shapes, int32 inputs, optional dynamic axes, and optional int8 quantization. It also adds a forward2 helper method to handle exports without x_lens and updates the export functions for the encoder, decoder, and joiner. Feedback on these changes points out a typo in a help message, an incorrect docstring for forward2, a global side-effect risk from modifying class-level methods instead of instance-level methods, and a missing type annotation.

gemini-code-assist · 2026-06-11T05:38:07Z

+        "--dynamic-axes",
+        type=int,
+        default=1,
+        help="1 to support dynamic axes. 0 to diable dynamic axes",


There is a typo in the help message for --dynamic-axes: diable should be disable.

Suggested change

-        help="1 to support dynamic axes. 0 to diable dynamic axes",
+        help="1 to support dynamic axes. 0 to disable dynamic axes",

gemini-code-assist · 2026-06-11T05:38:07Z

+        """Please see the help information of Zipformer.forward
+
+        Args:
+          x:
+            A 3-D tensor of shape (N, T, C)
+          x_lens:
+            A 1-D tensor of shape (N,). Its dtype is torch.int64
+        Returns:
+          Return a tuple containing:
+            - encoder_out, A 3-D tensor of shape (N, T', joiner_dim)
+            - encoder_out_lens, A 1-D tensor of shape (N,)
+        """


The docstring for forward2 is incorrect. It lists x_lens as an argument and claims to return a tuple of (encoder_out, encoder_out_lens), but the function only accepts x and returns encoder_out directly.

        """Please see the help information of Zipformer.forward

        Args:
          x:
            A 3-D tensor of shape (N, T, C)
        Returns:
          Return encoder_out, A 3-D tensor of shape (N, T', joiner_dim)
        """

gemini-code-assist · 2026-06-11T05:38:07Z

+            "encoder_out_lens": {0: "N"},
+        }
+    else:
+        encoder_model.__class__.forward = encoder_model.__class__.forward2


Modifying the class-level forward method (encoder_model.__class__.forward = ...) is a global side effect that affects all instances of OnnxEncoder (and any future uses of the class). Since encoder_model.forward2 is already a bound method on the instance, you can safely assign it directly to the instance's forward attribute instead.

Suggested change

-        encoder_model.__class__.forward = encoder_model.__class__.forward2
+        encoder_model.forward = encoder_model.forward2
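The difference between the two assignments can be seen in a small plain-Python sketch. `ToyEncoder` and `PatchedEncoder` are hypothetical stand-ins and not the real `OnnxEncoder`; lists stand in for tensors, and the doubling in `_encode` is just placeholder computation. The same attribute-lookup rules should apply to a `torch.nn.Module`, but that is not exercised here.

```python
class ToyEncoder:
    """Stand-in for an encoder with two forward variants (hypothetical)."""

    def _encode(self, x, x_lens):
        # Placeholder computation: double every value.
        return [[v * 2 for v in row] for row in x], list(x_lens)

    def forward(self, x, x_lens):
        return self._encode(x, x_lens)

    def forward2(self, x):
        # Derive the length internally and return only the output.
        x_lens = [len(x[0])]
        out, _ = self._encode(x, x_lens)
        return out


# Instance-level patch: only `a` is affected.
a, b = ToyEncoder(), ToyEncoder()
a.forward = a.forward2
a_out = a.forward([[1, 2, 3]])       # single-input path
b_out = b.forward([[1, 2, 3]], [3])  # original two-input path still works


# Class-level patch: every instance of the class is affected.
# A subclass is used only so this demo leaves ToyEncoder itself untouched.
class PatchedEncoder(ToyEncoder):
    pass


c, d = PatchedEncoder(), PatchedEncoder()
c.__class__.forward = c.__class__.forward2
d_out = d.forward([[1, 2, 3]])       # `d` was never patched directly

try:
    d.forward([[1, 2, 3]], [3])
    two_arg_call_failed = False
except TypeError:
    two_arg_call_failed = True

print(a_out, b_out, d_out, two_arg_call_failed)
```

In a one-shot export script the class-level patch may be harmless in practice, but the instance-level form keeps the change local to the object being exported.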

gemini-code-assist · 2026-06-11T05:38:07Z

+    use_int32_inputs,
+    dynamic_axes: int,


The parameter use_int32_inputs is missing a type annotation. Please add : int to keep it consistent with the other parameters in this function.

Suggested change

-    use_int32_inputs,
+    use_int32_inputs: int,
     dynamic_axes: int,

Add QNN-compatible ONNX export for non-streaming zipformer transducer.

535b1d6

csukuangfj merged commit 36e0420 into k2-fsa:master Jun 11, 2026
91 of 128 checks passed

csukuangfj deleted the zipformer-export-qnn branch June 11, 2026 05:37
