Support specifying input types in ONNX export script. #2084

Merged: csukuangfj merged 1 commit into k2-fsa:master from csukuangfj:zipformer-v1-streaming-qnn on Jun 1, 2026
@csukuangfj (Collaborator) commented Jun 1, 2026

The export script previously used int64_t for all integer inputs. This PR adds support for both int32_t and int64_t, so the integer type can be selected as needed.

Summary by CodeRabbit

  • New Features
    • Added a new command-line option to control integer input tensor data types during ONNX model export. When enabled, encoder and decoder models use 32-bit integers instead of 64-bit integers, providing an alternative configuration option that expands compatibility across different deployment scenarios.

@coderabbitai (Bot) commented Jun 1, 2026

Walkthrough

Added a CLI flag --use-int32-inputs to control whether encoder and decoder ONNX models use int32 or int64 for input tensors. Updated both export functions to accept this parameter, conditionally casting streaming state and decoder inputs accordingly, and wired the flag through main().

Changes

ONNX int32 input dtype flag

| Layer / File(s) | Summary |
|---|---|
| CLI flag definition (egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py) | The get_parser() function adds a --use-int32-inputs integer option to allow users to select int32 vs int64 tensor dtypes during export. |
| Encoder export int32 support (egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py) | The export_encoder_model_onnx() function accepts a use_int32_inputs parameter and conditionally converts encoder streaming state tensors from int64 to int32 when the flag is enabled. |
| Decoder export int32 support (egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py) | The export_decoder_model_onnx() function accepts a use_int32_inputs parameter and conditionally creates decoder input tensor y as int32 instead of int64 when the flag is enabled. |
| Main function integration (egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py) | The main() function retrieves params.use_int32_inputs from CLI arguments and passes it to both export_encoder_model_onnx() and export_decoder_model_onnx() to apply the selected integer dtype behavior. |
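
The dtype selection described in the rows above can be sketched roughly as follows. This is an illustration only, not the code from the PR: `make_decoder_dummy_input` and `cast_int_states` are hypothetical helper names, and the only behaviour taken from the PR is that integer tensors default to int64 and become int32 when `use_int32_inputs` is set.

```python
from typing import List

import torch


def make_decoder_dummy_input(context_size: int, use_int32_inputs: bool) -> torch.Tensor:
    """Build a dummy decoder input `y` for tracing (hypothetical helper)."""
    dtype = torch.int32 if use_int32_inputs else torch.int64
    return torch.zeros(1, context_size, dtype=dtype)


def cast_int_states(states: List[torch.Tensor], use_int32_inputs: bool) -> List[torch.Tensor]:
    """Cast int64 state tensors to int32 when requested (hypothetical helper).

    Tensors of any other dtype, such as float32, are returned unchanged.
    """
    if not use_int32_inputs:
        return states
    return [s.to(torch.int32) if s.dtype == torch.int64 else s for s in states]


y = make_decoder_dummy_input(context_size=2, use_int32_inputs=True)
states = cast_int_states(
    [torch.zeros(1, dtype=torch.int64), torch.zeros(1, 4, dtype=torch.float32)],
    use_int32_inputs=True,
)
print(y.dtype, [s.dtype for s in states])
```

Since the dtype of the example inputs passed to the exporter determines the input types recorded in the ONNX model, casting the dummy tensors before export is enough to change the model's input signature.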

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A flag to choose the way we cast,
Int32 swift, or int64 vast,
Through encoder streams and decoder gates,
This ONNX choice now resonates! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and specifically describes the main change: adding support for specifying input types in the ONNX export script via the new --use-int32-inputs flag. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request adds a --use-int32-inputs option to the ONNX export script to support exporting models with int32 input types instead of int64. The reviewer suggests using str2bool instead of int for the command-line argument and updating the function type hints to bool for consistency. Additionally, the reviewer points out that Decoder.forward explicitly casts inputs to int64, which may introduce unwanted Cast nodes in the exported ONNX graph and defeat the purpose of using int32 inputs.

Comment on lines +449 to +450
if use_int32_inputs:
    y = torch.zeros(1, context_size, dtype=torch.int32)

high

While setting y to dtype=torch.int32 changes the exported ONNX model's input type to int32, Decoder.forward in decoder.py explicitly casts y to torch.int64 via y = y.to(torch.int64). This will introduce a Cast to int64 node and subsequent int64 operations in the exported ONNX graph, which might defeat the purpose of using int32 inputs for hardware accelerators that do not support int64 data types. Consider updating Decoder.forward to avoid casting to int64 if int32 is preferred/supported by the embedding layer in the target ONNX runtime.
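
To make this concern concrete, here is a small sketch. `TinyDecoder` is a hypothetical stand-in, not the Decoder from decoder.py, and the `force_int64` switch exists only for this illustration. It shows that in eager PyTorch an embedding lookup accepts int32 indices directly, so the explicit cast is not required for correctness there. Whether the exported ONNX graph and the target runtime behave the same way is an assumption that would need to be checked on the real model.

```python
import torch
import torch.nn as nn


class TinyDecoder(nn.Module):
    """Hypothetical stand-in for a decoder; not the icefall Decoder."""

    def __init__(self, vocab_size: int = 10, dim: int = 4, force_int64: bool = True):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, dim)
        self.force_int64 = force_int64

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        if self.force_int64:
            # A dtype conversion like this is what the review expects to
            # show up as a Cast node once the model is exported.
            y = y.to(torch.int64)
        return self.embedding(y)


y = torch.zeros(1, 2, dtype=torch.int32)
with_cast = TinyDecoder(force_int64=True)
without_cast = TinyDecoder(force_int64=False)
# Share weights so the two variants are directly comparable.
without_cast.load_state_dict(with_cast.state_dict())

same = torch.equal(with_cast(y), without_cast(y))
print(same)
```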

Comment on lines +111 to +117
parser.add_argument(
    "--use-int32-inputs",
    type=int,
    default=0,
    help="""1 to use int32_t as input types if applicable. 0 to use
    int64_t otherwise.""",
)

medium

Use str2bool instead of int for --use-int32-inputs to be consistent with other boolean flags in the repository (like --use-averaged-model) and to provide a more user-friendly command-line interface.

Suggested change
-    parser.add_argument(
-        "--use-int32-inputs",
-        type=int,
-        default=0,
-        help="""1 to use int32_t as input types if applicable. 0 to use
-        int64_t otherwise.""",
-    )
+    parser.add_argument(
+        "--use-int32-inputs",
+        type=str2bool,
+        default=False,
+        help="""True to use int32_t as input types if applicable. False to use
+        int64_t otherwise.""",
+    )
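
For readers unfamiliar with the pattern, a boolean argparse type of this kind can be sketched as below. The `str2bool` defined here is a minimal stand-in written for this example; the helper actually used in the repository may accept different spellings or raise a different error.

```python
import argparse


def str2bool(v):
    """Minimal stand-in for a boolean argparse type (assumed behaviour)."""
    if isinstance(v, bool):
        return v
    if v.lower() in ("yes", "true", "t", "y", "1"):
        return True
    if v.lower() in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Boolean value expected, got {v!r}")


parser = argparse.ArgumentParser()
parser.add_argument("--use-int32-inputs", type=str2bool, default=False)

args = parser.parse_args(["--use-int32-inputs", "true"])
print(args.use_int32_inputs)
```

With this stand-in, existing invocations that pass 0 or 1 would keep working, because both strings are mapped to booleans.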

    encoder_filename: str,
    opset_version: int = 11,
    dynamic_batch: bool = True,
    use_int32_inputs: int = 0,

medium

Change the type hint of use_int32_inputs to bool to match the recommended str2bool argument type.

Suggested change
-    use_int32_inputs: int = 0,
+    use_int32_inputs: bool = False,

    decoder_filename: str,
    opset_version: int = 11,
    dynamic_batch: bool = True,
    use_int32_inputs: int = 0,

medium

Change the type hint of use_int32_inputs to bool to match the recommended str2bool argument type.

Suggested change
-    use_int32_inputs: int = 0,
+    use_int32_inputs: bool = False,

@coderabbitai (Bot) left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py (1)

427-452: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the decoder contract docs for the int32 path.

Lines 431-432 still say y is always torch.int64, but this branch now exports y as torch.int32 when the flag is enabled. That leaves the script documenting the wrong ONNX input type.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py`
around lines 427 - 452, The docstring for the decoder ONNX export incorrectly
states the input 'y' is torch.int64 even though the code conditionally creates
'y' as torch.int32 when use_int32_inputs is true; update the decoder contract
text in the function docstring to reflect that 'y' may be either torch.int32 or
torch.int64 depending on the use_int32_inputs flag (mention use_int32_inputs,
variable y, and decoder_model.decoder.context_size/context_size) so the
documented ONNX input type matches the actual exported type.
🧹 Nitpick comments (1)
egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py (1)

668-687: ⚡ Quick win

Encode the selected input dtype in the exported artifact names.

--use-int32-inputs changes the public ONNX input schema, but the filenames stay identical, so an int32 export overwrites the default int64 bundle and the saved artifacts are indistinguishable without opening the model.

Suggested fix
     suffix += f"-avg-{params.avg}"
     if params.use_averaged_model:
         suffix += "-with-averaged-model"
+    suffix += "-int32-inputs" if params.use_int32_inputs else "-int64-inputs"
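
The effect of that one-line change on the artifact names can be sketched as follows. The parameter values, the `dtype_suffix` helper, and the `encoder{suffix}.onnx` naming pattern are assumptions made for this example; the real script builds its suffix from more CLI options than shown here.

```python
from types import SimpleNamespace


def dtype_suffix(use_int32_inputs) -> str:
    """Return a filename marker for the selected integer input dtype (hypothetical)."""
    return "-int32-inputs" if use_int32_inputs else "-int64-inputs"


# Hypothetical parameter values; the real script derives them from the CLI.
params = SimpleNamespace(avg=1, use_averaged_model=True, use_int32_inputs=1)

suffix = f"-avg-{params.avg}"
if params.use_averaged_model:
    suffix += "-with-averaged-model"
suffix += dtype_suffix(params.use_int32_inputs)

encoder_filename = f"encoder{suffix}.onnx"  # assumed naming pattern
print(encoder_filename)
```

Always appending a marker also renames the default int64 artifacts, which may affect anything that expects the old filenames. Appending a marker only when the flag is enabled avoids that.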
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py`
around lines 668 - 687, The exported filenames don’t reflect the input dtype so
int32 exports can overwrite int64 ones; update the logic that builds
encoder_filename and decoder_filename (used when calling
export_encoder_model_onnx and export_decoder_model_onnx) to incorporate
params.use_int32_inputs into the suffix (for example append a "-int32" marker
when params.use_int32_inputs is true) so both encoder_filename and
decoder_filename uniquely encode the selected input dtype.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 397bddb5-6818-4a1c-8ae0-8cc032efd96a

📥 Commits

Reviewing files that changed from the base of the PR and between 88c30de and 23953b4.

📒 Files selected for processing (1)
  • egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py

Comment on lines +111 to +117
parser.add_argument(
    "--use-int32-inputs",
    type=int,
    default=0,
    help="""1 to use int32_t as input types if applicable. 0 to use
    int64_t otherwise.""",
)

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Restrict --use-int32-inputs to the documented values.

type=int accepts any non-zero value here, so --use-int32-inputs 2 silently exports the int32 variant even though the help text documents only 0 and 1.

Suggested fix
     parser.add_argument(
         "--use-int32-inputs",
         type=int,
         default=0,
+        choices=(0, 1),
         help="""1 to use int32_t as input types if applicable. 0 to use
         int64_t otherwise.""",
     )
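
A standalone sketch of how `choices` behaves, using only the standard library. The parser here is a cut-down illustration with the help text omitted, not the script's full get_parser().

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--use-int32-inputs", type=int, default=0, choices=(0, 1))

# A documented value parses normally.
ok = parser.parse_args(["--use-int32-inputs", "1"])

try:
    parser.parse_args(["--use-int32-inputs", "2"])
    rejected = False
except SystemExit:
    # argparse reports a usage error and exits for values outside `choices`.
    rejected = True

print(ok.use_int32_inputs, rejected)
```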
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx-zh.py`
around lines 111 - 117, The --use-int32-inputs argument currently uses
parser.add_argument with type=int which accepts any non-zero integer; restrict
it to only the documented values by adding choices=(0,1) (or an equivalent
validation) on the parser.add_argument call for "--use-int32-inputs" so only 0
or 1 are accepted and adjust the help text if needed to match the enforced
choices.

@csukuangfj csukuangfj merged commit bf478d3 into k2-fsa:master Jun 1, 2026
8 of 11 checks passed
@csukuangfj csukuangfj deleted the zipformer-v1-streaming-qnn branch June 1, 2026 06:24