feat(performance): migrate sentiment inference to ONNX runtime via Optimum #19

Open
kushagarwal2910-lang wants to merge 1 commit into ruxailab:main from kushagarwal2910-lang:edit

@kushagarwal2910-lang kushagarwal2910-lang commented Mar 2, 2026

Resolves #18

Context

To meet the scalability goals for upcoming developments, the API needs to process audio chunks faster without ballooning cloud compute costs. Native PyTorch execution on CPU is unoptimized for production throughput.

Changes Made

  • Added optimum[onnxruntime] to requirements.txt.
  • Replaced standard AutoModelForSequenceClassification with ORTModelForSequenceClassification in bertweet_model.py.
  • The export=True flag dynamically builds the ONNX graph on initialization, ensuring zero friction for other developers (no manual .pt to .onnx conversions needed).
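A minimal sketch of what the swap described above might look like. The function name `build_onnx_sentiment_pipeline` and the `model_id` parameter are illustrative assumptions, not the actual contents of bertweet_model.py; only `ORTModelForSequenceClassification`, `export=True`, and `pipeline()` come from this PR.

```python
def build_onnx_sentiment_pipeline(model_id: str):
    """Build a text-classification pipeline backed by ONNX Runtime.

    `model_id` is a placeholder for whichever checkpoint bertweet_model.py
    loads; the real identifier is not shown in this PR description.
    """
    # Imported lazily so the module can be imported without the heavy
    # dependencies (optimum[onnxruntime], transformers) installed.
    from optimum.onnxruntime import ORTModelForSequenceClassification
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # export=True converts the PyTorch checkpoint to an ONNX graph at load
    # time, so no manual conversion step is needed beforehand.
    model = ORTModelForSequenceClassification.from_pretrained(
        model_id, export=True
    )

    # The ORT model is passed to the standard transformers pipeline, so
    # callers keep using the same interface as before.
    return pipeline("text-classification", model=model, tokenizer=tokenizer)
```

Because the export runs on initialization, the first load is likely slower than later inference calls; caching the exported model on disk is one option if startup time matters.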

Impact

  • Latency: ONNX Runtime's graph optimizations are expected to speed up pipeline() execution for timestamped audio chunks (not yet benchmarked against the PyTorch baseline).
  • Compatibility: The input/output tuple and payload remain exactly the same. No frontend or API routing changes are required.
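One way to check the compatibility claim above is to compare the two backends on the same inputs. This is a sketch under assumptions: `outputs_match` and `score_tol` are hypothetical names, and each backend is assumed to be a callable that returns a list of `{"label": ..., "score": ...}` dicts per input text, as a transformers text-classification pipeline does.

```python
from typing import Callable, Dict, List, Sequence

# Assumed output shape of one prediction, e.g. {"label": "POS", "score": 0.98}
Prediction = Dict[str, object]
Predictor = Callable[[str], List[Prediction]]


def outputs_match(
    baseline: Predictor,
    candidate: Predictor,
    texts: Sequence[str],
    score_tol: float = 1e-3,
) -> bool:
    """Return True if both predictors agree on labels and nearly on scores."""
    for text in texts:
        old = baseline(text)
        new = candidate(text)
        if len(old) != len(new):
            return False
        for o, n in zip(old, new):
            # Labels must be identical; scores may differ slightly because
            # the ONNX graph can reorder floating-point operations.
            if o["label"] != n["label"]:
                return False
            if abs(float(o["score"]) - float(n["score"])) > score_tol:
                return False
    return True
```

The tolerance is a judgment call: exact equality of scores across runtimes is usually too strict, while label agreement is the property the API payload depends on.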

I ran the standard tests locally and they all pass, with confidence scores in the expected range. Let me know if you'd like me to benchmark this against the old PyTorch implementation for the official documentation!
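A benchmark of that kind could be as simple as the sketch below. The helper `measure_latency` and its `warmup` and `repeats` parameters are hypothetical; it times any callable, so the same function could be applied to both the PyTorch and the ONNX pipeline.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def measure_latency(
    predict: Callable[[str], object],
    texts: Sequence[str],
    warmup: int = 2,
    repeats: int = 5,
) -> Dict[str, float]:
    """Return mean and median per-text latency in seconds."""
    if not texts:
        raise ValueError("texts must not be empty")

    # Warm-up passes absorb one-time costs (lazy initialization, caches)
    # so they do not distort the measurements.
    for _ in range(warmup):
        for text in texts:
            predict(text)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            predict(text)
        elapsed = time.perf_counter() - start
        samples.append(elapsed / len(texts))

    return {
        "mean_s": statistics.mean(samples),
        "median_s": statistics.median(samples),
    }
```

Running it on the same machine, with the same texts, for both backends keeps the comparison fair; representative transcript chunks would make better input than synthetic strings.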

kushagarwal2910-lang commented Mar 7, 2026

Hey @marcgc21 and @BasmaElhoseny01

Please review this PR and, if possible, give me your valuable feedback.


Development

Successfully merging this pull request may close these issues.

Performance: Accelerate BERTweet CPU Inference via ONNX Runtime & Hugging Face Optimum
