[Whisper] Add labels' in the whisper output #2252

wu6u3tw · 2025-07-14T06:44:58Z

The raw text output from whisper-large-v3 includes numbers, and the normalization step is part of the accuracy_eval script.
Therefore, so that the digit portions of the output are covered by the label dict, I added digits and some symbols to the labels.
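
For readers unfamiliar with the label dict, here is a minimal sketch of why the label set matters. It assumes a character-level encoding in which any character missing from the labels is dropped; `BASE_LABELS`, `EXTENDED_LABELS`, and the helper functions are hypothetical names for illustration, not the actual contents of accuracy_eval.py.

```python
import string

# Hypothetical baseline label set: space, lowercase letters, apostrophe.
BASE_LABELS = [" "] + list(string.ascii_lowercase) + ["'"]
# Hypothetical extended set: baseline plus digits and the dollar sign.
EXTENDED_LABELS = BASE_LABELS + list(string.digits) + ["$"]


def build_label_dict(labels):
    """Map each label character to its index in the list."""
    return {ch: i for i, ch in enumerate(labels)}


def encode(text, label_dict):
    """Encode text as label indices, silently dropping unknown characters."""
    return [label_dict[ch] for ch in text.lower() if ch in label_dict]


def decode(ids, labels):
    """Turn label indices back into text."""
    return "".join(labels[i] for i in ids)


def round_trip(text, labels):
    """Encode and decode text with the given label set."""
    return decode(encode(text, build_label_dict(labels)), labels)


text = "It costs $42"
print(repr(round_trip(text, BASE_LABELS)))      # digits and "$" are lost
print(repr(round_trip(text, EXTENDED_LABELS)))  # digits and "$" survive
```

Under this assumption, digits never reach the normalizer unless they are in the label set, which is the gap this change is meant to close.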

github-actions · 2025-07-14T06:45:06Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

wu6u3tw · 2025-07-14T17:52:27Z

recheck

keithachorn-intel · 2025-07-15T07:27:34Z

I'm not sure this will align with the text normalization elsewhere in the reference. Numerical values were previously expanded to full words. Will have to check how the OpenAI normalizer handles numeric values.
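
To illustrate the alignment concern, here is a rough sketch of how a digit in the hypothesis can count as an error against a spelled-out reference unless both sides pass through the same normalization. `toy_normalize` and `TOY_NUMBER_WORDS` are hypothetical stand-ins and do not reproduce the OpenAI normalizer's actual behavior.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / max(len(ref), 1)


# Hypothetical toy mapping; a real normalizer covers far more cases.
TOY_NUMBER_WORDS = {"two": "2", "three": "3"}


def toy_normalize(text):
    """Lowercase the text and replace known number words with digits."""
    return " ".join(TOY_NUMBER_WORDS.get(w, w) for w in text.lower().split())


reference = "i bought three apples"   # reference with the number spelled out
hypothesis = "i bought 3 apples"      # model output with a digit

wer_raw = word_error_rate(reference, hypothesis)
wer_normalized = word_error_rate(toy_normalize(reference),
                                 toy_normalize(hypothesis))
print(wer_raw, wer_normalized)
```

In this toy case the mismatch disappears once both strings are normalized the same way, so the open question is whether the real normalizer treats digits and number words consistently.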

hanyunfan · 2025-07-18T16:21:27Z

I'm not sure this will align with the text normalization elsewhere in the reference. Numerical values were previously expanded to full words. Will have to check how the OpenAI normalizer handles numeric values.

@wu6u3tw Could you review Keith’s question and provide additional context or a possible response?

keithachorn-intel · 2025-10-01T16:15:13Z

speech2text/accuracy_eval.py

+    "$",
+    "¢",
+    "£",
+    "€", 


Per the TF meeting, let's remove the euro (€), pound (£), and cent (¢) symbols, since they interact strangely with the Vim editor and don't appear to affect measured accuracy. Also, let's replicate this change in the reference_SUT.py file.
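
As a side observation (not the TF's stated rationale), the three symbols being removed are exactly the non-ASCII entries among the four shown in the diff. A small sketch of that check follows; `CANDIDATE_SYMBOLS` and `ascii_only` are hypothetical names, not code from accuracy_eval.py or reference_SUT.py.

```python
# The four symbols shown in the diff above.
CANDIDATE_SYMBOLS = ["$", "¢", "£", "€"]


def ascii_only(symbols):
    """Keep only symbols made up entirely of ASCII characters."""
    return [s for s in symbols if s.isascii()]


kept = ascii_only(CANDIDATE_SYMBOLS)
removed = [s for s in CANDIDATE_SYMBOLS if s not in kept]
print("kept:", kept)
print("removed:", removed)
```

Applying the same filter to the label lists in both files would be one way to keep them in sync, assuming they are meant to stay identical.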

Done: removed the euro, cent, and pound signs.

Re-reviewed and approved. Thanks!

keithachorn-intel

Reviewed and approved by Speech-to-text TF.

keithachorn-intel · 2025-10-01T18:50:13Z

@pgmpablo157321 - This PR is approved per the TF discussion today. I think we can close it before the next WG sync (no need for wider input).

hanyunfan

LGTM

wu6u3tw requested a review from a team as a code owner July 14, 2025 06:44

wu6u3tw force-pushed the dev-tinyinl-add_labels_in_accuracy_eval_whisper branch from bf35648 to b0faf8a Compare July 14, 2025 06:46

wu6u3tw changed the title ~~[Whisper] Add labels' in the whisper output~~ [Draft] [Whisper] Add labels' in the whisper output Jul 15, 2025

wu6u3tw mentioned this pull request Jul 15, 2025

[Whisper] Conflict of logic in accuracy_eval.py #2258

Open

keithachorn-intel reviewed Oct 1, 2025

wu6u3tw added 2 commits October 1, 2025 10:05

add numbers into labels

502f51b

add

remove euro, cent...etc

a51a425

wu6u3tw force-pushed the dev-tinyinl-add_labels_in_accuracy_eval_whisper branch from b0faf8a to a51a425 Compare October 1, 2025 18:45

wu6u3tw changed the title ~~[Draft] [Whisper] Add labels' in the whisper output~~ [Whisper] Add labels' in the whisper output Oct 1, 2025

keithachorn-intel reviewed Oct 1, 2025

Merge branch 'master' into dev-tinyinl-add_labels_in_accuracy_eval_whisper

37b1025

hanyunfan approved these changes Oct 13, 2025

hanyunfan merged commit 0b8ca03 into mlcommons:master Oct 13, 2025
29 checks passed

github-actions bot locked and limited conversation to collaborators Oct 13, 2025
