[CLI][Suggestion] Adding flags for evals to return only failed and print to output file #678

jottakka · 2025-11-05T18:41:37Z

Adds two flags to arcade evals:
- -f or --failed-only: show only failed evals
- -o or --output: write the results to an output text file
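
To make the intended flag behaviour concrete, here is a minimal sketch using the standard-library `argparse` as a stand-in. It is not the arcade CLI's actual option declaration; the parser name `evals-sketch`, the `build_parser` helper, and the example path are hypothetical. Only the flag names come from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in parser; the real CLI may declare its options differently."""
    parser = argparse.ArgumentParser(prog="evals-sketch")
    # Boolean switch: when present, only failed evals should be shown.
    parser.add_argument(
        "-f", "--failed-only", action="store_true", help="show only failed evals"
    )
    # Optional path: when given, results are also written to this text file.
    parser.add_argument(
        "-o", "--output", metavar="FILE", default=None, help="write results to a text file"
    )
    return parser


# Both flags supplied (the path is an arbitrary example).
args = build_parser().parse_args(["-f", "-o", "results/evals.txt"])

# No flags supplied: filtering is off and nothing is written to disk.
defaults = build_parser().parse_args([])
```

With no flags, the sketch shows every result and writes nothing to disk, which keeps both options purely opt-in.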

Note

Adds failed-only filtering and file output to arcade evals, refactors result display to support console/file with accurate summaries, and adds tests; bumps version to 1.6.0.

CLI evals:
- Add flags --failed-only,-f and --output,-o.
- Filter results via utils.filter_failed_evaluations, passing original counts to display.
Display:
- Extract _display_results_to_console; support failed_only disclaimer and original-count summary.
- Add output_file writing (creates parent dirs) with same formatted output as console.
Utils:
- New filter_failed_evaluations(all_evaluations) returning filtered data and (total, passed, failed, warned).
Tests:
- Add libs/tests/cli/test_display.py and libs/tests/cli/test_main_evals.py covering display details, failed-only mode, file output, and filtering.
Version:
- Bump project version to 1.6.0.
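
The filtering and file-output steps summarised above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: the shape of `all_evaluations` (a list of suites, each with a `"cases"` list whose items carry a `"status"` of `"passed"`, `"failed"`, or `"warned"`) is assumed, and `write_report` is a hypothetical helper name. Only the name `filter_failed_evaluations`, its `(total, passed, failed, warned)` return, and the parent-directory creation come from the summary.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any


def filter_failed_evaluations(
    all_evaluations: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], tuple[int, int, int, int]]:
    """Keep only failed cases and return the counts taken before filtering.

    Assumed input shape (hypothetical): [{"name": ..., "cases": [{"status": ...}]}].
    """
    total = passed = failed = warned = 0
    filtered: list[dict[str, Any]] = []
    for suite in all_evaluations:
        failed_cases = []
        for case in suite["cases"]:
            total += 1
            status = case["status"]
            if status == "passed":
                passed += 1
            elif status == "warned":
                warned += 1
            else:
                # Any other status is counted as a failure in this sketch.
                failed += 1
                failed_cases.append(case)
        if failed_cases:
            # Copy the suite so the caller's data is not mutated.
            filtered.append({**suite, "cases": failed_cases})
    return filtered, (total, passed, failed, warned)


def write_report(text: str, output_file: str) -> Path:
    """Write the formatted report, creating missing parent directories first."""
    path = Path(output_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


# Example data in the assumed shape.
evaluations = [
    {"name": "suite_a", "cases": [{"name": "c1", "status": "passed"},
                                  {"name": "c2", "status": "failed"}]},
    {"name": "suite_b", "cases": [{"name": "c3", "status": "warned"}]},
]
failed_only, counts = filter_failed_evaluations(evaluations)
```

Returning the original counts alongside the filtered data is what lets the summary line report totals for the whole run even when only failures are displayed.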

Written by Cursor Bugbot for commit baad441. This will update automatically on new commits.

codecov · 2025-11-05T18:43:06Z

Codecov Report

❌ Patch coverage is 93.75000% with 4 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
libs/arcade-cli/arcade_cli/main.py	0.00%	4 Missing ⚠️

jottakka · 2025-12-09T01:27:01Z

Moved these changes to #689.

Add --failed-only and --output flags to evals command

fb35d7b

jottakka self-assigned this Nov 5, 2025

Francisco Liberal added 3 commits November 5, 2025 15:57

Add tests for display_eval_results with --failed-only and --output flags

c11a908

Add additional test cases for better coverage of display_eval_results

19199e7

Extract filtering logic to testable function and add tests

22d0ecb

jottakka requested a review from EricGustin November 5, 2025 20:56

jottakka and others added 4 commits November 13, 2025 21:33

Merge branch 'main' into francisco/arcade-cli/updating-evals-to-show-…

9bbb691

…only-failed

Merge branch 'main' into francisco/arcade-cli/updating-evals-to-show-…

c293f18

…only-failed

Merge branch 'main' into francisco/arcade-cli/updating-evals-to-show-…

8c5a096

…only-failed

Merge branch 'main' into francisco/arcade-cli/updating-evals-to-show-…

baad441

…only-failed

jottakka closed this Dec 9, 2025

jottakka deleted the francisco/arcade-cli/updating-evals-to-show-only-failed branch December 20, 2025 03:41
