Contributor

@jottakka jottakka commented Nov 5, 2025

Adds two flags to arcade evals:
-f or --failed-only: show only failed evaluations
-o or --output: write the results to a text file


Note

Adds failed-only filtering and file output to arcade evals, refactors result display to support console/file with accurate summaries, and adds tests; bumps version to 1.6.0.

  • CLI evals:
    • Add flags --failed-only,-f and --output,-o.
    • Filter results via utils.filter_failed_evaluations, passing original counts to display.
  • Display:
    • Extract _display_results_to_console; support failed_only disclaimer and original-count summary.
    • Add output_file writing (creates parent dirs) with same formatted output as console.
  • Utils:
    • New filter_failed_evaluations(all_evaluations) returning filtered data and (total, passed, failed, warned).
  • Tests:
    • Add libs/tests/cli/test_display.py and libs/tests/cli/test_main_evals.py covering display details, failed-only mode, file output, and filtering.
  • Version:
    • Bump project version to 1.6.0.

Written by Cursor Bugbot for commit baad441.
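
The filtering and file-output behavior summarized above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the result shape (a list of suites, each with a "cases" list whose entries carry a "status" of "passed", "failed", or "warned") and the helper name write_report are assumptions made for the example. Only the name filter_failed_evaluations and its return shape (filtered data plus a (total, passed, failed, warned) tuple) come from the summary.

```python
from pathlib import Path
from typing import Any


def filter_failed_evaluations(
    all_evaluations: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], tuple[int, int, int, int]]:
    """Keep only failed cases and report the counts from before filtering.

    Assumed (hypothetical) input shape: each suite is a dict with a "cases"
    list, and each case has a "status" of "passed", "failed", or "warned".
    """
    total = passed = failed = warned = 0
    filtered: list[dict[str, Any]] = []

    for suite in all_evaluations:
        failed_cases = []
        for case in suite["cases"]:
            total += 1
            status = case["status"]
            if status == "passed":
                passed += 1
            elif status == "warned":
                warned += 1
            else:
                failed += 1
                failed_cases.append(case)
        # Drop suites that have no failed cases at all.
        if failed_cases:
            filtered.append({**suite, "cases": failed_cases})

    return filtered, (total, passed, failed, warned)


def write_report(text: str, output_file: str) -> Path:
    """Write the formatted report, creating parent directories if needed.

    write_report is a hypothetical helper name used only in this sketch.
    """
    path = Path(output_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path
```

Returning the original counts alongside the filtered data lets the display layer print an accurate summary (for example, how many evaluations passed) even when only the failures are shown.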

@jottakka jottakka self-assigned this Nov 5, 2025

codecov bot commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 93.75000% with 4 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
libs/arcade-cli/arcade_cli/main.py | 0.00% | 4 Missing ⚠️


@jottakka jottakka requested a review from EricGustin November 5, 2025 20:56
@jottakka jottakka closed this Dec 9, 2025
Contributor Author

jottakka commented Dec 9, 2025

Moved changes to here: #689

@jottakka jottakka deleted the francisco/arcade-cli/updating-evals-to-show-only-failed branch December 20, 2025 03:41