fix(cli): allow evaluation-only invocation without dummy seed path by shuofengzhang · Pull Request #27 · skydiscover-ai/skydiscover

shuofengzhang · 2026-03-15T18:44:40Z

What changed

Reworked CLI positional parsing to accept either:
- skydiscover-run <evaluation_file> (evaluation-only mode)
- skydiscover-run <initial_program> <evaluation_file> (seeded mode)
Replaced the ambiguous positional pair (initial_program? + required evaluation_file) with a single positional list and explicit normalization logic.
Added targeted tests in tests/test_cli_positional_paths.py for:
- single-path invocation
- two-path invocation
- rejection of invalid 3+ positional paths

Why

The previous argparse shape made the documented optional-initial-program workflow fail in normal usage: if users passed only one positional path, argparse bound it to initial_program and still required evaluation_file.
This produced a confusing UX where an "optional" positional argument could not actually be omitted.

Insight / Why this matters

Root cause: optional positional arguments that appear before a required positional are not skippable in typical CLI syntax. Argparse greedily assigns the first positional token to the first positional argument.
Why easy to miss: the parser declaration looked logically correct (nargs='?' for initial program + required evaluation file), but argparse semantics make this pattern brittle.
Tradeoff: parsing now performs an explicit cardinality check (1 or 2 positional paths) instead of relying on implicit positional binding.
Impact: users can reliably run evaluation-only workflows without dummy arguments or parser workarounds, reducing startup friction for benchmark/evaluator-centric runs.

Testing

✅ source .venv/bin/activate && pytest -q
- 14 passed in 0.59s

gemini-code-assist · 2026-03-15T18:44:58Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical usability issue in the command-line interface by overhauling how positional arguments are parsed. The change allows for more intuitive and reliable invocation of the tool, particularly for evaluation-only workflows, by moving from an ambiguous optional argument setup to explicit path handling. Additionally, it enhances the robustness of checkpoint discovery, contributing to overall system stability.

Highlights

CLI Positional Argument Parsing Reworked: The command-line interface's positional argument parsing has been refactored to explicitly handle one or two paths, resolving an issue where an optional initial program could not be omitted due to argparse's greedy behavior.
Flexible Invocation Modes: Users can now reliably invoke the CLI in two modes: skydiscover-run <evaluation_file> for evaluation-only, or skydiscover-run <initial_program> <evaluation_file> for seeded runs, without needing dummy arguments.
Robust Checkpoint Discovery: The internal _find_latest_checkpoint utility has been improved to more robustly identify the latest checkpoint directory by correctly parsing iteration numbers and ignoring non-numeric directory names.
Comprehensive Testing: New, targeted tests have been added for both the CLI's positional path parsing logic and the _find_latest_checkpoint function, ensuring the new behaviors are correctly implemented and maintained.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog

skydiscover/cli.py
- Replaced separate initial_program and evaluation_file positional arguments with a single paths argument using nargs='+'.
- Implemented custom logic to interpret the paths argument, assigning values to initial_program (optional) and evaluation_file based on the number of provided paths.
- Added error handling for invalid numbers of positional paths (more than two).
- Refactored _find_latest_checkpoint to use a more robust parsing method for checkpoint directory names, ensuring only valid, numerically-suffixed directories are considered.
tests/test_cli_checkpoint_discovery.py
- Added new test cases for _find_latest_checkpoint to verify correct identification of the highest iteration checkpoint.
- Included tests to ensure _find_latest_checkpoint ignores non-numeric directories.
- Added a test to confirm _find_latest_checkpoint returns None when no valid checkpoints are found.
tests/test_cli_positional_paths.py
- Added new test cases for parse_args to validate single-path invocation correctly assigns to evaluation_file.
- Included tests to verify two-path invocation correctly assigns to initial_program and evaluation_file.
- Added a test to confirm that parse_args rejects invocations with more than two positional paths.

Activity

The author implemented a rework of CLI positional parsing to accept either one or two paths.
The author replaced the ambiguous positional pair with a single positional list and explicit normalization logic.
The author added targeted tests in tests/test_cli_positional_paths.py for single-path, two-path, and invalid 3+ positional path invocations.
The author confirmed that all tests passed with 14 passed in 0.59s.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request effectively resolves the issue with the CLI's positional argument parsing, allowing for a more intuitive evaluation-only invocation. The approach of using a single paths argument with post-parsing normalization is a clean and robust solution. The accompanying refactor of _find_latest_checkpoint is also a great improvement, fixing a subtle bug and making the logic more resilient. Both changes are supported by clear and targeted tests. I have one minor suggestion to improve naming consistency for better maintainability.

gemini-code-assist · 2026-03-15T18:46:26Z

+    def parse_iteration(path: str) -> Optional[int]:
        try:
            return int(path.rsplit("_", 1)[-1])


For clarity and consistency with the calling code on line 296 (iteration = parse_iteration(name)), the parameter path in the nested function parse_iteration should be renamed to name. This makes it clearer that the function operates on a directory name from os.listdir, not a full file system path.

Suggested change

def parse_iteration(path: str) -> Optional[int]:

try:

return int(path.rsplit("_", 1)[-1])

def parse_iteration(name: str) -> Optional[int]:

try:

return int(name.rsplit("_", 1)[-1])

lynnliu030 · 2026-03-16T03:05:55Z

@shuofengzhang Thanks for the catch! I'm thinking whether this can be done by simply changing the argument order from <initial_solution> <eval> to <eval> <initial_solution>? Then the argument name can be more meaningful than paths, because paths can be output folders, and many other stuff.

gemini-code-assist Bot reviewed Mar 15, 2026

View reviewed changes

fix(cli): support evaluation-only positional invocation

cfb913c

shuofengzhang force-pushed the fix-cli-evaluation-only-positional branch from 9f9967c to cfb913c Compare March 15, 2026 19:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cli): allow evaluation-only invocation without dummy seed path#27

fix(cli): allow evaluation-only invocation without dummy seed path#27
shuofengzhang wants to merge 1 commit into
skydiscover-ai:mainfrom
shuofengzhang:fix-cli-evaluation-only-positional

shuofengzhang commented Mar 15, 2026

Uh oh!

gemini-code-assist Bot commented Mar 15, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Mar 15, 2026

Uh oh!

lynnliu030 commented Mar 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

shuofengzhang commented Mar 15, 2026

What changed

Why

Insight / Why this matters

Testing

Uh oh!

gemini-code-assist Bot commented Mar 15, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Mar 15, 2026

Choose a reason for hiding this comment

Uh oh!

lynnliu030 commented Mar 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants