feat: write evolution_trace.json at each checkpoint #28
Every checkpoint directory now contains evolution_trace.json alongside metadata.json. The file lists every program the search generated, sorted by iteration_found, with id, score, metrics, parent_id, timestamp, and solution — making it straightforward to plot score trajectories, inspect lineage, or replay a run without walking individual program files. Closes skydiscover-ai#17
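As a rough illustration of the use cases named above, here is a minimal sketch of reading the trace to get a running-best score trajectory and a program's ancestry. It assumes the file is a top-level JSON array of records carrying the listed fields, and that higher scores are better; neither is confirmed by this PR. The helper names `load_trace`, `best_so_far`, and `lineage` are hypothetical and not part of the codebase.

```python
import json
from pathlib import Path


def load_trace(checkpoint_dir):
    """Load evolution_trace.json from a checkpoint directory.

    Assumption: the file holds a JSON array of program records.
    """
    path = Path(checkpoint_dir) / "evolution_trace.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def best_so_far(records):
    """Running best score per record, in file order.

    Records whose score is None leave the running best unchanged.
    Assumption: a higher score is better.
    """
    best = None
    trajectory = []
    for rec in records:
        score = rec.get("score")
        if score is not None and (best is None or score > best):
            best = score
        trajectory.append(best)
    return trajectory


def lineage(records, program_id):
    """Follow parent_id links from program_id back to its root ancestor."""
    by_id = {rec["id"]: rec for rec in records}
    chain = []
    seen = set()  # guards against a malformed, cyclic parent chain
    current = program_id
    while current is not None and current in by_id and current not in seen:
        seen.add(current)
        chain.append(current)
        current = by_id[current].get("parent_id")
    return chain
```

The list returned by `best_so_far` can be passed directly to a plotting library, with the record index or `iteration_found` on the x-axis.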
Code Review
This pull request introduces a valuable feature by writing an evolution_trace.json file at each checkpoint, which will greatly simplify the analysis of discovery runs. The implementation is clean, and it's accompanied by a comprehensive set of unit and integration tests, ensuring the new functionality is robust. I have one minor suggestion to make the code more idiomatic.
mert-cemri left a comment
It looks mostly good, some concerns:
- Including the full solution (source code) for every program makes the trace file potentially very large. For a run with hundreds of programs, this could be megabytes of JSON. Is this intended? Alternatively, consider making solution inclusion optional or providing a separate "compact" trace without solutions.
- Test: test_trace_no_metrics_score_is_none constructs a Program while passing only three fields:
  prog = Program(id="x", solution="pass", metrics={})
  This relies on all other Program fields having defaults. It works today but is fragile if Program adds a required field later. Using the _make_program helper (with score=0) would be more consistent.
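One way the "compact" trace suggested in the first point could look, as a sketch only: it post-processes records that have already been loaded as dicts, so it makes no assumption about how the PR builds them. The names `compact_trace` and `json_size_bytes` are hypothetical.

```python
import json


def compact_trace(records, drop_keys=("solution",)):
    """Return shallow copies of the trace records without the bulky keys.

    The input records are left untouched, so the full trace can still be
    written alongside the compact one.
    """
    return [
        {key: value for key, value in rec.items() if key not in drop_keys}
        for rec in records
    ]


def json_size_bytes(records):
    """Size of the records when serialised as UTF-8 JSON."""
    return len(json.dumps(records).encode("utf-8"))
```

Comparing `json_size_bytes(records)` with `json_size_bytes(compact_trace(records))` on a real checkpoint would show whether the size concern matters for a given run.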
@mert-cemri thanks for the review!
I noticed that the current implementation does not persist the user prompt and system message. To address this, I have modified the code based on the changes proposed in this PR. Below is an example of the updated JSON structure (values omitted for brevity):
{
  "id": "...",
  "iteration_found": "...",
  "generation": "...",
  "score": "...",
  "metrics": {
    "c5_bound": "...",
    "combined_score": "...",
    "n_points": "...",
    "eval_time": "..."
  },
  "parent_id": "...",
  "timestamp": "...",
  "solution": "...",
  "prompts": {
    "diff_user_message": {
      "system": "...",
      "user": "...",
      "responses": "..."
    }
  }
}
By the way, is there any specific reason why you do not save the user prompt?
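For readers who want to consume a record of this shape, a minimal sketch follows. It assumes `prompts` maps a template name (such as `diff_user_message`) to an object with `system`, `user`, and `responses` keys, as in the example, and that records written without prompts simply lack the key. `iter_prompts` is a hypothetical helper, not something this PR provides.

```python
def iter_prompts(records):
    """Yield one flat dict per (program, prompt template) pair.

    Records with a missing or null "prompts" field are skipped.
    """
    for rec in records:
        prompts = rec.get("prompts") or {}
        for template, body in prompts.items():
            yield {
                "program_id": rec.get("id"),
                "template": template,
                "system": body.get("system"),
                "user": body.get("user"),
                "responses": body.get("responses"),
            }
```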
No particular reason: please feel free to add it! @passing2961 if you’d like to open a new PR, that’s totally welcome. We can close this one and merge the restructured version instead. Since you’re more familiar with this feature, happy to defer to your implementation 🙂
@lynnliu030 I will open a new PR as soon as possible!
@passing2961 Thanks!! Any progress on this?
Is this still planned? I think it's a needed feature. @passing2961 @lynnliu030
Closes #17
Writes evolution_trace.json into each checkpoint directory alongside the existing metadata.json. The file lists every program in iteration order with id, score, metrics, parent_id, timestamp, and solution, so the full run history is readable from a single file without walking programs/*.json.
No changes to existing behaviour or artefacts. 16 tests added.
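To make the described behaviour concrete, here is a hedged sketch of how such a file might be written. It is not the PR's implementation: it assumes the records are already plain dicts with an integer `iteration_found`, and it writes through a temporary file so an interrupted checkpoint never leaves a half-written trace (a design choice of this sketch, not something the PR states). `write_trace` is a hypothetical name.

```python
import json
import os
import tempfile
from pathlib import Path


def write_trace(records, checkpoint_dir, filename="evolution_trace.json"):
    """Write records to <checkpoint_dir>/<filename>, sorted by iteration_found."""
    out_dir = Path(checkpoint_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # sorted() is stable, so records sharing an iteration keep their input order.
    ordered = sorted(records, key=lambda rec: rec.get("iteration_found", 0))

    target = out_dir / filename
    fd, tmp_name = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(ordered, fh, indent=2)
        os.replace(tmp_name, target)  # atomic rename within the same directory
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```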