
Conversation

@LeoRoccoBreedt (Contributor) commented Jun 26, 2025

Description

Include a summary of the changes and the related issue.

Related to: <ClickUp/JIRA task name>

Any expected test failures?


Add a [X] to relevant checklist items

❔ This change

  • adds a new feature
  • fixes breaking code
  • is cosmetic (refactoring/reformatting)

✔️ Pre-merge checklist

  • Refactored code (sourcery)
  • Tested code locally
  • Precommit installed and run before pushing changes
  • Added code to GitHub tests (notebooks, scripts)
  • Updated GitHub README
  • Updated the projects overview page on Notion

🧪 Test Configuration

  • OS: Windows
  • Python version: 3.12
  • Neptune version: 0.14.0
  • Affected libraries with version: torch, torchvision

Summary by Sourcery

Add an "evaluate-long-runs" how-to guide that includes a training script, an asynchronous evaluation script, and a notebook example to demonstrate logging, checkpointing, and splitting long-running evaluation tasks with Neptune.

New Features:

  • Add a train.py script that trains a ResNet18 on CIFAR10, logs training metrics to Neptune, and saves comprehensive checkpoints to disk
  • Add an async_evals.py script that polls saved checkpoints, computes validation accuracy, and logs evaluation metrics back to the same Neptune run
  • Include a Jupyter notebook example demonstrating how to split evaluation logging into even and odd steps and resume a Neptune run (see the sketch below)

Documentation:

  • Introduce a new how-to guide "evaluate-long-runs" with accompanying scripts and notebook for asynchronous evaluation
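
As a rough illustration of the notebook's split-logging pattern, the two cells might look like the sketch below. This is a minimal sketch, not the notebook itself: it assumes the neptune_scale client with the project and API token taken from environment variables, that a run can be reopened by its run_id with resume=True, and the run id and metric name are made up for illustration.

```python
from neptune_scale import Run

RUN_ID = "evaluation-runs-demo"  # illustrative id, shared by both cells

# Cell 1: create the run and log a metric at even steps only
run = Run(run_id=RUN_ID)  # project and API token are read from environment variables
for step in range(0, 10, 2):
    run.log_metrics(data={"eval/metric": 1.0 / (step + 1)}, step=step)
run.close()

# Cell 2: reopen the same run later and fill in the odd steps
# (assumes the backend accepts these earlier steps arriving after the even ones)
run = Run(run_id=RUN_ID, resume=True)
for step in range(1, 10, 2):
    run.log_metrics(data={"eval/metric": 1.0 / (step + 1)}, step=step)
run.close()
```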

sourcery-ai bot commented Jun 26, 2025

Reviewer's Guide

Demonstrates asynchronous long-run evaluation by providing a notebook walkthrough, a training script with metric logging and checkpoint persistence, and a polling-based evaluation script that resumes runs to log validation accuracy.

Sequence diagram for async evaluation and metric logging

```mermaid
sequenceDiagram
    actor User
    participant TrainScript as Training Script
    participant Checkpoint as Checkpoint File
    participant EvalScript as Evaluation Script
    participant Neptune as Neptune Run

    User->>TrainScript: Start training
    TrainScript->>Checkpoint: Save checkpoint (per epoch)
    TrainScript->>Neptune: Log training metrics
    User->>EvalScript: Start evaluation script
    loop Poll for new checkpoints
        EvalScript->>Checkpoint: Detect new checkpoint
        EvalScript->>EvalScript: Load model from checkpoint
        EvalScript->>EvalScript: Evaluate on validation data
        EvalScript->>Neptune: Log evaluation metrics (resume run)
    end
```
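
In code, the training side of this sequence might look roughly like the following. This is a minimal sketch rather than the actual train.py: the stand-in data loader, tags, metric names, and two-epoch loop are assumptions, and checkpoint writing is only referenced here (it is sketched after the class diagram below).

```python
import uuid

import torch
import torch.nn as nn
from torchvision.models import resnet18

from neptune_scale import Run

# Stand-in for the CIFAR10 training DataLoader used by the real script
train_loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))) for _ in range(5)]

model = resnet18(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

run = Run(run_id=f"train-{uuid.uuid4()}")  # project and API token are read from environment variables
run.add_tags(["training", "resnet18", "cifar10"])

global_step = 0
for epoch in range(2):
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        accuracy = (outputs.argmax(dim=1) == labels).float().mean().item()
        run.log_metrics(
            data={"train/loss": loss.item(), "train/accuracy": accuracy, "train/epoch": epoch},
            step=global_step,
        )
        global_step += 1

    # At the end of each epoch, train.py also writes a checkpoint that embeds the
    # run id -- see the save_checkpoint sketch after the class diagram below.

run.close()
```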

Class diagram for checkpoint structure and Neptune Run usage

```mermaid
classDiagram
    class Run {
        +add_tags(tags)
        +log_metrics(data, step, ...)
        +get_run_url()
        +close()
        _run_id
    }
    class Checkpoint {
        run_id
        epoch
        model_state_dict
        optimizer_state_dict
        loss
        accuracy
        global_step
        model_config
        training_config
    }
    Run <.. Checkpoint : run_id
    Checkpoint o-- model_config : contains
    Checkpoint o-- training_config : contains
```
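
A checkpoint with this structure could be written with torch.save roughly as follows. The keys mirror the diagram above; the function name, file path handling, and the example contents of model_config and training_config are assumptions.

```python
import os

import torch


def save_checkpoint(path, run_id, epoch, global_step, model, optimizer,
                    loss, accuracy, model_config, training_config):
    """Persist everything the evaluation script needs, keyed as in the diagram above."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    torch.save(
        {
            "run_id": run_id,  # ties the file back to the Neptune run
            "epoch": epoch,
            "model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict(),
            "loss": loss,
            "accuracy": accuracy,
            "global_step": global_step,
            "model_config": model_config,  # e.g. {"arch": "resnet18", "num_classes": 10}
            "training_config": training_config,  # e.g. {"lr": 0.01, "batch_size": 128}
        },
        path,
    )
```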

File-Level Changes

Change: Introduce notebook demonstrating async run evaluation split across runs
Details:
  • Add notebook with an even-step logging cell
  • Add a cell resuming the run to log odd steps
Files: how-to-guides/evaluate-long-runs/evaluation_runs.ipynb

Change: Implement training script with Neptune logging and checkpointing
Details:
  • Configure ResNet18 training on CIFAR10
  • Log loss, accuracy, and epoch via global_step
  • Save detailed checkpoints including run and training metadata
Files: how-to-guides/evaluate-long-runs/train.py

Change: Add asynchronous evaluation script that polls checkpoints and logs eval metrics
Details:
  • Poll the checkpoint directory for new model_state files
  • Evaluate each model on the validation set to compute accuracy
  • Resume the Neptune run by run_id and log eval/accuracy at the matching step
Files: how-to-guides/evaluate-long-runs/async_evals.py
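
Put together, the polling loop in async_evals.py could be sketched roughly as follows. Assumptions: checkpoints follow the key layout shown in the class diagram (with model_config holding at least num_classes), file names match model_state_*.pt in a checkpoints/ directory, the neptune_scale run is reopened via run_id with resume=True, and a stand-in validation loader replaces the real CIFAR10 one.

```python
import glob
import os
import time

import torch
from torchvision.models import resnet18

from neptune_scale import Run

CKPT_DIR = "checkpoints"  # illustrative; must match wherever the training script writes checkpoints
POLL_SECONDS = 60

# Stand-in for the CIFAR10 validation DataLoader used by the real script
val_loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))) for _ in range(4)]


def evaluate(model):
    """Return top-1 accuracy over the validation batches."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total


seen = set()
while True:
    for path in sorted(glob.glob(os.path.join(CKPT_DIR, "model_state_*.pt"))):
        if path in seen:
            continue
        ckpt = torch.load(path, map_location="cpu")

        model = resnet18(num_classes=ckpt["model_config"]["num_classes"])
        model.load_state_dict(ckpt["model_state_dict"])
        accuracy = evaluate(model)

        # Reopen the training run and log the eval metric at the matching step
        run = Run(run_id=ckpt["run_id"], resume=True)
        run.log_metrics(data={"eval/accuracy": accuracy}, step=ckpt["global_step"])
        run.close()

        seen.add(path)
    time.sleep(POLL_SECONDS)
```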

