
Conversation

@tobidae-cb (Contributor) commented Aug 6, 2025

🚀 Core Features Added:

  1. AWS S3 Cloud Storage Integration

    • Added S3 service for uploading and downloading benchmark results
    • New export-to-cloud command to upload entire output directories to S3
    • S3 integration in the main benchmark runner with --enable-s3 flag
    • Automatic result upload during benchmark execution when S3 is enabled (see the upload sketch after this list)
  2. Import/Export System for Benchmark Results

    • New import-runs command to import benchmark results from local files or remote URLs
    • Support for merging metadata from different benchmark runs
  3. Enhanced Snapshot Management System

    • Optimized two-tier snapshot system for better performance
    • Initial snapshots downloaded once and reused across tests
    • Per-test snapshot copies using optimized rsync strategies
    • Remote snapshot source support for cloud-based initial snapshots
  4. Machine Information Collection

    • Added machine metadata collection (type, provider, region, filesystem)
    • Machine info included in benchmark results for better context
  5. Report Backend API Improvements

    • New Go-based REST API backend for serving benchmark data from S3
  6. Frontend Enhancements

    • Updated React frontend to work with new backend API
    • Support for displaying machine information in reports
    • Integration with cloud-stored benchmark results
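As a rough illustration of the S3 upload path in item 1 (the sketch referenced there), here is a hedged Go sketch that recursively uploads an output directory with aws-sdk-go-v2. The package, function name, and key layout are assumptions for illustration and do not reflect the PR's actual S3 service code.

package s3export

import (
	"context"
	"os"
	"path/filepath"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// UploadDir walks a local output directory and puts every file under
// s3://bucket/prefix/, mirroring the relative paths. Illustrative only.
func UploadDir(ctx context.Context, bucket, prefix, dir string) error {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return err
	}
	client := s3.NewFromConfig(cfg)

	return filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		rel, err := filepath.Rel(dir, path)
		if err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()

		_, err = client.PutObject(ctx, &s3.PutObjectInput{
			Bucket: aws.String(bucket),
			Key:    aws.String(filepath.ToSlash(filepath.Join(prefix, rel))),
			Body:   f,
		})
		return err
	})
}

The real service presumably also handles retries, content types, and the download direction; the sketch only shows the directory walk plus PutObject calls.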

🔧 Technical Improvements:

  • Optimized rsync strategies for faster snapshot copying
  • Better error handling and logging throughout the system
  • Structured configuration management for import/export operations
  • TTL-based caching for improved performance (a minimal cache sketch follows this list)
  • Docker support for containerized deployment
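For the TTL-based caching bullet above, here is a minimal sketch of a map-plus-mutex TTL cache; the TTLCache type and its API are illustrative assumptions, not the PR's implementation.

package cache

import (
	"sync"
	"time"
)

type entry struct {
	value     any
	expiresAt time.Time
}

// TTLCache holds values for a fixed duration and treats expired
// entries as missing. Sketch only.
type TTLCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	items map[string]entry
}

func NewTTLCache(ttl time.Duration) *TTLCache {
	return &TTLCache{ttl: ttl, items: make(map[string]entry)}
}

func (c *TTLCache) Get(key string) (any, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok || time.Now().After(e.expiresAt) {
		delete(c.items, key)
		return nil, false
	}
	return e.value, true
}

func (c *TTLCache) Set(key string, value any) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = entry{value: value, expiresAt: time.Now().Add(c.ttl)}
}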

@cb-heimdall (Collaborator) commented Aug 6, 2025

🟡 Heimdall Review Status

Requirement: Reviews
Status: 🟡 0/1
More info: denominator calculation below

Denominator calculation:
  • 1 if user is bot: 0
  • 1 if user is external: 0
  • 2 if repo is sensitive: 0
  • From .codeflow.yml: 1
  • Additional review requirements:
    • Max: 0
      • From CODEOWNERS: 0
      • Global minimum: 0
    • Max: 1
      • 1
      • 1 if commit is unverified: 0
  • Sum: 1

@tobidae-cb changed the title from "feat: Add download/upload capabilities with cloud storage" to "feat: Add cloud storage integration with S3 upload/download and snapshot management" on Sep 12, 2025
@tobidae-cb tobidae-cb marked this pull request as ready for review September 12, 2025 18:57
@tobidae-cb tobidae-cb requested a review from meyer9 September 19, 2025 16:22
@meyer9 (Collaborator) left a comment

This looks awesome! I love all the new UI elements and how you integrated machine type/etc. Is there a way to make this agnostic to S3?

It looks like we're tying the report generation to S3 now, but users may want to use other cloud storage, or something like GitHub Actions.

  • export: downloads and uploads to S3; this can be done with aws s3 cp outside of the benchmark tool
  • serving from S3: this is just a static file server pulling from S3, so an existing solution for serving static files could cover it

One philosophy I think we should adopt here is: do one thing and do it well. I think adding S3 download/upload/serving is nice to have, but I'm not sure it makes sense to include in the external repo that other teams may use who may not use S3.

The point of serving data from static files instead of a backend is that it makes deployment super simple. All of this work seems compatible with the static file interface, so adding S3 seems slightly unnecessary.
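To make the static-file alternative concrete, here is a minimal sketch using only the Go standard library; the ./output directory and port are assumptions, not anything from the PR.

package main

import (
	"log"
	"net/http"
)

func main() {
	// Serve the generated report output directly; no S3-specific backend.
	fs := http.FileServer(http.Dir("./output"))
	http.Handle("/", fs)
	log.Println("serving ./output on :8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Anything that can write files into that directory (aws s3 cp, a GitHub Actions artifact step, etc.) then works without touching the report tooling.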

echo "Copying reth snapshot to $DESTINATION"

mkdir -p "$DESTINATION"
./agent_init --gbs-network=$NETWORK --gbs-config-name=base-reth-cbnode --gbs-directory=$DESTINATION
Collaborator

Where is this script defined? I don't think this should be in the external repo, right?

Contributor (Author)

Ah yeah, I need to update this; it's using snapio.

}

// GetVersion returns the version of the Reth client
func (r *RethClient) GetVersion(ctx context.Context) (string, error) {
Collaborator

good idea!
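For context, a hedged sketch of how a GetVersion method like the one in this diff could work: Reth, like other Ethereum clients, answers the standard web3_clientVersion JSON-RPC method. The RethClient fields shown here are assumptions; this is not the PR's actual implementation.

package client

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"
)

// RethClient here is a stand-in; the real type lives in the PR.
type RethClient struct {
	rpcURL string
}

// GetVersion asks the node for its client version string via JSON-RPC.
func (r *RethClient) GetVersion(ctx context.Context) (string, error) {
	body := []byte(`{"jsonrpc":"2.0","id":1,"method":"web3_clientVersion","params":[]}`)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, r.rpcURL, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var out struct {
		Result string `json:"result"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Result, nil
}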

}

// Fallback to optimized rsync
if err := b.copyWithOptimizedRsync(initialSnapshotPath, testSnapshotPath); err == nil {
Collaborator

Does this support ZFS? Previously the setup script handled the copying; the nice thing about that was that we could use zfs clone instead of rsync.

Contributor (Author)

Unfortunately, during testing I couldn't use ZFS in the k8s architecture without creating new daemons, etc., in compute.

Collaborator

Ah, yeah, I just wanted to confirm that I can still use ZFS on my test machine with this new code.
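To make the trade-off in this thread concrete, here is a hedged sketch of a copy step that prefers zfs clone when the initial snapshot sits on a ZFS dataset and otherwise falls back to rsync. The function signature and dataset handling are assumptions, not the PR's copyWithOptimizedRsync code.

package snapshot

import (
	"fmt"
	"os/exec"
	"time"
)

// CopySnapshot prefers a cheap ZFS clone when srcDataset is non-empty
// (i.e. the caller detected that the initial snapshot lives on ZFS),
// and otherwise falls back to an rsync copy of the directory tree.
func CopySnapshot(srcDataset, dstDataset, srcDir, dstDir string) error {
	if srcDataset != "" {
		snap := fmt.Sprintf("%s@bench-%d", srcDataset, time.Now().Unix())
		if err := exec.Command("zfs", "snapshot", snap).Run(); err == nil {
			if err := exec.Command("zfs", "clone", snap, dstDataset).Run(); err == nil {
				return nil
			}
		}
		// If anything ZFS-related fails, fall through to rsync.
	}
	// -a preserves permissions/ownership/timestamps; --delete keeps dst exact.
	return exec.Command("rsync", "-a", "--delete", srcDir+"/", dstDir+"/").Run()
}

With this shape, a ZFS-backed test machine keeps near-instant clones while the k8s setup described above still gets the rsync path.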

@Pjrich1313 left a comment

Have all POW to contract this work
