Update report upload logic #972

Merged
merged 14 commits into from
Apr 14, 2025

@abohoss abohoss commented Apr 7, 2025

As mentioned in #971, I created an upload_report.py file. This Python version offers several improvements over the bash script, including:

  1. Better Structure:
  • Object-oriented design with a clear separation of concerns
  • Methods are modular and reusable
  2. Improved Error Handling:
  • Comprehensive logging
  • Proper exception handling
  • Command execution status checking
  3. Enhanced Features:
  • Argument parsing with helpful error messages
  • Better file path handling using Path objects
  4. Better Maintainability:
  • Logical grouping of related functionality
  • Easier to extend and modify

Collaborator

@DonggeLiu DonggeLiu left a comment

Thanks, @abohoss!
Could you please address the nits below?
I will create a PR to run an experiment for your changes : )

default_dir = f"{os.getenv('USER')}-{datetime.now().strftime('%Y-%m-%d')}"
parser.error(f"This script needs to take gcloud Bucket directory as the second argument. Consider using: {default_dir}")
if not args.model:
parser.error("This script needs to take LLM as the third argument.")
Collaborator

results_dir (and the other parameters) are positional and therefore automatically required by argparse, so the code inside the if not args.results_dir: block will never run when results_dir is missing. You can rely on argparse's built-in error handling for missing arguments instead of adding these checks manually.
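
The point about positional arguments can be checked with a minimal sketch. The names results_dir and model come from this thread; gcs_dir is a hypothetical name for the bucket-directory argument, and the real parser in upload_report.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Builds a parser whose positional arguments are all required."""
    parser = argparse.ArgumentParser(description='Upload a results report.')
    # Positional arguments (no leading dashes, no nargs='?') are required:
    # argparse prints a usage error and exits if any of them is missing.
    parser.add_argument('results_dir', help='Local directory with results.')
    parser.add_argument('gcs_dir', help='Bucket directory (hypothetical name).')
    parser.add_argument('model', help='LLM used in the experiment.')
    return parser


def missing_args_exit_code(argv: list) -> int:
    """Returns the exit code argparse uses when parsing fails, or 0."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as e:
        return int(e.code)
    return 0
```

Because argparse exits before returning a namespace, a manual check such as if not args.results_dir: is never reached when the argument is missing.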

Collaborator

Additionally, I would prefer moving the argument parsing and handling into a separate function.

)
self.logger = logging.getLogger(__name__)

def run_command(self, command: list) -> bool:
Collaborator

nit: Could you please make the functions that are only used in the class private? (e.g., _func).

Author

I made the following functions private in the PR: _generate_training_data, _generate_report, and _run_command.

Collaborator

Thanks! This helps the linter report unused functions so that we can remove zombie code.
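
For reference, a rough sketch of what an underscore-prefixed helper with status checking might look like. The class name ReportUploader and the upload method are hypothetical; only the run_command signature, the logger setup, and the error message come from the snippets in this review.

```python
import logging
import subprocess
import sys


class ReportUploader:  # Hypothetical class name.
    """Illustrates a public entry point backed by a private helper."""

    def __init__(self) -> None:
        self.logger = logging.getLogger(__name__)

    def _run_command(self, command: list) -> bool:
        """Runs a command and reports whether it exited successfully."""
        try:
            result = subprocess.run(command, check=False)
        except OSError as e:
            # Raised, for example, when the executable does not exist.
            self.logger.error('Failed to start command: %s', e)
            return False
        if result.returncode != 0:
            self.logger.error(f'Command failed: {" ".join(command)}')
            return False
        return True

    def upload(self) -> bool:
        """Public method; the only caller of _run_command in this sketch."""
        # A stand-in command; the real script invokes other tools.
        return self._run_command([sys.executable, '-c', 'pass'])
```

The leading underscore marks _run_command as internal to the class, so a linter can flag it if no method in the class calls it any more.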

self.benchmark_set = benchmark_set
self.model = model
self.results_report_dir = Path('results-report')
self.bucket_base_path = "gs://oss-fuzz-gcb-experiment-run-logs/Result-reports"
Collaborator

Use ' for consistency in Python code.

Author

I changed it everywhere except this line: self.logger.error(f"Command failed: {' '.join(command)}"), because using single quotes for the outer string there can cause a syntax error: the inner quotes clash with the outer quotes.

Collaborator

How about:

self.logger.error(f'Command failed: {" ".join(command)}')

The main idea is to use ' by default and " only when necessary.
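
A small check that the two quoting styles are equivalent. The command list is an illustrative stand-in, not the one used in the script. Before Python 3.12, an expression inside an f-string could not reuse the quote character that delimits the f-string, which is why the inner and outer quotes have to alternate.

```python
# Illustrative command; not taken from upload_report.py.
command = ['echo', 'hello']

# Double quotes outside, single quotes inside (the original line).
original = f"Command failed: {' '.join(command)}"

# Single quotes outside, double quotes inside (the suggested form).
suggested = f'Command failed: {" ".join(command)}'

# Both spellings build exactly the same message.
assert original == suggested
```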

@DonggeLiu DonggeLiu changed the base branch from main to exp-972 April 10, 2025 02:46
Author

abohoss commented Apr 11, 2025

Hi @DonggeLiu,
Thanks for the insights you provided. Here is the PR with the suggested changes: #980

Collaborator

> Hi @DonggeLiu, thanks for the insights you provided. Here is the PR with the suggested changes: #980

Ah I see you already pushed the new changes here, thanks!
Let me take another look and merge this : )

Collaborator

@DonggeLiu DonggeLiu left a comment

Thanks again, @abohoss!

Some final nits:

gcs_report_dir = f"{args.sub_dir}/{experiment_name}"

# Trends report use a similarly named path.
gcs_trend_report_path = f"{args.sub_dir}/{experiment_name}.json"

# Generate a report and upload it to GCS
report_process = subprocess.Popen([
-    "bash", "report/upload_report.sh", local_results_dir, gcs_report_dir,
+    "python3", "report/upload_report.py", local_results_dir, gcs_report_dir,
Collaborator

Sorry that I missed this last time:
Could you please change this python3 to python_path, just like in

python_path, "run_all_experiments.py", "--benchmarks-directory",

This ensures the correct Python interpreter is used in the Docker container.
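
As a sketch of the idea, assuming python_path holds the interpreter that runs the current script: sys.executable is one common way to obtain it, though the repository may define python_path differently. The helper build_report_command is hypothetical and includes only the arguments visible in the snippet above.

```python
import sys

# Assumption: python_path is the interpreter running this script. The
# project may derive it differently (e.g. from a virtual environment).
python_path = sys.executable


def build_report_command(local_results_dir: str, gcs_report_dir: str) -> list:
    """Builds the report upload command using the chosen interpreter."""
    return [
        python_path, 'report/upload_report.py', local_results_dir,
        gcs_report_dir
    ]
```

A bare python3 is resolved through PATH and may point at a different installation than the one that has the project's dependencies; passing an explicit interpreter path avoids that ambiguity.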

self.logger.info('Final report uploaded.')


def parse_args():
Collaborator

Add a return type:

def parse_args() -> argparse.Namespace:
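
A possible shape for such a function, as a sketch only. The optional argv parameter is an addition for testability and is not part of the suggestion above; the argument list is abbreviated to the two names mentioned in this thread.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parses command line arguments in one dedicated, typed function."""
    parser = argparse.ArgumentParser(description='Upload a results report.')
    parser.add_argument('results_dir', help='Local directory with results.')
    parser.add_argument('model', help='LLM used in the experiment.')
    # argv=None makes argparse read sys.argv[1:], as in normal CLI use.
    return parser.parse_args(argv)
```

With the annotation in place, type checkers and editors know that the caller receives an argparse.Namespace.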

@DonggeLiu DonggeLiu marked this pull request as draft April 11, 2025 05:40
Author

abohoss commented Apr 11, 2025

Hi @DonggeLiu ,
I addressed the changes. Thanks for your time!

@abohoss abohoss marked this pull request as ready for review April 11, 2025 19:27
Collaborator

DonggeLiu commented Apr 14, 2025

Thanks for addressing this so promptly!
I will merge this into the experiment PR (#977) so that we can run your code before merging into main.

@DonggeLiu DonggeLiu merged commit aff2fc8 into google:exp-972 Apr 14, 2025
1 of 2 checks passed
DonggeLiu added a commit that referenced this pull request Apr 14, 2025
Co-authored-by: Dongge Liu <[email protected]>
Collaborator

DonggeLiu commented Apr 14, 2025

Oops, the linter CI reported some issues.
Given this PR has been merged and the branch is deleted, could you please address them in a new branch/PR based on exp-972?

This script will help you identify the errors locally before pushing : )
https://github.com/google/oss-fuzz-gen/blob/main/USAGE.md#contribution-process

Author

abohoss commented Apr 14, 2025

> Thanks for addressing this so promptly! I will merge this into the experiment PR (#977) so that we can run your code before merging into main.

You're welcome!

Author

abohoss commented Apr 14, 2025

> Oops, the linter CI reported some issues. Given this PR has been merged and the branch is deleted, could you please address them in a new branch/PR based on exp-972?
>
> This script will help you identify the errors locally before pushing : ) https://github.com/google/oss-fuzz-gen/blob/main/USAGE.md#contribution-process

Hi @DonggeLiu!
I pushed the linting changes to #988
