@BradleyBooth

Added

  • New autoencoder model in models.py - AE_float32
    Compressing float32 data with the default AE model produced compressed files larger than the original, because all layers were hardcoded to float64. Using AE_float32 with float32 data avoids the issue.

    • The new model inherits from the existing AE model class.
    • Linear layers are modified to use dtype=torch.float32.
    • To use the model, add c.float_dtype = "float32" to the project config file.
  • Lossy Compression Comparison functionality (compare.py)
    Adds a new Baler operating mode, defined in baler.py, that benchmarks Baler's performance on the current project against a selection of lossy compression approaches.

    • To access, run baler using --mode compare
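
The size problem described for the default AE model comes down to byte-width arithmetic, which the sketch below works through. The feature count and latent width are hypothetical values chosen only for illustration; they are not taken from this PR or from any Baler config.

```python
import struct

# Hypothetical sizes, chosen only for illustration (not from the PR).
N_FEATURES = 16   # values per row in the original float32 data
LATENT_DIM = 10   # values per row in the autoencoder's latent space

# "<" selects standard sizes: 4 bytes for float32, 8 bytes for float64.
FLOAT32_BYTES = struct.calcsize("<f")
FLOAT64_BYTES = struct.calcsize("<d")


def row_bytes(n_values: int, bytes_per_value: int) -> int:
    """Raw storage needed for one row of n_values numbers."""
    return n_values * bytes_per_value


original = row_bytes(N_FEATURES, FLOAT32_BYTES)    # input row, float32
latent_f64 = row_bytes(LATENT_DIM, FLOAT64_BYTES)  # latent row, float64 layers
latent_f32 = row_bytes(LATENT_DIM, FLOAT32_BYTES)  # latent row, float32 layers

print(f"original float32 row: {original} bytes")
print(f"float64 latent row:   {latent_f64} bytes")
print(f"float32 latent row:   {latent_f32} bytes")
```

With these assumed sizes, the float64 latent row is larger than the float32 input row even though it holds fewer values, while the float32 latent row is smaller. Actual file sizes also depend on the file format and on any further compression applied when saving.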

/external/*

# Exclude results tracking files
green_code_tracking.txt
Contributor

Suggested change
- green_code_tracking.txt
+ *.txt
+ *.npz
+ *.dat
+ *.png
+ *.root
+ *.jpg
+ *.jpeg
+ *.log

Are there any .txt files we would want to track? Maybe a catch-all for results/data/log files would be better
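
One way to answer the question about .txt files is to check which file names the suggested patterns would catch. The sketch below approximates simple .gitignore basename globs with Python's `fnmatch`; real .gitignore matching has extra rules (negation, directory anchoring), and every candidate name other than green_code_tracking.txt is hypothetical, not a claim about what the repository contains.

```python
from fnmatch import fnmatchcase

# Patterns from the suggested change above.
SUGGESTED_PATTERNS = [
    "*.txt", "*.npz", "*.dat", "*.png", "*.root", "*.jpg", "*.jpeg", "*.log",
]

# Hypothetical file names, except green_code_tracking.txt from the diff.
CANDIDATES = [
    "green_code_tracking.txt",
    "requirements.txt",
    "LICENSE.txt",
    "models.py",
    "loss_plot.png",
]


def ignored_by(name: str, patterns: list[str]) -> list[str]:
    """Return the patterns that match a bare file name (case-sensitive)."""
    return [p for p in patterns if fnmatchcase(name, p)]


ignored = {name: ignored_by(name, SUGGESTED_PATTERNS) for name in CANDIDATES}

for name, hits in ignored.items():
    status = f"ignored by {hits[0]}" if hits else "not ignored"
    print(f"{name:26s} {status}")
```

Files already tracked by git are unaffected by new ignore rules, and a negation line such as `!requirements.txt` placed after `*.txt` can re-include a specific file if a catch-all is adopted.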

         helper.create_new_project(workspace_name, project_name, verbose)
     elif mode == "train":
-        perform_training(output_path=output_path, config=config, verbose=verbose)
+        perform_training(output_path, config, project_name, verbose)
Contributor

Are we sure project_name is always provided? Do we have a default? Just thinking this would break compatibility with old scripts if project_name isn't defined and there's no default.
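
A minimal sketch of one way to address this, assuming the function can accept a default: give project_name a default of None so that both the old keyword-style call and the new positional call from the diff keep working. The function body here is a hypothetical stub that only records its arguments; it is not the PR's implementation.

```python
from typing import Any, Optional


def perform_training(
    output_path: str,
    config: Any,
    project_name: Optional[str] = None,  # default keeps old callers working
    verbose: bool = False,
) -> dict:
    """Hypothetical stub: report the arguments instead of training."""
    if project_name is None and verbose:
        print("project_name not given; proceeding without it")
    return {
        "output_path": output_path,
        "project_name": project_name,
        "verbose": verbose,
    }


# Old call style (keyword arguments, no project_name) still works.
old_style = perform_training(output_path="out", config={}, verbose=True)

# New call style from the diff (positional, with project_name).
new_style = perform_training("out", {}, "demo_project", True)

print(old_style)
print(new_style)
```

Note that an old script calling `perform_training(output_path, config, verbose)` positionally would still bind `verbose` to `project_name`, so a default alone does not protect positional callers.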

@@ -0,0 +1,380 @@
# Copyright 2022 Baler Contributors
Contributor

Suggested change
- # Copyright 2022 Baler Contributors
+ # Copyright 2022-2025 Baler Contributors

And similar for other files

@jlsmith-hep
Contributor

Hi @BradleyBooth, nice PR! I left a couple of small comments, but I have a bigger one: there's a lot of refactoring into new methods and classes. Are we sure this doesn't affect functionality? Is there any validation to look at, e.g. running a bundled example like CMS or CFD and checking that it gives the same output as before?
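
The kind of validation asked for here could be sketched as a tolerance-based comparison between values produced before and after the refactor. This assumes both runs can be made deterministic (for example with fixed random seeds) and that their outputs can be loaded as flat sequences of floats; the helper name, tolerances, and sample values are all hypothetical.

```python
import math
from typing import Sequence


def outputs_match(
    reference: Sequence[float],
    candidate: Sequence[float],
    rel_tol: float = 1e-6,
    abs_tol: float = 1e-9,
) -> bool:
    """True if both sequences have equal length and element-wise close values."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


# Hypothetical values standing in for outputs from a bundled example.
before = [0.125, 1.5, -2.25]
after_same = [0.125, 1.5 + 1e-9, -2.25]   # differs only by float noise
after_changed = [0.125, 1.5, -2.0]        # a real behavioral change

same_ok = outputs_match(before, after_same)
changed_ok = outputs_match(before, after_changed)

print("refactor preserved output:", same_ok)
print("changed output detected:  ", not changed_ok)
```

A tolerance is used instead of exact equality because floating-point results can differ slightly between runs or hardware even when the logic is unchanged.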

@BradleyBooth marked this pull request as draft July 27, 2025 13:48