smithhenryd/cgm
Calibrating Generative Models (CGM)

Code for "Calibrating Generative Models" by Henry D. Smith, Nathaniel L. Diamant, and Brian L. Trippe

Preprint

We propose two lightweight, general-purpose algorithms, CGM-relax and CGM-reward, for fine-tuning generative models to match distribution-level constraints. We demonstrate that these algorithms apply to diverse model classes, data, and constraint types. Across all experiments, we find that CGM significantly reduces the constraint violation of the base model, while maintaining the fidelity of samples generated by the model.


Calibrating the Genie2 protein structure diffusion model to secondary structure statistics of natural proteins (CATH domains).

Getting Started

You can try out the cgm codebase by opening our demo notebook gmm_example.ipynb in Google Colab [link]. Alternatively, you can clone the cgm GitHub repository and follow our installation instructions.

Installation

Package manager

We recommend using conda or mamba to install the cgm requirements. mamba can be installed by following these instructions, which amount to the following:

curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
chmod +x Miniforge3-Linux-x86_64.sh
./Miniforge3-Linux-x86_64.sh

Environment install

The cgm environment can be installed from the environment file:

mamba env create -f env.yml

Once you have activated the cgm environment, install the cgm package (from the root directory of this repository):

python -m pip install -e .

To use cgm in the demo notebook, you also have to install cgm as an ipykernel:

python -m ipykernel install --user --name=cgm

You can verify that your installation is correct by running the tests, or by running the demo notebook gmm_example.ipynb.

Usage

To perform fine-tuning with CGM-relax or CGM-reward, you will first need to implement a subclass MyModel of Model, which is contained in cgm/model.py. Model is an abstract base class that represents the generative model to be calibrated. It has two methods that must be overridden:

- sample: draws samples from the generative model
- log_p: evaluates the log probability of samples from the generative model

An example implementation for continuous-time diffusion models, NeuralSDE, is given in neural_sde/neural_sde.py.
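As a rough sketch of what such a subclass might look like, here is a toy 1-D Gaussian model. The stand-in ABC below only mirrors the two required methods; the exact signatures of Model in cgm/model.py are an assumption, so treat this as illustrative rather than drop-in code.

```python
import math
import random
from abc import ABC, abstractmethod


class Model(ABC):
    # Stand-in for cgm.model.Model; the real class lives in cgm/model.py
    # and its exact signatures may differ.
    @abstractmethod
    def sample(self, n):
        """Draw n samples from the generative model."""

    @abstractmethod
    def log_p(self, x):
        """Evaluate the log probability of a sample x."""


class GaussianModel(Model):
    """A toy 1-D Gaussian generative model."""

    def __init__(self, mu=0.0, sigma=1.0):
        self.mu, self.sigma = mu, sigma

    def sample(self, n):
        # Draw n i.i.d. samples from N(mu, sigma^2).
        return [random.gauss(self.mu, self.sigma) for _ in range(n)]

    def log_p(self, x):
        # Gaussian log density: -z^2/2 - log(sigma * sqrt(2*pi)).
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - math.log(self.sigma * math.sqrt(2 * math.pi))
```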

Once you have implemented MyModel, you will then need to load or train your base model base_model as an instance of MyModel. You are then prepared to calibrate base_model using CGM-relax:

from cgm.cgm import calibrate_relaxed

relax_model = calibrate_relaxed(
    base_model,
    h,
    hstar,
    lambd,
)

or CGM-reward:

from cgm.cgm import calibrate_reward

reward_model = calibrate_reward(
    base_model,
    h,
    hstar,
    N_samps,
)
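The constraint arguments h and hstar are left abstract above. As an illustration only (the exact interface cgm expects is an assumption; see the gmm_example.ipynb notebook for the real one), h might map a batch of samples to summary statistics, with hstar holding the target values those statistics should match after calibration:

```python
import statistics

# Hypothetical constraint for a 1-D model: match the sample mean and
# population variance to target values. The signature cgm expects for h
# is an assumption here.
def h(samples):
    """Map a batch of samples to summary statistics."""
    return [statistics.fmean(samples), statistics.pvariance(samples)]


# Target statistics the calibrated model should satisfy: zero mean,
# unit variance.
hstar = [0.0, 1.0]
```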

For a full demonstration of the package functionality, see our example reweighting mixture proportions in a GMM.

Tests

Make sure the cgm environment is activated. Then run

python -m pytest tests
