`tmol`, short for TensorMol, is a faithful reimplementation of the Rosetta molecular modeling energy function ("beta_nov2016_cart") in PyTorch with custom kernels written in C++ and CUDA. Given the coordinates of one or more proteins, `tmol` can compute both energies and derivatives. `tmol` can also perform gradient-based minimization on those structures. Thus, ML models that produce Cartesian coordinates for proteins can include biophysical features in their loss during training or refine their output structures using Rosetta's experimentally validated energy function. You can read the full wiki here.
`tmol` can be used as a standalone package, or as a library for RosettaFold2 or OpenFold. Each platform has functions for constructing a PoseStack, performing operations on that PoseStack, and retrieving the structure back from `tmol`.
```python
import tmol

# Load a structure and score it with the beta_nov2016 score function
pose_stack = tmol.pose_stack_from_pdb('1ubq.pdb')
sfxn = tmol.beta2016_score_function(pose_stack.device)
scorer = sfxn.render_whole_pose_scoring_module(pose_stack)
print(scorer(pose_stack.coords))
```
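For instance, here is a minimal sketch of folding these energies into a training loss, assuming the scoring module is differentiable with respect to its input coordinates (as the minimization example below relies on); `pred_coords` is an illustrative stand-in for coordinates produced by an upstream model and is not part of tmol:

```python
import torch

# Hypothetical: coordinates predicted by an upstream model, with gradient tracking on
pred_coords = pose_stack.coords.clone().detach().requires_grad_(True)

energies = scorer(pred_coords)     # tmol energies for these coordinates
biophysical_loss = energies.sum()  # collapse to a scalar loss term
biophysical_loss.backward()        # gradients flow back to the coordinates

print(pred_coords.grad.shape)
```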
```python
# Save the starting coordinates so the pose can be restored between runs
start_coords = pose_stack.coords.clone()
pose_stack.coords[:] = start_coords

# Wrap the score function and PoseStack in a differentiable network
cart_sfxn_network = tmol.cart_sfxn_network(sfxn, pose_stack)
optimizer = tmol.lbfgs_armijo(cart_sfxn_network.parameters())

# Energies before minimization
cart_sfxn_network.whole_pose_scoring_module(cart_sfxn_network.full_coords)

def closure():
    optimizer.zero_grad()
    E = cart_sfxn_network().sum()
    E.backward()
    return E

optimizer.step(closure)

# Energies after minimization
cart_sfxn_network.whole_pose_scoring_module(cart_sfxn_network.full_coords)

# Write the minimized structure to disk
tmol.write_pose_stack_pdb(pose_stack, 'output.pdb')
```
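To check that the minimization made progress, the pose can be rescored against the saved starting coordinates. This is a sketch, assuming the minimizer writes the optimized coordinates back into `pose_stack.coords` (as the `write_pose_stack_pdb` call above suggests):

```python
# Compare total energies before and after minimization
E_before = scorer(start_coords).sum()
E_after = scorer(pose_stack.coords).sum()
print(f"total energy before: {E_before.item():.2f}, after: {E_after.item():.2f}")
```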
To set up `tmol`, run:

```bash
./dev_setup
```

To start using `tmol`, activate the conda environment with:

```bash
conda activate tmol
```
To use `tmol` within RosettaFold2, first install `tmol` into the RF2 conda environment:

```bash
# Activate your RF2 conda environment
conda install cuda -c nvidia
cd <your local tmol repository root directory>
pip install -e .
```
**Note:** This has been tested on Ubuntu 20.04; other platforms should work but are currently untested.
Example usage from within RosettaFold2:

```python
seq, xyz, chainlens = rosettafold2_model.infer(sequence)
pose_stack = tmol.pose_stack_from_rosettafold2(seq[0], xyz[0], chainlens[0])
xyz = tmol.pose_stack_to_rosettafold2( ... )
```
**Note:** Hydrogen and OXT coordinates from the terminal residues in RosettaFold2 are not preserved across the RF2<->tmol interface.
**Warning:** You must call `torch.set_grad_enabled(True)` if you wish to use the `tmol` minimizer, as RF2 disables gradients by default during inference.
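Putting these pieces together, here is a sketch of minimizing an RF2 prediction with gradients enabled, reusing the `closure` pattern from the standalone minimization example above:

```python
import torch
import tmol

torch.set_grad_enabled(True)  # RF2 disables gradients during inference

seq, xyz, chainlens = rosettafold2_model.infer(sequence)
pose_stack = tmol.pose_stack_from_rosettafold2(seq[0], xyz[0], chainlens[0])

sfxn = tmol.beta2016_score_function(pose_stack.device)
cart_sfxn_network = tmol.cart_sfxn_network(sfxn, pose_stack)
optimizer = tmol.lbfgs_armijo(cart_sfxn_network.parameters())

def closure():
    optimizer.zero_grad()
    E = cart_sfxn_network().sum()
    E.backward()
    return E

optimizer.step(closure)
```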
Full OpenFold documentation is coming soon.

```python
output = openfold_model.infer(sequences)
pose_stack = tmol.pose_stack_from_openfold(output)
```
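From there, the resulting PoseStack can be scored just like one built from a PDB file; a sketch reusing the calls from the standalone example:

```python
sfxn = tmol.beta2016_score_function(pose_stack.device)
scorer = sfxn.render_whole_pose_scoring_module(pose_stack)
print(scorer(pose_stack.coords))
```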
`tmol` uses Test-Driven Development. If you are writing `tmol` code, you should start by writing test cases for your code.
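For example, a new scoring feature might begin with a small pytest case along these lines (an illustrative sketch; the test name and the finiteness check are hypothetical, not taken from the tmol test suite):

```python
import torch
import tmol


def test_ubiquitin_energies_are_finite():
    # Build a PoseStack from a known structure and score it
    pose_stack = tmol.pose_stack_from_pdb('1ubq.pdb')
    sfxn = tmol.beta2016_score_function(pose_stack.device)
    scorer = sfxn.render_whole_pose_scoring_module(pose_stack)

    energies = scorer(pose_stack.coords)

    # A well-formed pose should produce finite energies
    assert torch.isfinite(energies).all()
```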
`tmol` uses pre-commit hooks to ensure uniform code formatting. These hooks run `clang-format` and `black`. If your code needs reformatting, the initial commit will fail and clang-format/black will reformat your code. You can see these changes via `git diff`, and you can `git add` the files to accept the new formatting before committing again.
All changes to master should be performed via pull request flow, with a PR serving as a core point of development, discussion, testing and review. We close pull requests via squash or rebase, so that master contains a tidy, linear project history.
A pull request should land as an "atomic" unit of work, representing a single set of related changes. A larger feature may span multiple pull requests, however each pull request should stand alone. If a request appears to be growing "too large" to review, utilize a stacked pull to partition the work.
We maintain an automated test suite executed via buildkite. The test suite must always be passing for master, and is available for any open branch via pull request. By default, the test suite will run on any PR.