
Contribution ideas

Aaron Virshup edited this page Oct 3, 2016 · 6 revisions

After a few requests, this page was created to provide some entrypoints into the MDT source code.

If you're interested in diving into the code, this is a very incomplete wishlist of small-to-medium-sized projects to help get you situated. NOTE: this list is not complete, even from our perspective, and we're always happy to discuss other ideas, especially if MDT can be enhanced to help with your own interests. Drop us a line: [email protected]

Science

  1. RNA navigation - MDT has the capability to traverse many DNA and amino acid polymers, using methods such as residue.next(), residue.is_c_terminal, chain.n_terminus, and chain.fiveprime_end. Additionally, the atoms in these polymers can be grouped via residue.backbone and residue.sidechain. We need the same capabilities for RNA!
    Relevant source files:
  2. Electronic analysis - MDT is able to both calculate and visualize wavefunction amplitudes; it should do the same for total electron density. From that point, it would be great to do the same for other electron-density-based information like Fukui functions and non-covalent-interaction surfaces.
    Relevant source files:
  • orbitals/basis.py - In particular, BasisSet.__call__ calculates basis function values (or wavefunction amplitude if given orbital coefficients)
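
As a starting point for the electron density item, the density on a grid can be built from the same ingredients as the orbital amplitudes: each orbital is a linear combination of basis functions, and the density is the occupation-weighted sum of squared orbital amplitudes. The sketch below uses plain NumPy arrays; the function names and array layouts are assumptions made for illustration and are not part of MDT's API.

```python
import numpy as np


def orbital_amplitudes(basis_values, coeffs):
    """Evaluate orbitals on a grid (hypothetical helper, not MDT code).

    basis_values: array (n_points, n_basis), basis function values per grid point
    coeffs:       array (n_orbitals, n_basis), orbital coefficients
    returns:      array (n_points, n_orbitals)
    """
    return np.asarray(basis_values) @ np.asarray(coeffs).T


def electron_density(basis_values, coeffs, occupations):
    """Total density: rho(r) = sum_i n_i * |psi_i(r)|^2."""
    psi = orbital_amplitudes(basis_values, coeffs)
    return (np.abs(psi) ** 2) @ np.asarray(occupations, dtype=float)


# Toy input: 3 grid points, 2 basis functions, 2 orbitals (illustrative values only)
basis_values = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [1.0, 1.0]])
coeffs = np.array([[1.0, 0.0],
                   [0.6, 0.8]])
rho = electron_density(basis_values, coeffs, occupations=[2.0, 2.0])
```

Keeping the orbital evaluation separate from the summation means the same amplitudes could feed both the existing orbital visualization and a new density calculation.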

Software engineering

  1. Run MDT on an academic cluster - right now, MDT will run jobs locally on your laptop or in the cloud. How can users take advantage of academic HPC clusters? The solution will likely involve adding new capabilities to Autodesk/py-cloud-compute-cannon, supporting Singularity containers, and choosing an external library (such as dask) that abstracts interactions with different job schedulers.
    Relevant source code:
  • pyccc/engines - definitions for the different scheduling systems we use to run jobs
  2. Interface architecture - how can we make it as easy as possible to integrate new methods into MDT? For instance, if someone wanted to run and analyze LAMMPS dynamics from MDT, what's the most intuitive, least-effort way of doing so? One possibility involves using CookieCutter GitHub repositories as templates.
    Example source code:
  • interfaces/openbabel.py - simple interface to a Python utility (that doesn't require the tool to be installed locally)
  • interfaces/ambertools.py - simple interface to a collection of command line utilities in a Docker container
  • models/openmm.py - a more involved interface for calculating molecular properties (mostly potential energies)
  3. Making sure the methods work - MDT has unit- and end-to-end-testing for much of its functionality, but we need tests to make sure external methods are integrated and working correctly. This means testing whether an RHF/STO-3G calculation returns the "right" result and whether our ForceField packages all return the same energy for a given forcefield and geometry.
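
One way to frame the forcefield consistency check is as a pairwise comparison of energies reported by different backends for the same forcefield and geometry. The sketch below is a minimal version under stated assumptions: the helper name, the backend labels, and the tolerances are hypothetical, and the numbers are placeholders rather than real calculation results.

```python
import math


def energy_mismatches(energies, rel_tol=1e-6, abs_tol=1e-8):
    """Return backend pairs whose energies disagree (hypothetical helper).

    energies: dict mapping a backend name to its energy, all in the same units
    returns:  list of (backend_a, backend_b, difference) tuples
    """
    names = sorted(energies)
    mismatches = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if not math.isclose(energies[first], energies[second],
                                rel_tol=rel_tol, abs_tol=abs_tol):
                mismatches.append(
                    (first, second, energies[first] - energies[second]))
    return mismatches


# Placeholder values only; a real test would obtain these from each package
example = {
    "backend_a": -12.345678,
    "backend_b": -12.345678 + 1e-9,
    "backend_c": -12.3,
}
bad_pairs = energy_mismatches(example)
```

A test could then assert that the returned list is empty, and the listed pairs identify which integration drifted when it is not. Suitable tolerances would need to be chosen per method, since packages can differ in cutoffs and unit conversions.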

Integrations

These are methods and pieces of software that should probably be pulled in from a mature package, rather than re-implementing them. It's a high priority for us to make integrating new software as simple as possible, but for now it's somewhat involved. Most integrations will require:

Integrating new computation methods will always be a big focus:

  1. QM laundry list: semiempirical methods; support density fitting; allow users to create and configure hybrid density functionals; MCSCF with gradients; coupled cluster methods
  • models/pyscf.py - PySCF interface (note: this interface is a mess and needs refactoring, which would be another very helpful contribution)
  2. MM laundry list - CHARMM forcefields, OPLS forcefields, AMOEBA forcefields, QM/MM
    Examples: