A comprehensive molecular dynamics simulation pipeline for protein-ligand systems, built with OpenMM and designed for high-throughput screening and detailed biomolecular analysis. This pipeline supports multi-stage MD simulations with advanced bond preservation, checkpoint recovery, and SLURM cluster integration.
- Multi-stage MD simulation pipeline: Warmup → Backbone restraint removal → NVT → NPT → Production
- Checkpoint-based recovery: Resume interrupted simulations from any stage
- SLURM cluster integration: High-throughput batch processing capabilities
- Multiple file format support: PDB, CIF, SDF, MOL2 with proper bond handling
- Comprehensive reporting: Forces, trajectories, thermodynamic data, and Hessians
- GPU acceleration: CUDA and OpenCL platform support
- Flexible force fields: AMBER, GAFF, OpenFF with customizable parameters
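As an illustration of how checkpoint-based recovery across the stages could be reasoned about, here is a minimal sketch. It assumes the stage identifiers used in the configuration section of this README; the helper `next_stage` is hypothetical and is not part of the pipeline's API, and how the pipeline itself records finished stages is not described here.

```python
from typing import Iterable, Optional

# Stage order as listed in this README; the identifiers follow the
# stage names used in the configuration section.
STAGES = ["warmup", "backbone_removal", "nvt", "npt", "production"]


def next_stage(completed: Iterable[str]) -> Optional[str]:
    """Return the first stage that has not finished, or None if all are done.

    Hypothetical helper: a stage is only runnable once every earlier
    stage is complete, so a gap forces a restart from that gap.
    """
    done = set(completed)
    unknown = done - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stage(s): {sorted(unknown)}")
    for stage in STAGES:
        if stage not in done:
            return stage
    return None
```

For example, `next_stage(["warmup", "backbone_removal"])` selects `"nvt"` as the stage to resume from.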
- Python: 3.7-3.12 (recommended: 3.11)
- CUDA: For GPU acceleration (optional but recommended)
- Git: For installation from source
- Conda/Mamba: For environment management
- Clone the repository:
    git clone https://github.com/maciejwisniewski-drugdiscovery/MolecularDynamicsPipeline.git
    cd MolecularDynamicsPipeline
- Create the conda environment from the YAML file:
    conda env create -f environment.yml
    conda activate molecular_dynamics_pipeline
- Install the package in development mode:
    pip install -e .
Test your installation:
    # Quick test
    python -c "import molecular_dynamics_pipeline; print('Installation successful!')"
The validation script will check:
- Python version compatibility
- All required dependencies
- GPU/CUDA support
- Basic OpenMM functionality
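The checks listed above can be approximated by hand. The sketch below is not the repository's validation script; the function names are hypothetical, and the supported Python range is taken from the requirements in this README. It only uses OpenMM calls if OpenMM is installed.

```python
import importlib.util
import sys


def python_version_supported(version=None, minimum=(3, 7), maximum=(3, 12)):
    """True if the (major, minor) version lies in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return minimum <= major_minor <= maximum


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def available_openmm_platforms():
    """List OpenMM platform names, or an empty list if OpenMM is absent."""
    if importlib.util.find_spec("openmm") is None:
        return []
    import openmm
    return [
        openmm.Platform.getPlatform(i).getName()
        for i in range(openmm.Platform.getNumPlatforms())
    ]
```

If `"CUDA"` does not appear in the result of `available_openmm_platforms()`, GPU acceleration through CUDA is unlikely to be usable in that environment.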
plinder_dynamics/
├── config/                                   # Configuration templates
│   ├── plinder_parameters_bound.yaml         # Bound state simulations
│   ├── plinder_parameters_unbound.yaml       # Unbound state simulations
│   ├── plinder_parameters_metadynamics.yaml  # Enhanced sampling
│   ├── misato_parameters.yaml                # MISATO dataset configs
│   └── simulation_parameters.yaml            # Base parameters
├── scripts/                                  # Execution scripts
│   ├── run_simulation.py                     # Main simulation runner
│   ├── plinder_scripts/                      # PLINDER-specific scripts
│   └── misato_scripts/                       # MISATO-specific scripts
├── src/dynamics_pipeline/                    # Core pipeline modules
│   ├── simulation/                           # MD simulation engine
│   ├── data/                                 # Data handling and processing
│   └── utils/                                # Utilities and helpers
├── environment.yml                           # Conda environment specification
└── setup.py                                  # Package installation
The pipeline uses YAML configuration files with the following sections:
info:
  system_id: "1abc_ligand_123"        # Unique system identifier
  simulation_id: "bound_state_md"     # Simulation identifier
  use_plinder_index: true             # Use PLINDER database integration
  bound_state: true                   # Bound vs unbound simulation

paths:
  raw_protein_files:
    - "path/to/protein.pdb"           # Protein structure files
  raw_ligand_files:
    - "path/to/ligand.sdf"            # Ligand structure files
  output_dir: "path/to/output"        # Output directory

preprocessing:
  process_protein: true               # Clean protein with PDBFixer
  process_ligand: true                # Process ligand with OpenFF
  add_solvent: true                   # Add explicit solvent
  ionic_strength: 0.15                # Salt concentration (M)
  box_padding: 1.0                    # Solvent box padding (nm)

forcefield:
  proteinFF: "amber14-all.xml"        # Protein force field
  nucleicFF: "amber14/DNA.OL15.xml"   # Nucleic acid force field
  ligandFF: "gaff-2.11"               # Ligand force field (gaff-2.11, openff-2.0.0)
  waterFF: "amber14/tip3pfb.xml"      # Water model
  water_model: "tip3p"                # Water model name
  forcefield_kwargs:                  # Additional FF parameters
    rigidWater: true
    removeCMMotion: false
    hydrogenMass: 1.5                 # Hydrogen mass repartitioning

Force Field Options:
- Protein: amber14-all.xml, amber14/protein.ff14SB.xml, amber99sbildn.xml
- Ligand: gaff-2.11, openff-2.0.0, openff-2.1.0
- Water: tip3p, tip4pew, spce
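Before launching a run it can help to check a configuration file for the sections shown above. The following is a minimal sketch, not the pipeline's own validation logic: the function names are hypothetical, the choice of which sections and keys to treat as required is an assumption based on the example, and `load_config` needs PyYAML.

```python
# Sections shown in the example configuration; treating all of them as
# required is an assumption made for this sketch.
REQUIRED_SECTIONS = ("info", "paths", "preprocessing", "forcefield")


def load_config(path):
    """Read a YAML configuration file into a dict (requires PyYAML)."""
    import yaml  # imported lazily so the checker below has no dependencies
    with open(path, "r", encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}


def config_problems(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name not in config
    ]
    info = config.get("info") or {}
    for key in ("system_id", "simulation_id"):
        if not info.get(key):
            problems.append(f"info.{key} is not set")
    paths = config.get("paths") or {}
    if not paths.get("output_dir"):
        problems.append("paths.output_dir is not set")
    return problems
```

A typical use would be `config_problems(load_config("my_simulation.yaml"))`, printing each returned message before submitting the job.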
Platform Configuration:
simulation_params:
  platform:
    type: "CUDA"                      # Platform: CUDA, OpenCL, CPU
    devices: "0"                      # GPU device indices
  backbone_restraint_force: 100.0     # Backbone restraint (kcal/mol/Å²)
  save_forces: true                   # Save force data
  save_hessian: false                 # Save Hessian matrices

Stage-Specific Parameters:
Each simulation stage (warmup, backbone_removal, nvt, npt, production) supports:
warmup:
  init_temp: 50.0                     # Initial temperature (K)
  final_temp: 300.0                   # Final temperature (K)
  friction: 1.0                       # Langevin friction (ps⁻¹)
  time_step: 2.0                      # Integration timestep (fs)
  heating_step: 100                   # Steps per 1K temperature increase
  checkpoint_interval: 1000           # Checkpoint frequency
  trajectory_interval: 1000           # Trajectory save frequency
  state_data_reporter_interval: 1000  # State data frequency
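To see what the warmup values imply, the length of the heating ramp can be worked out from the parameters. This sketch assumes the ramp runs exactly `heating_step` integration steps for each 1 K increase, with no extra equilibration at the end; the function name is hypothetical.

```python
def warmup_length(init_temp, final_temp, heating_step, time_step_fs):
    """Return (total_steps, simulated_time_ps) for a linear 1 K ramp."""
    degrees = int(round(final_temp - init_temp))
    if degrees < 0:
        raise ValueError("final_temp must not be below init_temp")
    total_steps = degrees * heating_step
    # femtoseconds to picoseconds
    return total_steps, total_steps * time_step_fs / 1000.0


steps, time_ps = warmup_length(50.0, 300.0, 100, 2.0)
print(steps, time_ps)  # 25000 50.0
```

With the example values, heating from 50 K to 300 K takes 25,000 steps, or 50 ps at a 2 fs timestep, so an interval of 1000 would produce 25 checkpoints during warmup.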
- Copy a template:
    cp config/plinder_parameters_bound.yaml my_simulation.yaml
- Edit required fields:
  - Set system_id and simulation_id
  - Update file paths in the paths section
  - Adjust simulation parameters as needed
- Validate configuration:
    python scripts/run_simulation.py --config my_simulation.yaml --validate-only
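When several configuration files need checking, the validation command can be scripted. The sketch below only uses the `--config` and `--validate-only` flags shown in this README; the function names are hypothetical, and it assumes the script exits with a non-zero status when a configuration is invalid, which this README does not state.

```python
import subprocess
import sys


def validation_command(config_path, script="scripts/run_simulation.py"):
    """Build the argument list for a validate-only run of one config file."""
    return [sys.executable, script, "--config", str(config_path), "--validate-only"]


def validate_all(config_paths):
    """Run validation for each config; return the paths whose run failed.

    Assumes a non-zero exit status signals an invalid configuration.
    """
    failed = []
    for path in config_paths:
        result = subprocess.run(validation_command(path))
        if result.returncode != 0:
            failed.append(path)
    return failed
```

Run from the repository root, `validate_all(["a.yaml", "b.yaml"])` would return the subset of files that did not validate.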
Single simulation:
    python scripts/run_simulation.py --config config/my_simulation.yaml

With custom output directory:
    python scripts/run_simulation.py \
        --config config/my_simulation.yaml \
        --output-dir /path/to/output

Validation mode (check config without running):
    python scripts/run_simulation.py \
        --config config/my_simulation.yaml \
        --validate-only

Verbose logging:
    python scripts/run_simulation.py \
        --config config/my_simulation.yaml \
        --log-level DEBUG

For PLINDER database systems:
    python scripts/plinder_scripts/run_single_plinder_simulation.py \
        --plinder_id "1abc__1.00__ligand_113" \
        --config config/plinder_parameters_bound.yaml \
        --output-dir /path/to/output

output_directory/
├── forcefields/                                  # Ligand topology with bonds
│   ├── {ligand_name}_topology.sdf                # SDF format with bonds
│   ├── {ligand_name}_topology.mol2               # MOL2 format with bonds
│   └── {ligand_name}_info.yaml                   # Ligand metadata
├── trajectories/                                 # Simulation trajectories
│   ├── {system_id}_warmup_trajectory.npz         # NPZ trajectory data
│   ├── {system_id}_nvt_trajectory.npz            # NPZ trajectory data
│   └── {system_id}_production_trajectory.npz     # NPZ trajectory data
├── checkpoints/                                  # Checkpoint files for recovery
│   ├── {system_id}_warmup_checkpoint.dcd
│   └── {system_id}_production_checkpoint.dcd
├── state_data_reporters/                         # Thermodynamic data
│   ├── {system_id}_warmup_state_data.csv
│   └── {system_id}_production_state_data.csv
├── states/                                       # XML state files
├── topologies/                                   # Structure files with bonds
│   ├── {system_id}_warmup_topology.cif
│   └── {system_id}_production_topology.cif
├── forces/                                       # Force data (if enabled)
│   └── {system_id}_production_forces.npy
├── hessians/                                     # Hessian matrices (if enabled)
│   └── {system_id}_production_hessian.npy
└── {system_id}_init_complex.cif                  # Initial system structure
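The naming pattern in the tree above makes it straightforward to locate the files for one stage after a run. This is a minimal sketch with hypothetical helper names; it assumes every stage follows the `{system_id}_{stage}_...` pattern, although the tree only shows some stages for each directory.

```python
from pathlib import Path


def stage_outputs(output_dir, system_id, stage):
    """Map output kinds to the paths the layout above implies for one stage."""
    root = Path(output_dir)
    prefix = f"{system_id}_{stage}"
    return {
        "trajectory": root / "trajectories" / f"{prefix}_trajectory.npz",
        "checkpoint": root / "checkpoints" / f"{prefix}_checkpoint.dcd",
        "state_data": root / "state_data_reporters" / f"{prefix}_state_data.csv",
        "topology": root / "topologies" / f"{prefix}_topology.cif",
    }


def missing_outputs(output_dir, system_id, stage):
    """Return the kinds of output whose file does not exist yet."""
    paths = stage_outputs(output_dir, system_id, stage)
    return sorted(kind for kind, path in paths.items() if not path.exists())
```

Calling `missing_outputs("/path/to/output", "1abc_ligand_123", "production")` gives a quick view of which production files are still absent, for instance after an interrupted job.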
For questions and support:
- Issues: GitHub Issues
- Email: [email protected]