This program compares molecular dynamics methods used to compute MMPBSA values for a set of protein complexes. The included sample data come from two sets of molecular dynamics simulations: one run with 'Standard' molecular dynamics protocols at a timestep of 2 fs, and one run in the same way but with hydrogen mass repartitioning ('HMR'). Both sets are compared to the experimentally obtained values. The program can be adapted to compare other methods and measurements beyond molecular dynamics and MMPBSA.
To run, you will need to have Python installed, including the following necessary modules:
- numpy
- pandas
- matplotlib
- sklearn
- scipy
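A quick way to confirm the modules above are importable before opening the notebook is sketched below. This is not part of the repository, and the helper name `missing_modules` is hypothetical. Note that `sklearn` is the import name of the package distributed as scikit-learn.

```python
from importlib.util import find_spec

# Import names listed above; "sklearn" is installed as the scikit-learn package.
REQUIRED = ["numpy", "pandas", "matplotlib", "sklearn", "scipy"]


def missing_modules(names):
    """Return the top-level module names that cannot be located (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```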
The code is written in a Jupyter Notebook, which can be run through your preferred development environment. Alternatively, you can install Jupyter using pip or conda and run it from the command line using the following command:
jupyter nbconvert --execute Simulation_Analysis.ipynb --to notebook --output Simulation_Analysis.ipynb

Edit the output file name if you do not want to overwrite the original.
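If you prefer to launch the run from a Python script, the same command can be assembled with a separate output name so the original notebook is left untouched. This is a minimal sketch: the function name `nbconvert_command` and the output name `Simulation_Analysis_executed.ipynb` are assumptions, not names used by the repository.

```python
import shlex


def nbconvert_command(notebook, output):
    """Build the nbconvert command line with a chosen output name (hypothetical helper)."""
    return [
        "jupyter", "nbconvert", "--execute", notebook,
        "--to", "notebook", "--output", output,
    ]


# Hypothetical output name that differs from the input, so nothing is overwritten.
cmd = nbconvert_command("Simulation_Analysis.ipynb", "Simulation_Analysis_executed.ipynb")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once Jupyter is installed.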
To run:
- Download the Simulation_Analysis.ipynb notebook as well as all your raw data/sample datasets from this repository into a single directory. The sample data includes HMR_MMPBSA.csv, Standard_MMPBSA.csv and EXP_data.csv.
- Install the necessary Python modules and Jupyter if needed.
- Edit file and data names, plot axis and title as needed. The notebook is currently set up to process the sample data provided. Note that if using raw data that does not follow the formatting conventions of the sample data, the data processing function will need to be edited accordingly. If testing more than two methods, edit the subplot number and add plot commands as needed.
- Run the notebook within your development environment or by using the command provided above.
- The first cell imports all relevant Python modules.
- The second cell collects and organizes the datasets into one combined DataFrame.
- The third cell plots the MMPBSA values for each computational method against the experimental values, with a linear regression fit.
- The fourth cell calculates statistical metrics for both computational datasets.
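The combine, regress and score steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: the column names (`complex`, `dG`), the helper names, and the choice of metrics (slope, intercept, Pearson r, MAE, RMSE) are all hypothetical, and the numbers are synthetic placeholders rather than values from the sample CSV files.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Synthetic placeholder values, NOT taken from the sample data.
# Column names ("complex", "dG") are assumptions about the file layout.
exp = pd.DataFrame({"complex": ["A", "B", "C", "D"], "dG": [-10.0, -8.0, -6.0, -4.0]})
standard = pd.DataFrame({"complex": ["A", "B", "C", "D"], "dG": [-11.0, -8.5, -6.5, -3.0]})
hmr = pd.DataFrame({"complex": ["A", "B", "C", "D"], "dG": [-9.5, -8.5, -5.0, -4.5]})


def combine(exp, methods):
    """Join each method's values onto the experimental values by complex name."""
    combined = exp.rename(columns={"dG": "EXP"})
    for name, frame in methods.items():
        renamed = frame.rename(columns={"dG": name})
        combined = combined.merge(renamed, on="complex", how="inner")
    return combined


def metrics(combined, method):
    """Linear regression and error metrics of one method against experiment."""
    x, y = combined["EXP"], combined[method]
    fit = stats.linregress(x, y)
    return {
        "method": method,
        "slope": fit.slope,
        "intercept": fit.intercept,
        "r": fit.rvalue,
        "MAE": mean_absolute_error(x, y),
        "RMSE": float(np.sqrt(mean_squared_error(x, y))),
    }


combined = combine(exp, {"Standard": standard, "HMR": hmr})
summary = pd.DataFrame([metrics(combined, m) for m in ["Standard", "HMR"]])
print(combined)
print(summary.round(3))
```

Keeping the methods in a dictionary means a third method only requires one more entry rather than a new block of code.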
- The combined data is found in MMPBSA_avg.csv.
- Visualization of the computational vs. experimental data is found in MMPBSA_plot.png.
- Statistical analysis metrics are found in MMPBSA_metrics.csv.
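When comparing more than two methods, one option is to derive the number of subplots from the number of methods instead of editing it by hand. The sketch below assumes a side-by-side layout and uses synthetic placeholder values. The third method, the axis labels and the file name `example_plot.png` are hypothetical, and the notebook's own plotting code may be organized differently.

```python
import matplotlib
matplotlib.use("Agg")  # render to file without needing a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic placeholder values, NOT taken from the sample data.
exp = np.array([-10.0, -8.0, -6.0, -4.0])
methods = {
    "Standard": np.array([-11.0, -8.5, -6.5, -3.0]),
    "HMR": np.array([-9.5, -8.5, -5.0, -4.5]),
    "Third method": np.array([-10.5, -7.0, -6.0, -4.5]),  # hypothetical extra method
}

# One panel per method, so adding a method needs no layout changes.
n = len(methods)
fig, axes = plt.subplots(1, n, figsize=(4 * n, 4), squeeze=False)
for ax, (name, calc) in zip(axes[0], methods.items()):
    slope, intercept = np.polyfit(exp, calc, 1)  # least-squares straight line
    ax.scatter(exp, calc)
    ax.plot(exp, slope * exp + intercept)
    ax.set_title(name)
    ax.set_xlabel("Experimental")
    ax.set_ylabel("Calculated")
fig.tight_layout()
fig.savefig("example_plot.png", dpi=150)
```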