(GERBIL) GEnerative Restricted Boltzmann machines of protein famILies

GERBIL is a tool for generative modeling of protein sequence families. It is built upon the deepnet package, created by Nitish Srivastava.

The models are described in thesis.pdf.

Currently the following generative models are supported:

Restricted Boltzmann Machines
Deep Boltzmann Machines

NOTE: GERBIL is currently under development. Usage should be considered experimental.

See INSTALL.txt for installation instructions.

Alternatively, you can run GERBIL on AWS using the GERBIL AMI (see "Running on AWS" below).

Running on AWS

We provide a pre-built AMI with all libraries and paths already set up.

Create a g2.2xlarge instance using the GERBIL-specific AMI ami-d18ab9e1.
Then pull the latest GERBIL version:
```
$ cd gerbil/ && git pull origin master
```

Getting Started

Export the following variables. Relative paths in the commands below start from the repository root (GERBILPATH):

$ export GERBILPATH=/path/to/gerbil
$ export PYTHONPATH=${GERBILPATH}:${PYTHONPATH}

First, create the datasets. This example uses the PF00240 family:

$ cd deepnet/datasets
$ wget http://gremlin.bakerlab.org/fasta_2013/PF00240.fas -P /tmp/
$ python create_dataset.py --fastafile /tmp/PF00240.fas --familyname PF00240
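
For orientation, the sketch below shows one common way to turn aligned protein sequences from a FASTA file into one-hot vectors that could serve as the visible units of an RBM. It is an illustration only: the alphabet, the gap handling, and the function names `read_fasta` and `one_hot` are assumptions, and create_dataset.py may encode the data differently.

```python
import io

# Assumed alphabet: the 20 standard amino acids plus a gap symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def one_hot(sequence):
    """Encode a sequence as a flat 0/1 list with len(ALPHABET) entries per position.

    Characters outside the assumed alphabet are mapped to the gap symbol.
    """
    vector = []
    for aa in sequence.upper():
        column = [0] * len(ALPHABET)
        column[INDEX.get(aa, INDEX["-"])] = 1
        vector.extend(column)
    return vector


# Toy input standing in for an alignment such as /tmp/PF00240.fas.
toy = io.StringIO(">seq1\nMQI\n>seq2\nM-I\n")
records = list(read_fasta(toy))
encoded = [one_hot(seq) for _, seq in records]
```

Each position contributes exactly one active unit, so a sequence of length L becomes a vector of length L times the alphabet size.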

Restricted Boltzmann Machine (RBM)

To see the full range of available options (run from deepnet/experiments):
```
$ python generate_experiments.py -h
```

Create a new experiment for learning the parameters of the RBM:

$ cd deepnet/experiments
$ python generate_experiments.py --start_job_id 1 \
    --model rbm --data_dir PF00240

Learn the parameters of the model:
```
$ cd exp1 && ./runall.sh
```
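
As background on what this training step learns, the sketch below runs contrastive-divergence (CD-1) updates on a tiny binary RBM in plain Python. It is not the deepnet implementation and makes simplifying assumptions: binary visible units rather than one-hot amino-acid blocks, a fixed learning rate, and hypothetical names such as `TinyRBM` and `cd1_update`.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class TinyRBM:
    """Binary RBM with weights W[i][j], visible biases a[i], hidden biases b[j]."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.a = [0.0] * n_visible
        self.b = [0.0] * n_hidden

    def hidden_probs(self, v):
        # p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W_ij)
        return [sigmoid(self.b[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b))]

    def visible_probs(self, h):
        # p(v_i = 1 | h) = sigmoid(a_i + sum_j W_ij * h_j)
        return [sigmoid(self.a[i] + sum(self.W[i][j] * h[j] for j in range(len(h))))
                for i in range(len(self.a))]

    def sample(self, probs):
        return [1 if self.rng.random() < p else 0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 step: positive phase on the data, negative phase after one Gibbs step."""
        ph0 = self.hidden_probs(v0)
        h0 = self.sample(ph0)
        v1 = self.sample(self.visible_probs(h0))
        ph1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(ph0)):
                self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
            self.a[i] += lr * (v0[i] - v1[i])
        for j in range(len(ph0)):
            self.b[j] += lr * (ph0[j] - ph1[j])


# Train on two toy patterns, then compute a mean-field reconstruction of the first one.
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
rbm = TinyRBM(n_visible=4, n_hidden=2)
for _ in range(2000):
    for v in data:
        rbm.cd1_update(v)
recon = rbm.visible_probs(rbm.hidden_probs(data[0]))
```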

Calculate the imputation error. The job file is written to ./run_in_parallel.sh:

$ cd deepnet/
$ python impute_parallel_run.py --start_expid 1 --end_expid 1 --model_prefix rbm
$ ./run_in_parallel.sh
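
To make the metric concrete, the sketch below computes a simple leave-one-position-out imputation error: each residue is hidden in turn, predicted from the rest of the sequence, and the fraction of wrong predictions is reported. This definition is an assumption made for illustration (thesis.pdf describes the models and should be consulted for the metric GERBIL actually uses), and the predictor here is a column-frequency baseline standing in for a trained RBM. The names `imputation_error`, `make_column_mode_predictor`, and `MASK` are hypothetical.

```python
from collections import Counter

MASK = "?"  # hypothetical placeholder for a hidden residue


def imputation_error(sequences, predict):
    """Fraction of (sequence, position) pairs where the hidden residue is mispredicted."""
    wrong = total = 0
    for seq in sequences:
        for pos, true_residue in enumerate(seq):
            masked = seq[:pos] + MASK + seq[pos + 1:]
            if predict(masked, pos) != true_residue:
                wrong += 1
            total += 1
    return wrong / total if total else 0.0


def make_column_mode_predictor(training_sequences):
    """Baseline stand-in for a trained model: predict each column's most common residue."""
    modes = [Counter(column).most_common(1)[0][0]
             for column in zip(*training_sequences)]
    return lambda masked, pos: modes[pos]


# Toy aligned sequences; real inputs would come from the dataset created earlier.
train = ["MQIF", "MQIF", "MKIY"]
test = ["MQIF", "MKIF"]
error = imputation_error(test, make_column_mode_predictor(train))
```

A trained model would replace the baseline by passing a `predict` function that scores each candidate residue given the unmasked positions.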

Create a CSV table with the results:

$ python create_results_csv.py --expid 1
$ less results/imperr_exp1.csv
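
The results table can also be inspected programmatically. The sketch below picks the row with the lowest error from a CSV file using only the standard library. The column names `model` and `imputation_error` are hypothetical, since the layout of results/imperr_exp1.csv is not documented here, and the demonstration writes its own temporary file with made-up values.

```python
import csv
import os
import tempfile


def lowest_error_row(csv_path, error_column):
    """Return the row (as a dict) with the smallest value in error_column."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    return min(rows, key=lambda row: float(row[error_column]))


# Demonstration on a temporary file with made-up values and hypothetical columns.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "imperr_demo.csv")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["model", "imputation_error"])
        writer.writerow(["model_a", "0.42"])
        writer.writerow(["model_b", "0.37"])
    best = lowest_error_row(path, "imputation_error")
```

For a real run, adjust `error_column` to match the header actually written by create_results_csv.py.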

Deep Boltzmann Machine (DBM)

A DBM needs an RBM to warm-start from. Continuing from exp1, we create a second experiment, exp2, to train a DBM.

Create a new experiment:

$ cd deepnet/experiments
$ python generate_experiments.py --start_job_id 2 \
    --model dbm --data_dir PF00240

Copy the best RBM model from exp1 into exp2:

$ python choose_best_model.py --impute_dir likelihoods/exp1/ \
    --model_dir exp1/dbm_models/ --output_dir exp2/dbm_models

Learn the parameters of the model:
```
$ cd exp2 && ./runall.sh
```

Calculate the imputation error. The job file is written to ./run_in_parallel.sh:

$ cd deepnet/
$ python impute_parallel_run.py --start_expid 2 --end_expid 2 --model_prefix dbm
$ ./run_in_parallel.sh

Create a CSV table with the results:

$ python create_results_csv.py --expid 2
$ less results/imperr_exp2.csv

Extracting params from protocol buffers to .MAT files

Find the best model from an experiment. With --print_only, the path of the best model is printed:

$ cd deepnet/experiments
$ python choose_best_model.py --impute_dir likelihoods/exp1/ \
    --model_dir exp1/dbm_models/ --print_only
exp1/dbm_models/rbm1_1430616201

Extract the parameters from the best model:

$ cd deepnet/
$ python write_model_to_mat.py \
    experiments/exp1/dbm_models/rbm1_1430616201 \
    /path/to/my/folder/rbm1_1430616201.mat
