44 changes: 22 additions & 22 deletions README.md
@@ -21,7 +21,7 @@
- [Contributing to Development](#contributing-to-development)


-##Introduction
+## Introduction

Stateline is a framework for distributed Markov Chain Monte Carlo (MCMC) sampling written in C++. It implements random walk Metropolis-Hastings with parallel tempering to improve chain mixing, provides an adaptive proposal distribution to speed up convergence, and allows the user to factorise their likelihoods (e.g. over sensors or data). For a brief introduction to these concepts, see the [MCMC Sampling primer](#primer-mcmc-sampling) below.

@@ -79,7 +79,7 @@ Here Ti is the temperature of the i’th temperature chain, 𝛷 is the target d
Chain convergence can be inferred by independently running multiple MCMC chains (stacks) and comparing their statistical measures. If the chains are exploring a different set of modes, this can be detected. Otherwise we must assume they are adequately mixing, although there is a possibility that all the chains have failed to discover a mode (parallel tempering reduces the probability of this happening). Stateline employs the approach of [Brooks98](#references).
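
As a rough illustration of comparing independent chains, the sketch below computes the classic univariate potential scale reduction factor from between-chain and within-chain variances. It is only a simplified stand-in for this kind of diagnostic, not Stateline's implementation, and the test Stateline applies may differ in detail. The function name `potential_scale_reduction` and the synthetic chains are assumptions made for the example.

```python
import math
import random
import statistics


def potential_scale_reduction(chains):
    """Univariate potential scale reduction factor for equal-length chains."""
    n = len(chains[0])  # samples per chain
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance(chain_means)
    # Pooled estimate of the target variance.
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)


rng = random.Random(0)
# Four chains sampling the same mode.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Four chains stuck in two different modes (hypothetical failure case).
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (-5.0, 5.0, -5.0, 5.0)]

print("mixed:", potential_scale_reduction(mixed))
print("stuck:", potential_scale_reduction(stuck))
```

A value close to 1 is consistent with the chains agreeing, while a value well above 1 indicates they are exploring different regions. As the text notes, agreement alone cannot rule out a mode that every chain has missed.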


-###Why Stateline
+### Why Stateline

Stateline is designed specifically for difficult inference problems in computational science. We assume that a target distribution may be highly non-Gaussian, that the data we are conditioning on is highly non-linearly related to the model parameters, and that the observation models might be expensive ‘black box’ functions such as the solutions to numerical simulations. Numerous innovative technical capabilities have been incorporated into the Stateline codebase, specifically to improve usability and functionality in scientific applications:

@@ -140,7 +140,7 @@ Detailed logging is available (even when the system is deployed on a cluster):
Finally, Stateline's [output](#mcmc-output) is provided in csv format, so it is simple to load and analyse. The output is written in intermediate steps in case of early termination.


-##System Requirements
+## System Requirements

Stateline has been successfully compiled on Linux and OSX machines. We don't currently support Windows. For large-scale deployments, we recommend using Docker (and the Dockerfile included in this repo).

@@ -163,7 +163,7 @@ To run the python demos, you will also need:
* numpy
* corner-plot (python library)

-##Installation
+## Installation

First clone the repository and create a directory in which to build it:

@@ -205,9 +205,9 @@ $ make install

which will output headers, libraries and binaries into an `install` subdirectory of the build directory. From there you may copy them to the appropriate folders in your operating system.

-##Getting Started
+## Getting Started

-###Configuration
+### Configuration

Stateline is configured through a json file. An example file is given below:

@@ -252,7 +252,7 @@ Stateline is configured through a json file. An example file is given below:

`loggingRateSec`: The number of seconds between logging the state of the MCMC. Faster logging looks good in standard out, slower logging will save you disk space if you're redirecting to a file.

-###C++ Example
+### C++ Example


The following code gives a minimal example of building a stateline
@@ -331,7 +331,7 @@ Then in a new terminal, run one or more workers:
$ ./demo-worker
```

-###Python Example
+### Python Example

The following code gives a close-to-minimal example of building a stateline
worker with a custom likelihood in Python.
@@ -408,12 +408,12 @@ Then, in another terminal, run one or more workers:
$ python ./demo-worker.py
```

-###Other Languages
+### Other Languages

For details of implementing workers for other languages, see [Workers in Other Languages](#workers-in-other-languages).


-##Interpreting Logging
+## Interpreting Logging

While stateline is running, a table of diagnostic values is printed to the console. For cluster deployments, this output is to stdout, and can be piped over ssh using ncat (the Clusterous demo provides an example of how to do this). The table will look something like the demo's output below:

@@ -458,7 +458,7 @@ Use this as a diagnostic to ensure that a chain is achieving an effective rate (

##### Beta

-Beta is the inverse temperature. Specifically, the chain with a particular Beta `sees' the probability distribution raised to the power of Beta, making the distribution increasingly uniform as it approaches 0. Like Sigma, the Beta values are generated per-tier, but only updated on a swap allowing them to be slightly different at any given time to their equivalent chains in other stacks. Beta is adapted as a strictly decreasing ladder, with the base chains at a constant 1.0, targeting a desired swap rate (0.4 in this case).
+Beta is the inverse temperature. Specifically, the chain with a particular Beta 'sees' the probability distribution raised to the power of Beta, making the distribution increasingly uniform as it approaches 0. Like Sigma, the Beta values are generated per-tier, but only updated on a swap allowing them to be slightly different at any given time to their equivalent chains in other stacks. Beta is adapted as a strictly decreasing ladder, with the base chains at a constant 1.0, targeting a desired swap rate (0.4 in this case).
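
To make the effect of Beta concrete, here is a small sketch that raises a toy discrete distribution to the power of Beta and renormalises it. The function `tempered` and the three-state `target` are hypothetical and exist only for this illustration; Stateline itself works with user-supplied likelihoods rather than explicit probability tables.

```python
def tempered(probs, beta):
    """Raise a discrete distribution to the power beta and renormalise."""
    powered = [p ** beta for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


# A sharply peaked toy target distribution (assumed for illustration).
target = [0.90, 0.09, 0.01]

for beta in (1.0, 0.5, 0.1, 0.0):
    print(beta, [round(p, 3) for p in tempered(target, beta)])
```

At Beta = 1 the chain sees the target unchanged, and as Beta falls towards 0 the probabilities flatten until every state is equally likely, which is why high-temperature chains move between modes more easily.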

##### SwapRt, GlbSwapRt

@@ -470,7 +470,7 @@ The convergence test of [Brooks98](#references) is applied between stacks when po



-##MCMC Output
+## MCMC Output

Stateline outputs raw states in CSV format without removing any for burn-in or
decorrelation. The format of the csv is as follows
@@ -502,7 +502,7 @@ will launch a Python script that visualises the samples of the first chain. You'
Viewing the raw histograms of the parameters is informative for a low dimensional problem like this demo.


-##Cluster Deployment
+## Cluster Deployment

Stateline is designed to take advantage of many computers performing likelihood evaluations in parallel. The idea is to run a server on a single machine and many workers communicating with the server over TCP. Workers can be ephemeral -- if a worker disappears mid-job, that job will be reassigned to another worker by the server (after a few seconds). At the moment the server does not support recovering from early termination, so place it on a reliable machine if possible. The server also needs at least 2 cores to work effectively, so provision it with decent hardware.

@@ -512,7 +512,7 @@ There is a Dockerfile ready to go which has both the server and the worker
built. Feel free to use this as a base image when deploying your code.


-##Tips and Tricks
+## Tips and Tricks

This section addresses some common questions about configuring and using
Stateline for a scientific problem:
@@ -598,7 +598,7 @@ can form a criterion for selecting the number of temperature tiers (see below).

##### How many temperature tiers should I use?

-If a high temperature chain has a large sigma and a higher-than-targeted accept rate, as seen in chains 4 and 9 of the example logging, this suggests that the high temperature distribution is becoming uniform. The proposal is using the `bouncy bounds' to essentially draw independent random samples from the input space, and they are still getting accepted. This is not a problem, but does suggest there will be little further benefit in adding additional temperature tiers.
+If a high temperature chain has a large sigma and a higher-than-targeted accept rate, as seen in chains 4 and 9 of the example logging, this suggests that the high temperature distribution is becoming uniform. The proposal is using the 'bouncy bounds' to essentially draw independent random samples from the input space, and they are still getting accepted. This is not a problem, but does suggest there will be little further benefit in adding additional temperature tiers.

After the betas have adapted, you want the tiers to span all the way from the
true distribution (Beta=1) to a uniform distribution (Beta -> 0). Thus, we
@@ -635,7 +635,7 @@ code to run models on the sampled parameters. This enables marginalisation of
derived properties of the model outputs with respect to the parameters.


-##Workers in Other Languages
+## Workers in Other Languages

Creating a worker in a language other than C++ should be fairly simple as long as that language has access to ZeroMQ bindings. For the impatient, the approach is the same as the Python example given above. The way other language bindings work is to run a copy of `stateline-client` for every worker, then each worker communicates with its stateline-client via a local unix socket using ZeroMQ. This means all the complex logic for handling job requests, server heartbeating and asynchronous messages is invisible, leaving only a very simple loop. In pseudocode:

@@ -651,15 +651,15 @@ while working:
send 'goodbye' message to stateline-client
```

-###stateline-client
+### stateline-client
The `stateline-client` binds (in the ZeroMQ sense) to the socket given in its argument. This socket cannot already exist. For example:

```bash
$ ./stateline-client -w ipc:///tmp/my_socket.sock
```
binds the stateline-client to `/tmp/my_socket.sock`. The general form is `ipc://<filesystem_path>`. Note that, as in the Python example, if you intend to run many copies of your worker script you will need some way to randomise the socket name so that each instance of stateline-client doesn't conflict. Remember that's 1 stateline-client *per worker*, even if they're on the same machine.
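
One possible way to randomise the socket name is sketched below. This is not the code from the Python demo: the helpers `unique_ipc_address` and `client_command`, the `stateline` file-name prefix, and the use of the system temporary directory are all assumptions for illustration. Only the `-w` flag and the `ipc://<filesystem_path>` form come from the text above.

```python
import os
import tempfile
import uuid


def unique_ipc_address(prefix="stateline"):
    """Return an ipc:// address with a per-process, randomised socket path."""
    # The process id plus a short random suffix keeps concurrent workers apart,
    # and a short name stays well within unix socket path length limits.
    name = f"{prefix}_{os.getpid()}_{uuid.uuid4().hex[:8]}.sock"
    return "ipc://" + os.path.join(tempfile.gettempdir(), name)


def client_command(address, binary="./stateline-client"):
    """Build the argument list for one stateline-client bound to `address`."""
    return [binary, "-w", address]


address = unique_ipc_address()
print(client_command(address))
```

The resulting list can be passed to whatever process launcher the worker uses, and the same `address` is then the one the worker's ZeroMQ socket connects to.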

-###ZeroMQ
+### ZeroMQ


Create a ZeroMQ context and a `dealer` socket. Then connect it to the socket given to stateline-client. Now you are ready to send the `hello` message. This is a multi-part message of the following form (and noting that all parts must be c-type strings):
@@ -699,17 +699,17 @@ Finally, if you would like to cleanly disconnect the worker (not required-- the

Here "5" is the stateline code for the message subject `GOODBYE`.

-##Contributing to Development
+## Contributing to Development

Contributions and comments are welcome. Please read our [style guide](https://github.com/NICTA/stateline/wiki/Coding-Style-Guidelines) before submitting a pull request.

-###Licence
+### Licence
Please see the LICENSE file, and COPYING and COPYING.LESSER.

-###Bug Reports
+### Bug Reports
If you find a bug, please open an [issue](http://github.com/NICTA/stateline/issues).

-###References
+### References

G. Altekar et al. (2004), Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference, Bioinformatics, Vol 20 No. 3, pp 407-415.
