distribution is a package for manipulating finite discrete probability distributions.
It supports transformations of distributions, measurements and efficient sampling.
Plotting distributions as bar charts is also possible using the related distribution-plot package.
Let's fire up ghci and dive right into the code.
First things first, the one line of imports.
> import Data.Distribution
Let's define the distribution of outcomes of a fair die and compute its mean and standard deviation.
> let d6 = uniform [1 .. 6]
> mean d6
3.5
> standardDeviation d6
1.707825127659933
Let's now find 20 such dice and throw them all together!
> let twentyD6 = times 20 d6
What's the probability that the sum of those dice is between 60 and 180?
Let's find out!
> probability (\ x -> x >= 60 && x <= 180) twentyD6
1672525186848683 % 1828079220031488
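For readers curious how such an exact result can come about, here is a minimal sketch that recomputes the same kind of probability from first principles, without the distribution package. It is not the package's implementation: the names `Dist`, `addDist`, `sumOf` and `probabilityOf` are made up for this illustration, and the approach (repeated convolution over a `Data.Map` of `Rational` weights) is simply one straightforward way to do it.

```haskell
import qualified Data.Map.Strict as Map
import Data.Ratio ((%))

-- Hypothetical stand-in for a distribution: outcome -> exact probability.
type Dist = Map.Map Int Rational

-- A fair six-sided die.
d6 :: Dist
d6 = Map.fromList [ (k, 1 % 6) | k <- [1 .. 6] ]

-- Distribution of the sum of two independent draws (a convolution).
-- Pairs that land on the same sum have their probabilities added.
addDist :: Dist -> Dist -> Dist
addDist xs ys = Map.fromListWith (+)
  [ (x + y, p * q) | (x, p) <- Map.toList xs, (y, q) <- Map.toList ys ]

-- Sum of n independent draws from the same distribution (assumes n >= 1).
sumOf :: Int -> Dist -> Dist
sumOf n d = foldr1 addDist (replicate n d)

-- Total probability of the outcomes satisfying a predicate.
probabilityOf :: (Int -> Bool) -> Dist -> Rational
probabilityOf p d = sum [ q | (x, q) <- Map.toList d, p x ]

main :: IO ()
main = print (probabilityOf (\x -> x >= 60 && x <= 180) (sumOf 20 d6))
```

Because every weight is a `Rational`, no rounding occurs at any step, and the printed value can be compared directly with the one returned by `probability` above.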
Probabilities are computed exactly, so every probability query returns a value of type Rational.
To make sense of this particular value, let's just convert it to floating point:
> fromRational it
0.9149084834626994
Let's now compute a fancier distribution. For convenience, we will make use of the monadic interface of distributions.
> let experiment = do { n <- from d6; m <- from $ uniform [ 1 .. n ]; return (n * m) }
> let distribution = run experiment
> probabilityAt 36 distribution
1 % 36
> probabilityAt 25 distribution 
1 % 30
Now, let's sample from that distribution. To do so, we must first build a Generator.
> let generator = fromDistribution distribution
Once this generator is built, we can query it in constant time.
> import Control.Monad.Random
> getSample generator 
25
> getSample generator 
36
> getSample generator 
3
The distribution package contains several modules:
- Data.Distribution.Core, which defines distributions and combinators over distributions,
- Data.Distribution.Measure, which includes measures such as probability, expectation and variance,
- Data.Distribution.Monadic, which provides a monadic interface for manipulating distributions,
- Data.Distribution.Sample, which implements Walker's alias method for efficiently sampling random values from distributions.
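To give an idea of why sampling with Walker's alias method takes constant time per draw, here is a small self-contained sketch of the technique. It is an illustration only and makes no claim about how Data.Distribution.Sample is actually written: `AliasTable`, `buildAlias`, `sampleWith` and `impliedProbability` are hypothetical names, the weights are assumed to have a positive total, and the two random inputs of a draw are passed in explicitly so that the code needs no random number library.

```haskell
import Data.Array (Array, elems, listArray, (!))
import Data.List (partition)
import Data.Ratio ((%))

-- Number of columns, plus one (keep probability, own outcome, alias) per column.
data AliasTable a = AliasTable Int (Array Int (Rational, a, a))

-- Build the table from weighted outcomes (assumes a positive total weight).
buildAlias :: [(a, Rational)] -> AliasTable a
buildAlias weighted = AliasTable n (listArray (0, n - 1) (go small large))
  where
    n = length weighted
    total = sum (map snd weighted)
    -- Scale the weights so that their average is exactly 1.
    scaled = [ (x, w * fromIntegral n / total) | (x, w) <- weighted ]
    (small, large) = partition ((< 1) . snd) scaled

    -- Pair an under-full entry with an over-full one; the over-full entry
    -- donates (1 - ws) of its weight to fill the column.
    go ((s, ws) : ss) ((l, wl) : ls) =
      let wl' = wl - (1 - ws)
          rest
            | wl' < 1   = go ((l, wl') : ss) ls
            | otherwise = go ss ((l, wl') : ls)
      in (ws, s, l) : rest
    -- With exact arithmetic, every leftover entry has weight exactly 1.
    go remaining [] = [ (1, x, x) | (x, _) <- remaining ]
    go [] remaining = [ (1, x, x) | (x, _) <- remaining ]

-- One draw: i is a uniform column index in [0, n), u is uniform in [0, 1).
-- A single array lookup and a single comparison, whatever the table size.
sampleWith :: AliasTable a -> Int -> Rational -> a
sampleWith (AliasTable _ table) i u =
  let (keep, own, alias) = table ! i
  in if u < keep then own else alias

-- Probability of an outcome implied by the table (assumes a non-empty
-- table); handy for checking that the table matches the input weights.
impliedProbability :: Eq a => AliasTable a -> a -> Rational
impliedProbability (AliasTable n table) x = total / fromIntegral n
  where
    total = sum [ (if own == x then keep else 0)
                  + (if alias == x then 1 - keep else 0)
                | (keep, own, alias) <- elems table ]

example :: AliasTable Char
example = buildAlias [ ('a', 1 % 2), ('b', 1 % 3), ('c', 1 % 6) ]

main :: IO ()
main = mapM_ (\x -> print (x, impliedProbability example x)) "abc"
```

Building the table takes time proportional to the number of outcomes, which is why it is done once up front and then reused for every draw.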
In addition, some domain-specific distributions and functions are provided by default:
- Data.Distribution.Domain.Dice defines dice and functions on dice.
- Data.Distribution.Domain.Coin defines the same for coins.
- Data.Distribution.Aggregator contains functions for transforming distributions to cumulative distributions and other useful methods.
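As a rough picture of what transforming a distribution into a cumulative distribution means, the sketch below does it by hand on a plain list of (outcome, probability) pairs. The function name `cumulative` and the list representation are assumptions made for this example; they are not taken from Data.Distribution.Aggregator, whose actual functions are described in its own documentation.

```haskell
import Data.List (sortOn)
import Data.Ratio ((%))

-- Pair every outcome with the probability of drawing that outcome
-- or any smaller one, by taking running sums over the sorted outcomes.
cumulative :: Ord a => [(a, Rational)] -> [(a, Rational)]
cumulative pairs = zip outcomes (scanl1 (+) probabilities)
  where
    (outcomes, probabilities) = unzip (sortOn fst pairs)

-- A fair six-sided die as a plain list of weighted outcomes.
d6 :: [(Int, Rational)]
d6 = [ (k, 1 % 6) | k <- [1 .. 6] ]

main :: IO ()
main = mapM_ print (cumulative d6)
```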
Each module and each function is extensively documented.
See the distribution-plot package for plotting distributions to files. This package is also available on GitHub.
Copyright 2014-2017 Romain Edelmann
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.