Adding distributions #106
Comments
Great idea, I had planned to create A few suggestions:
A question:
Would you like to start with creating I was also thinking of using a table of common convolutions, to produce something similar to how
While Gaussian mixtures can, in theory, be used to approximate any smooth distribution, finding the mixture components can be computationally challenging. So I think using a numerical linear approximation of the CDF over a grid will be faster and more reliable.
Glad you like it. I won't be able to start on this for a few weeks; there is a lot going on at the moment. If you are not in a rush, I can probably get to it in about a month. If you want, you can also start in the meantime. I think a table of common convolutions makes a lot of sense.
No worries, I'll take this one on then. I'm hoping to submit a CRAN release in ~3 weeks (not necessarily v1.0.0), but having symbolic derivatives and some convolutions would be useful for some upstream packages. If you could review the changes before I submit to CRAN, that would be appreciated!
Happy to review.
Currently we cannot add two distributions together:
Created on 2024-04-12 with reprex v2.1.0
But this would be nice to have. The density of the sum of two independent random variables is the convolution of their densities. One way to implement this is to calculate the convolution on a grid of values, and then create an approximation function that interpolates between the values on the grid.
Here's an example function (not to be implemented as is in the package, but an example of how this works):
This function takes two distributional distributions and outputs a function that approximates the density of their convolution.
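A minimal sketch of the grid-based approach described above, written in plain Python rather than R so that it stands alone. It is not the example function from this issue: the names `convolve_density` and `normal_pdf`, the shared grid limits, and the default grid size are illustrative assumptions. It uses a direct O(n²) sum for clarity; an FFT-based convolution would be the usual choice for larger grids.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution, used here only as a test input."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def convolve_density(f, g, lo, hi, n=401):
    """Approximate the density of X + Y for independent X ~ f and Y ~ g.

    Both densities are sampled on the same grid of n points over [lo, hi]
    (assumed wide enough to hold almost all of the mass of each).
    Returns a function that linearly interpolates the convolved density.
    """
    h = (hi - lo) / (n - 1)
    xs = [lo + i * h for i in range(n)]
    fx = [f(x) for x in xs]
    gx = [g(x) for x in xs]

    # Riemann-sum convolution: (f * g)(z_k) ~ h * sum_i f(x_i) g(z_k - x_i),
    # evaluated on the output grid z_k = 2 * lo + k * h for k = 0..2n-2.
    conv = [0.0] * (2 * n - 1)
    for i, fi in enumerate(fx):
        for j, gj in enumerate(gx):
            conv[i + j] += fi * gj * h

    z_lo = 2.0 * lo
    last = 2 * n - 2

    def density(z):
        t = (z - z_lo) / h
        if t < 0.0 or t > last:
            return 0.0  # outside the grid the approximation is zero
        k = min(int(t), last - 1)
        w = t - k
        return (1.0 - w) * conv[k] + w * conv[k + 1]

    return density


# N(0, 1) + N(1, 2) should be close to N(1, sqrt(5)).
approx = convolve_density(
    lambda x: normal_pdf(x, 0.0, 1.0),
    lambda x: normal_pdf(x, 1.0, 2.0),
    lo=-12.0,
    hi=12.0,
)
exact_at_1 = normal_pdf(1.0, 1.0, math.sqrt(5.0))
print(approx(1.0), exact_at_1)
```

The expensive step is building `conv`; each later call to `density` is a constant-time lookup and interpolation.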
Here are a couple of examples, plotting the result of adding random draws generated from each distribution, together with the approximated density and the analytical density (if known):
The trade-off between speed and approximation error can be controlled via N_exponent and the grid limits:
This is naturally much slower than the built-in distributions, but the cost does not depend on the number of evaluation points. Almost all of the time goes into constructing the initial density function:
Other operations
The above should work for + and -. Multiplication and division can be made to work by first taking the logarithm of each distribution, computing the convolution density, and then exponentiating again. This only applies to strictly positive random variables, since the logarithm must be defined.
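As a worked version of the log-transform idea, assuming X and Y are independent and strictly positive with densities f_X and f_Y (the symbols U, V, W and Z are introduced here only for illustration), the change of variables gives:

```latex
\begin{aligned}
U &= \log X, & f_U(u) &= f_X(e^{u})\, e^{u}, \\
V &= \log Y, & f_V(v) &= f_Y(e^{v})\, e^{v}, \\
W &= U + V,  & f_W(w) &= \int_{-\infty}^{\infty} f_U(u)\, f_V(w - u)\, du, \\
Z &= XY = e^{W}, & f_Z(z) &= \frac{f_W(\log z)}{z}, \qquad z > 0.
\end{aligned}
```

For division, Z = X / Y corresponds to W = U - V, so the convolution integral becomes \int f_U(u)\, f_V(u - w)\, du, and the final step f_Z(z) = f_W(\log z) / z is unchanged.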
What's needed to implement
This will need a new 'dist_*' class, since the logic is different from dist_transformed. Something like 'dist_convolved' would do the trick, where the density function uses the convolution logic above. For efficiency, the density approximation function would have to be stored within the dist object, so that it is not recomputed every time the density is used. Will have to think about how to implement the quantile, cdf and generate functions.

Thoughts?
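One possible direction for the cdf, quantile and generate functions, sketched in plain Python rather than R: once the density is stored on a grid, the CDF can be tabulated once by cumulative trapezoid integration, the quantile obtained by inverting that table with linear interpolation, and random draws produced by inverse-transform sampling. The class name `GridDistribution` and its methods are hypothetical and are not part of distributional; this is a sketch under those assumptions, not a proposed implementation.

```python
import bisect
import math
import random


class GridDistribution:
    """Hypothetical container that caches a gridded density and its CDF table."""

    def __init__(self, xs, pdf_vals):
        self.xs = list(xs)
        self.pdf_vals = list(pdf_vals)
        # Cumulative trapezoid integral of the density, computed once.
        cum = [0.0]
        for i in range(1, len(self.xs)):
            width = self.xs[i] - self.xs[i - 1]
            cum.append(cum[-1] + 0.5 * (self.pdf_vals[i - 1] + self.pdf_vals[i]) * width)
        total = cum[-1]
        # Renormalise so the tabulated CDF ends exactly at 1.
        self.cdf_vals = [c / total for c in cum]

    def cdf(self, x):
        """Piecewise-linear interpolation of the tabulated CDF."""
        if x <= self.xs[0]:
            return 0.0
        if x >= self.xs[-1]:
            return 1.0
        i = bisect.bisect_right(self.xs, x) - 1
        w = (x - self.xs[i]) / (self.xs[i + 1] - self.xs[i])
        return (1.0 - w) * self.cdf_vals[i] + w * self.cdf_vals[i + 1]

    def quantile(self, p):
        """Invert the tabulated CDF by linear interpolation."""
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        i = bisect.bisect_left(self.cdf_vals, p)
        if i == 0:
            return self.xs[0]
        c0, c1 = self.cdf_vals[i - 1], self.cdf_vals[i]
        w = (p - c0) / (c1 - c0)
        return self.xs[i - 1] + w * (self.xs[i] - self.xs[i - 1])

    def generate(self, n, rng=random):
        """Inverse-transform sampling: push uniform draws through quantile()."""
        return [self.quantile(rng.random()) for _ in range(n)]


# Usage with a standard normal density tabulated on [-8, 8].
grid = [-8.0 + 0.02 * i for i in range(801)]
dens = [math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) for x in grid]
dist = GridDistribution(grid, dens)
print(dist.cdf(0.0), dist.quantile(0.975), dist.generate(3, random.Random(1)))
```

Because the tables are built in the constructor, repeated calls to `cdf`, `quantile` and `generate` reuse them rather than recomputing the convolution.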