
Sampling distribution documentation and tailoring #221


Open
Dietr1ch opened this issue Apr 25, 2025 · 0 comments


I played with this library this afternoon and noticed that there's a bias towards edge values like 0, ?::MAX, and ?::MIN. I get that they work wonders in fuzzing, but in my case the bias ended up producing overly simple test cases.

I was generating a sequence of push(value)/pop operations on a heap. My approach could be simplified to:

use arbitrary::Arbitrary;

#[derive(Arbitrary)]
struct OperationBatch {
    // Fixes the push/pop sequence (I biased towards pushing small batches).
    seed: u64,
    numbers_to_push: Vec<u16>,
}

The bias resulted in my operation sequences pushing mostly the same numbers onto the heap, which doesn't stress the heap much.

I ended up implementing OperationBatch::from_seed(u64) and customising the number distribution myself, but I would appreciate documentation of the default distributions and a mention of helpers for tailoring the distribution of values when the defaults are a bad fit.
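For readers wondering what such a workaround could look like, here is a minimal sketch. It is not the code from my project: the SplitMix64 helper, the 1..=8 batch-size policy, and the body of from_seed are all assumptions chosen to keep the example dependency-free. The idea is only that every value in numbers_to_push comes from a roughly uniform source instead of the edge-biased default.

```rust
/// Minimal SplitMix64 generator (hypothetical helper, used only to keep
/// this sketch free of external crates).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

#[derive(Debug, Clone, PartialEq)]
struct OperationBatch {
    seed: u64,
    numbers_to_push: Vec<u16>,
}

impl OperationBatch {
    /// Hypothetical constructor: derives the whole batch from a single seed,
    /// so a failing case can be reproduced from that seed alone.
    fn from_seed(seed: u64) -> Self {
        let mut rng = SplitMix64(seed);
        // Assumed policy: small batches of 1 to 8 values.
        let len = 1 + (rng.next_u64() % 8) as usize;
        // Keep the top 16 bits of each draw so values spread over the
        // whole u16 range rather than clustering at 0 or u16::MAX.
        let numbers_to_push = (0..len)
            .map(|_| (rng.next_u64() >> 48) as u16)
            .collect();
        OperationBatch { seed, numbers_to_push }
    }
}
```

With this shape, only the u64 seed needs to come from the fuzzer input, and calling OperationBatch::from_seed twice with the same seed yields the same batch.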

At least from the docs around output distributions, it wasn't clear to me that this bias exists, nor how sharp it is. I feel that I'd have had an easier time if I had run into the arbitrary_len docs earlier, but from the README I initially thought that sprinkling a few attributes would be all I needed.


Maybe I just ran out of entropy because of a poor size_hint, but in that case I'd expect errors instead of silently generated bad samples (looking around, this might be related to #219 (comment)). It also seems that there's a missing set of attributes for specifying collection sizes that would use arbitrary_len underneath.
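To make the "errors instead of silent bad samples" point concrete, here is a toy model. It is not the library's code: ByteSource, NotEnoughData, and both method names are hypothetical, and the zero-padding in the lenient read is my assumption about what happens when the input runs dry. It only contrasts the two behaviours.

```rust
/// Toy stand-in for a fuzzer-provided input buffer (hypothetical type).
struct ByteSource<'a> {
    data: &'a [u8],
}

/// Hypothetical error returned when the input is exhausted.
#[derive(Debug, PartialEq)]
struct NotEnoughData;

impl<'a> ByteSource<'a> {
    fn new(data: &'a [u8]) -> Self {
        ByteSource { data }
    }

    /// Lenient read: missing bytes count as zero, so exhaustion is silent
    /// and every later value comes out as 0.
    fn next_u16_padded(&mut self) -> u16 {
        let mut buf = [0u8; 2];
        let n = self.data.len().min(2);
        buf[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        u16::from_le_bytes(buf)
    }

    /// Strict read: exhaustion is reported to the caller instead.
    fn next_u16_strict(&mut self) -> Result<u16, NotEnoughData> {
        if self.data.len() < 2 {
            return Err(NotEnoughData);
        }
        let (head, rest) = self.data.split_at(2);
        self.data = rest;
        Ok(u16::from_le_bytes([head[0], head[1]]))
    }
}
```

With three input bytes, the lenient reader hands out a full value, then a half-padded one, then zeros forever, which would look exactly like a bias towards 0. The strict reader stops with NotEnoughData after the first value.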
