Skip to content

Conversation

WardBrian
Copy link
Collaborator

While looking at #26 @bob-carpenter noticed it was possible for a model to have the parameters end up as nan and stay there forever. I tracked it down to this sequence of events:

  1. One particularly bad trajectory during warmup ends up with the returned log density being nan
  2. This nan value is used to compute an acceptance probability, which also ends up nan
  3. The step size dual averaging parameters end up getting 'poisoned' by this nan
  4. The next iteration, the step size itself is nan, which means after one gradient step, the parameters will be too.

I believe Stan traps this behavior here:
https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/nuts/base_nuts.hpp#L259-L260

Rather than do exactly what they do, I've moved this to inside the dual averaging. This moves it out of the code that we're relying on dead code optimization to handle for us when using the NoOpHandler in non-adaptive walnuts.

I believe 0 is the correct value, working through what would happen if I did exactly what Stan did and set the bad energy value to inf, the acceptance probability would work out to exp(-inf) = 0

Copy link
Collaborator

@bob-carpenter bob-carpenter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a slight generalization to the test.

* @pre alpha > 0
*/
inline void observe(S alpha) noexcept {
if (std::isnan(alpha)) {
Copy link
Collaborator

@bob-carpenter bob-carpenter Sep 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should add || std::isinf(alpha) test to catch the -infinity case, too. I don't think we should ever see +infinity, but it could be std::isinf(alpha) && alpha < 0 or I think we can actually just compare to -infinity.

More generally, we could condition to [0, 1] by mapping NaN and anything < 0 to 0 and anything > 1 to 1. I opened another issue to test his upper bounding for adaptation---as is, we pass in Metropolis ratios here, which can be greater than 1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

@bob-carpenter bob-carpenter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. thanks!

@bob-carpenter bob-carpenter merged commit 2653184 into main Sep 25, 2025
4 checks passed
@WardBrian WardBrian deleted the catch-bad-acceptance-rate branch September 25, 2025 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants