Description
Summary
Pathfinder stores nearly all intermediate computations for inspection purposes, which for high-dimensional targets can result in an excessive memory footprint. This issue proposes a non-breaking refactor that would significantly reduce this footprint.
Background
(Multi-path) Pathfinder runs in 2 phases:
- Phase 1: runs L-BFGS, storing the trace of positions and gradients. Worst-case storage requirements are `O(nruns * maxiters * dim)`.
- Phase 2: computes the inverse Hessian approximation at each iteration, draws `ndraws_elbo` draws per iteration to estimate the ELBO, draws `ndraws_per_run` draws from the ELBO-maximizing `MvNormal` approximation, and then stores `ndraws` draws. Worst-case storage requirements are `O(nruns * maxiters * dim * (2 * history_length + ndraws_elbo) + nruns * ndraws_per_run * dim + ndraws * dim)`.
Proposal
These phases can be interleaved. In the optimization callback, we could store just the current state, as well as the ELBO-maximizing multivariate normal approximation. This would reduce worst-case storage requirements to `O(nruns * dim * (2 * history_length + ndraws_elbo + ndraws_per_run) + ndraws * dim)`, effectively eliminating `maxiters` from each of the previous expressions. With default settings, this could potentially reduce the memory footprint 1,000-fold.
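To make the comparison concrete, here is a rough sketch that evaluates the storage expressions above as counts of stored numbers, with constants dropped as in the `O(...)` notation. The function names and the setting values are illustrative assumptions, not Pathfinder's API or its documented defaults.

```julia
# Element counts taken directly from the worst-case expressions in this issue.
# `s` is a NamedTuple of settings; all names here are hypothetical helpers.

# Current Phase 1: full trace of positions and gradients.
current_phase1(s) = s.nruns * s.maxiters * s.dim

# Current Phase 2: per-iteration history and ELBO draws, plus the final draws.
current_phase2(s) =
    s.nruns * s.maxiters * s.dim * (2 * s.history_length + s.ndraws_elbo) +
    s.nruns * s.ndraws_per_run * s.dim + s.ndraws * s.dim

current_total(s) = current_phase1(s) + current_phase2(s)

# Proposed: only the current state and the best approximation are kept per run.
proposed_total(s) =
    s.nruns * s.dim * (2 * s.history_length + s.ndraws_elbo + s.ndraws_per_run) +
    s.ndraws * s.dim

# Illustrative values only; substitute the settings actually in use.
settings = (nruns=4, maxiters=1_000, dim=100, history_length=6,
            ndraws_elbo=5, ndraws_per_run=5, ndraws=5)

reduction = current_total(settings) / proposed_total(settings)
println("current:  ", current_total(settings))
println("proposed: ", proposed_total(settings))
println("ratio:    ", round(reduction; digits=1))
```

Because the `ndraws_per_run` and `ndraws` terms appear unchanged in both expressions, the ratio approaches `maxiters` only when those terms are small relative to the per-iteration terms; a large `ndraws_per_run` shrinks the gain.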
I think it's possible to make these changes in a non-breaking way. The plan is to introduce a keyword argument `save_trace=true`, which could be switched to `false` to
- avoid storing draws in `ELBOEstimate`s,
- avoid storing all `fit_distributions`, and
- avoid storing `OptimizationTrace`s.
In the future we may consider the breaking change of defaulting to `save_trace=false`.
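As a minimal illustration of why a keyword defaulting to `true` is non-breaking, the toy loop below gates trace storage on `save_trace`. The function `toy_descent` and its return fields are hypothetical stand-ins, not part of Pathfinder.

```julia
# Toy gradient-descent loop; existing callers that omit `save_trace` see no change.
function toy_descent(grad, x0; stepsize=0.1, maxiters=100, save_trace=true)
    x = copy(x0)
    # Allocate the trace only when requested; `nothing` marks "not stored".
    trace = save_trace ? [copy(x)] : nothing
    for _ in 1:maxiters
        x .-= stepsize .* grad(x)          # in-place update of the current state
        save_trace && push!(trace, copy(x))
    end
    return (x=x, trace=trace)
end

grad(x) = 2 .* x  # gradient of sum(abs2, x)

with_trace = toy_descent(grad, ones(3); maxiters=10)
without_trace = toy_descent(grad, ones(3); maxiters=10, save_trace=false)
```

Both calls reach the same final `x`; only the second skips the `O(maxiters * dim)` trace.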
Concretely, we would introduce internal structs with names like `LBFGSState` and `PathfinderState` and update these in-place within `OptimizationCallback`. The biggest changes would be to refactor many of the utility functions to mutate provided storage.
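One possible shape for these structs is sketched below. The field names, the `update!` functions, and the choice to summarize the best approximation by its mean alone are all assumptions made for illustration; the real structs would hold whatever the inverse Hessian and `MvNormal` construction need.

```julia
# Hypothetical sketch: fixed-size storage that is overwritten every iteration.
mutable struct LBFGSState
    x::Vector{Float64}   # current position
    g::Vector{Float64}   # current gradient
    S::Matrix{Float64}   # dim × history_length buffer of position differences
    Y::Matrix{Float64}   # dim × history_length buffer of gradient differences
    nupdates::Int        # number of updates applied so far
end

function LBFGSState(x0::Vector{Float64}, g0::Vector{Float64}, history_length::Int)
    dim = length(x0)
    return LBFGSState(copy(x0), copy(g0), zeros(dim, history_length),
                      zeros(dim, history_length), 0)
end

# Overwrite the oldest history column, then the current state.
function update!(state::LBFGSState, x::AbstractVector, g::AbstractVector)
    col = mod1(state.nupdates + 1, size(state.S, 2))
    state.S[:, col] .= x .- state.x
    state.Y[:, col] .= g .- state.g
    state.x .= x
    state.g .= g
    state.nupdates += 1
    return state
end

# Best approximation seen so far (simplified here to a mean and an ELBO value).
mutable struct PathfinderState
    best_elbo::Float64
    best_iteration::Int
    best_mean::Vector{Float64}
end

PathfinderState(dim::Int) = PathfinderState(-Inf, 0, zeros(dim))

# Keep the candidate only if its ELBO estimate beats the current best.
function update!(pf::PathfinderState, iteration::Int, mean::AbstractVector, elbo::Real)
    if elbo > pf.best_elbo
        pf.best_elbo = elbo
        pf.best_iteration = iteration
        pf.best_mean .= mean
    end
    return pf
end

# Stand-in for the body of an optimization callback, with made-up ELBO values.
dim, history_length = 3, 2
lbfgs = LBFGSState(zeros(dim), ones(dim), history_length)
pf = PathfinderState(dim)
for (iteration, elbo) in enumerate([-3.0, -1.0, -2.0])
    x = fill(Float64(iteration), dim)
    g = -x  # placeholder gradient
    update!(lbfgs, x, g)
    update!(pf, iteration, lbfgs.x, elbo)
end
```

Storage here is `O(dim * history_length)` per run regardless of how many iterations the loop performs, which is the property the proposal relies on.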