Reducing memory footprint #213

Open
@sethaxen

Summary

Pathfinder stores nearly all intermediate computations for inspection purposes, which for high-dimensional targets can result in an excessive memory footprint. This issue proposes a non-breaking refactor that would significantly reduce this footprint.

Background

(Multi-path) Pathfinder runs in 2 phases:

  • Phase 1: runs L-BFGS, storing the trace of positions and gradients. Worst-case storage requirements are O(nruns * maxiters * dim)
  • Phase 2: computes the inverse Hessian approximation at each iteration, draws ndraws_elbo draws per iteration to estimate the ELBO, draws ndraws_per_run draws from the ELBO-maximizing MvNormal approximation, and then stores ndraws draws. Worst-case storage requirements are O(nruns * maxiters * dim * (2 * history_length + ndraws_elbo) + nruns * ndraws_per_run * dim + ndraws * dim)

Proposal

These phases can be interleaved. In the optimization callback, we could store just the current state, as well as the ELBO-maximizing multivariate normal approximation found so far. This would reduce worst-case storage requirements to O(nruns * dim * (2 * history_length + ndraws_elbo + ndraws_per_run) + ndraws * dim), effectively eliminating maxiters from each of the previous expressions. With default settings, this could reduce the memory footprint by roughly 1,000-fold.
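To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python that evaluates the two worst-case bounds as plain element counts. The function name `pathfinder_storage` and the example settings are hypothetical and chosen only for illustration; they are not taken from Pathfinder's actual defaults, and constant factors hidden by the big-O notation are ignored.

```python
def pathfinder_storage(nruns, maxiters, dim, history_length,
                       ndraws_elbo, ndraws_per_run, ndraws):
    """Return (current, proposed) worst-case element counts.

    Hypothetical helper: it evaluates the big-O expressions from this issue
    literally, so the results are order-of-magnitude estimates only.
    """
    # Phase 1: L-BFGS trace of positions and gradients.
    trace = nruns * maxiters * dim
    # Phase 2: per-iteration inverse Hessian factors and ELBO draws.
    per_iteration = nruns * maxiters * dim * (2 * history_length + ndraws_elbo)
    # Draws from each run's ELBO-maximizing approximation.
    run_draws = nruns * ndraws_per_run * dim
    # Final stored draws.
    final_draws = ndraws * dim

    current = trace + per_iteration + run_draws + final_draws
    # Interleaved phases: only the current state and the best fit per run.
    proposed = (nruns * dim * (2 * history_length + ndraws_elbo + ndraws_per_run)
                + final_draws)
    return current, proposed


# Illustrative settings (assumptions, not Pathfinder's defaults).
current, proposed = pathfinder_storage(
    nruns=4, maxiters=1000, dim=100, history_length=6,
    ndraws_elbo=5, ndraws_per_run=5, ndraws=20,
)
print(f"current: {current}, proposed: {proposed}, ratio: {current / proposed:.0f}x")
```

The ratio is dominated by maxiters whenever the per-iteration terms outweigh the final draws, which is why the saving scales with the iteration cap.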

I think it's possible to make these changes in a non-breaking way. The plan is to introduce a keyword argument save_trace=true, which could be switched to false to

  • avoid storing draws in ELBOEstimates,
  • avoid storing all fit_distributions, and
  • avoid storing OptimizationTraces.

In the future we may consider the breaking change of defaulting to save_trace=false.

Concretely, we would introduce internal structs with names like LBFGSState and PathfinderState and update these in-place within OptimizationCallback. The biggest changes would be to refactor many of the utility functions to mutate provided storage.
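The in-place update pattern might look something like the following minimal Python sketch. It is only an illustration of the idea, not the planned Julia implementation: the `PathfinderState` fields, `make_callback`, and the `estimate_elbo` argument are all hypothetical, and the "fit" is reduced to the position at which the ELBO estimate was highest.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class PathfinderState:
    """Hypothetical running state: current iterate plus the best fit so far."""
    x: List[float]
    grad: List[float]
    iteration: int = 0
    best_elbo: float = float("-inf")
    best_iteration: int = -1
    best_x: List[float] = field(default_factory=list)


def make_callback(state: PathfinderState,
                  estimate_elbo: Callable[[Sequence[float], Sequence[float]], float]):
    """Build an optimizer callback that mutates `state` instead of growing a trace."""
    def callback(x: Sequence[float], grad: Sequence[float]) -> bool:
        # Overwrite the existing buffers rather than appending to a history.
        state.x[:] = x
        state.grad[:] = grad
        state.iteration += 1
        elbo = estimate_elbo(x, grad)
        if elbo > state.best_elbo:
            # Keep only the ELBO-maximizing candidate seen so far.
            state.best_elbo = elbo
            state.best_iteration = state.iteration
            state.best_x[:] = x
        return False  # never request early termination in this sketch
    return callback


# Toy usage: a stand-in ELBO that peaks at x = (1, 1).
state = PathfinderState(x=[0.0, 0.0], grad=[0.0, 0.0])
callback = make_callback(state, lambda x, g: -sum((xi - 1.0) ** 2 for xi in x))
for iterate in ([0.0, 0.0], [1.0, 1.0], [2.0, 2.0]):
    callback(iterate, [0.0, 0.0])
```

Memory use here is independent of the number of iterations, since each call overwrites the previous state. A save_trace=true path could reuse the same callback and additionally append to a trace.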
