Skip to content

Feature/volumecache#345

Draft
toloudis wants to merge 28 commits into
feature/release1.10cleanupfrom
feature/volumecache
Draft

Feature/volumecache#345
toloudis wants to merge 28 commits into
feature/release1.10cleanupfrom
feature/volumecache

Conversation

@toloudis
Copy link
Copy Markdown
Contributor

@toloudis toloudis commented May 3, 2026

add caching of loaded volume data using tensorstore

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a unified volume-data caching layer (RAM + optional disk) backed by TensorStore, wires volume loading paths to use it, and introduces a GUI panel to configure cache behavior.

Changes:

  • Introduce renderlib::CacheManager/CacheConfig implementing LRU RAM caching plus optional on-disk Zarr3 cache with eviction.
  • Route volume loading through FileReader::loadAndCache() from renderlib commands and app UI (open, timeline time changes).
  • Add GUI “Advanced Cache Settings” dock + persistence, and add Catch2 unit tests for RAM eviction behavior.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
test/test_cacheManager.cpp New Catch2 tests covering RAM-only cache hits/misses and LRU eviction behavior.
test/CMakeLists.txt Adds the new cache manager test to the test target.
renderlib/io/FileReader.h Removes old static preload cache; exposes loadAndCache() entry point.
renderlib/io/FileReader.cpp Implements cache lookup/store via CacheManager when loading from file/array.
renderlib/command.cpp Uses FileReader::loadAndCache() so command-driven loads participate in caching.
renderlib/ImageXYZC.h Converts constants to constexpr (used by cache sizing logic/tests).
renderlib/CacheConfig.h New cache configuration struct (enable flags, byte caps, cache dir).
renderlib/CacheManager.h New singleton cache API + internal keying, stats, and eviction interfaces.
renderlib/CacheManager.cpp Implements RAM LRU cache and disk cache using TensorStore Zarr3 + index/eviction.
renderlib/CMakeLists.txt Adds cache manager sources/headers to the renderlib target.
agave_app/agaveGui.h Adds cache settings state and dock widget pointer to the main GUI.
agave_app/agaveGui.cpp Creates the cache settings dock, loads/saves settings, applies to renderlib, and adds “clear disk cache”.
agave_app/TimelineDockWidget.cpp Switches timepoint loading to FileReader::loadAndCache().
agave_app/CacheSettings.h / .cpp New persisted settings + normalization/clamping + apply-to-renderlib logic.
agave_app/CacheSettingsWidget.h / .cpp New Qt widget to edit cache settings and pick a cache directory.
agave_app/CacheSettingsDockWidget.h / .cpp New dock wrapper hosting the cache settings widget.
agave_app/CMakeLists.txt Adds new cache settings UI and persistence sources to the app target.
Comments suppressed due to low confidence (1)

renderlib/io/FileReader.cpp:75

  • FileReader::loadAndCache declares VolumeDimensions dims; but never uses it. This is dead code and may trigger warnings; please remove it (or use it if it’s meant to validate dimensions before caching).
  VolumeDimensions dims;


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread agave_app/CacheSettingsDockWidget.h
Comment thread renderlib/CacheManager.cpp
Comment thread renderlib/CacheManager.cpp
Comment thread agave_app/CacheSettingsWidget.cpp
Comment thread renderlib/CacheManager.cpp
Comment thread agave_app/TimelineDockWidget.cpp Outdated
Comment thread renderlib/CacheManager.cpp Outdated
Comment thread agave_app/CacheSettings.cpp Outdated
Base automatically changed from feature/customcolors to main May 9, 2026 01:09
@toloudis toloudis linked an issue May 9, 2026 that may be closed by this pull request
@toloudis toloudis force-pushed the feature/volumecache branch from ab52b22 to 0feb6d8 Compare May 11, 2026 21:07
toloudis and others added 17 commits May 19, 2026 14:53
evictIfNeeded mutates m_lruKeys, m_entries, and m_currentRamBytes, but
setConfig was calling it after releasing the lock. Rename to
evictIfNeededLocked to document the precondition and reacquire the lock
before the call in setConfig.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two layers of protection:

1. CacheManager writes a marker file (.agave-cache-dir) into any
   directory it manages as a disk cache root, and clearDiskCache
   refuses to delete anything unless the marker is present. This stops
   a user-typed path like "C:\" or "/home/me" from being wiped if it
   ends up in the cache_dir setting.

2. Instead of remove_all on the whole cache dir, clearDiskCache now
   only removes per-entry subdirectories (those containing a
   meta.json). Files the user happens to keep alongside the cache
   are preserved, and the marker file itself survives.

Also adds a QMessageBox confirmation in agaveGui showing the actual
applied cache path before deletion.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ImageXYZC::IN_MEMORY_BPP is size_t but the ImageXYZC constructor takes
uint32_t bpp. Cast explicitly at the call site.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CacheKey now includes the filepath's last_write_time and file_size, so
overwriting a file produces a different key and a cache miss instead of
silently returning the old image. Both fields are folded into the hash,
the equality check, and keyToString (which is stored as meta["key"] in
disk entries).

Caveats:
- Remote URLs (http://, s3:, gs:) are not stat'd; their mtime/size stay
  zero. These typically point to immutable bucket objects.
- For zarr directories the directory's last_write_time is used, which
  most filesystems update on entry add/remove at the top level but not
  necessarily on chunk rewrites in subdirs. Best-effort, not airtight.
- Image sequences key on the representative filepath only; regenerating
  a non-representative frame won't invalidate.
- Existing on-disk entries from prior versions of this branch will no
  longer match (their meta["key"] format changed) and will accumulate
  as orphans until normal eviction or a Clear Disk Cache.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds renderlib/SystemInfo.{h,cpp} with two free functions in a
SystemInfo namespace, using only std::filesystem and std::ifstream
instead of QStorageInfo/QFile so the helpers can be reused from
non-Qt code paths.

availableMemoryBytes() and availableDiskBytes() now also report
realistic numbers on Linux and macOS instead of just truly-free
pages:
  - Linux: parses MemAvailable from /proc/meminfo (kernel's own
    estimate of allocatable memory including reclaimable cache);
    falls back to sysinfo (freeram + bufferram) on older kernels.
  - macOS: includes inactive + purgeable + speculative pages
    alongside free pages, matching Activity Monitor's "Available".
  - Windows: unchanged (MEMORYSTATUSEX::ullAvailPhys).
  - Disk: std::filesystem::space().available, with a parent_path
    fallback if the path doesn't exist yet.

CacheSettings::availableMemoryBytes / availableDiskBytes deleted
in favor of the new helpers. toRenderlibConfig now also LOG_WARNINGs
when its silent clamp reduces the user's requested RAM or disk
limit, so the reduction is discoverable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
storeToDisk now:
  - Estimates bytes up front and refuses to write images larger than
    the entire disk cap instead of evicting everything and overshooting.
  - Calls evictDiskIfNeeded(bytes) before the tensorstore write, so
    the cap is never temporarily exceeded and we don't waste a large
    write that would immediately have to be undone.
  - Cleans up the per-entry directory (remove_all) on every failure
    path — tensorstore open, tensorstore write, or meta.json write —
    so half-written entries don't leak into the cache dir.
  - Saturates instead of underflowing if the existing entry's
    bookkeeping bytes exceed m_currentDiskBytes.

evictDiskIfNeeded now:
  - Runs under a single lock_guard for the entire operation (sort,
    select, bookkeeping update, filesystem remove_all). The previous
    code released and re-acquired the lock between victim selection
    and bookkeeping mutation, leaving a window in which a concurrent
    storeToDisk could refresh the entry we were about to delete on
    disk — wiping the fresh file. Holding the lock for the whole
    operation closes that race.
  - Sorts entries by lastAccess once instead of doing an O(N) scan
    per victim, so eviction is O(N log N) total.
  - Logs a warning if remove_all fails (e.g. Windows file-handle
    contention from a concurrent loadFromDisk) — the bookkeeping is
    still updated to reflect eviction; the stale files get cleaned
    by the next clearDiskCache.

The redundant post-write evictDiskIfNeeded(config, 0) in storeToDisk
is now dropped.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a TempCacheDir RAII helper that gives each test a unique
sub-directory under the system temp dir (auto-cleaned in the
destructor), plus a diskConfig() builder and a countSubdirs() helper.

New disk-tier test cases:
  - Initializing disk cache writes the .agave-cache-dir marker.
  - Round-trip: store with disk enabled, drop RAM, find reloads
    from disk; pixel data is bit-identical and stats.diskHits is 1.
  - Image larger than maxDiskBytes is not written (no entry subdir).
  - LRU disk eviction: store 3 entries at a 2-entry cap, oldest is
    evicted and subdir count stays at <= 2.
  - clearDiskCache wipes entry subdirs but keeps the marker file.
  - clearDiskCache refuses to touch a directory whose marker has
    been removed (file count unchanged).
  - Re-pointing setConfig away and back rebuilds the disk index
    from the on-disk meta files (simulates session restart).

New mtime test case: a real file on disk is used as the cache key
source; bumping its last_write_time forward via
std::filesystem::last_write_time(path, future_time) makes a
subsequent findImage miss instead of returning the stale image.
Uses an explicit time set rather than real-sleep so the test is
deterministic on coarse-mtime filesystems.

Per-test disk usage stays in the low-KB range — sized for 4x4x4x1
uint16 images (128 raw bytes each) with caps of a few image-widths
or 1 MB.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
RAM hits short-circuit before the disk tier is consulted, so the disk
lastAccess (and its on-disk meta.json) is never refreshed on a RAM hit.
This is fine within a session — when an entry falls out of RAM, the
next access goes through loadFromDisk which bumps the disk lastAccess
right then.

The edge case is at session boundaries: an entry that stays RAM-resident
for the whole session has stale disk lastAccess, so the next session's
disk LRU may treat it as older than it really was and evict it from
disk on cold start. Spelling out the tradeoff plus the fix recipe in a
NOTE/TODO so the next person doesn't have to rederive it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds normalizeFilepath() to the CacheManager anonymous namespace and
calls it from makeKey for both the stored filepath and the statForKey
lookup, so the cache treats:

  /some/dir/foo.tif
  /some/dir/./foo.tif
  /some//dir///foo.tif
  /some/dir/x/../foo.tif

as the same key, and on Windows also folds together:

  C:/some/dir/foo.tif
  C:\some\dir\foo.tif
  c:/SOME/DIR/foo.tif

We use std::filesystem::path::lexically_normal (purely textual) rather
than weakly_canonical so that bare-name keys used by loadFromArray_4D
(e.g. "my_in_memory_array") don't pick up CWD-relative resolution and
change between calls if the process directory moves. Remote URLs
(http/s3/gs) are passed through unchanged because lexically_normal
would mangle "://" into "//". On Windows we lowercase the result since
NTFS/FAT are conventionally case-insensitive (accepting incorrect hits
on the rare case-sensitive setups like \wsl$ paths or per-directory
case-sensitivity flags).

Adds 8 sections to test_cacheManager.cpp covering the equivalence
classes above plus a negative test that genuinely distinct paths still
produce distinct keys, with Windows-only sections behind #ifdef _WIN32.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Locks in the expected Windows behavior for UNC paths:
  - \Server\Share\Path\Foo.tif, //server/share/path/foo.tif, and
    \server\share\path\subdir\..\foo.tif all collapse to one key
    (slash style + case fold + '..' collapse).
  - A different share name (\server\othershare\...) still produces a
    distinct key, so we're not over-collapsing.

The normalization path (std::filesystem::path::lexically_normal +
generic_string + lowercase on Windows) preserves UNC roots — the
leading \\ becomes // after generic_string, which Windows APIs accept
as equivalent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rDiskCache

The previous "clear()" cleared only the in-memory tier, not the disk
tier — surprising given that a sibling clearDiskCache() exists. Rename
to make the scope explicit at the call site, document both methods'
scopes on the header, and update all six test callers plus the one
SECTION title.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 2 comments.

Comment thread renderlib/CacheManager.cpp Outdated
Comment thread renderlib/CacheManager.cpp Outdated
toloudis and others added 3 commits May 26, 2026 18:25
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@toloudis toloudis changed the base branch from main to feature/release1.10cleanup May 27, 2026 01:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

preload and cache whole time series

2 participants