
fix: prevent blocking scrapes while cache is cold #62

Merged
Odrec merged 1 commit into virtUOS:main from dborysenko:fix/prevent-cold-cache-blocking
May 8, 2026


@dborysenko
Contributor

Summary

  • On startup, prometheus_client discovers a collector's metric names at
    registration. When the collector does not define describe(), the default
    registry falls back to calling collect(). This meant REGISTRY.register()
    triggered a full DB collection synchronously, blocking the process for
    the entire query duration before the server could accept any requests.
  • Once running, if a Prometheus scrape arrived before the background thread
    had finished its first collection cycle, collect() fell through to a
    synchronous fresh collection — blocking the scrape and racing the
    background thread, doubling DB load on every scrape until the cache
    was warm.
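
The startup stall can be reproduced without a database. The sketch below is a stdlib-only stand-in, not prometheus_client itself: `register` is a hypothetical function that mimics the documented registration fallback (use `describe()` if the collector has one, otherwise call `collect()`), and both collector classes are hypothetical, with `collect()` standing in for the DB queries.

```python
class SlowCollector:
    """Hypothetical collector; collect() stands in for expensive DB queries."""

    def __init__(self):
        self.collect_calls = 0

    def collect(self):
        self.collect_calls += 1  # a real collector would run its queries here
        return ["metric_a", "metric_b"]


class DescribedCollector(SlowCollector):
    """Same collector, but with describe() defined."""

    def describe(self):
        return []


def register(collector):
    """Stand-in for registration: prefer describe(), else fall back to collect()."""
    describe = getattr(collector, "describe", None)
    names_source = describe if describe is not None else collector.collect
    return list(names_source())


slow = SlowCollector()
register(slow)            # falls back to collect(): one full collection at startup

described = DescribedCollector()
register(described)       # describe() answers instead: no collection at startup
```

After this runs, `slow.collect_calls` is 1 and `described.collect_calls` is 0, which is the difference the describe() override relies on.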

Three coordinated fixes:

  • Override describe() to return [], bypassing the implicit collect()
    call during REGISTRY.register() and letting the server start immediately.
  • In collect(), return empty immediately when the cache is enabled but not
    yet populated, instead of falling through to a synchronous collection.
  • Move _start_background_collection() before REGISTRY.register() so the
    cache begins warming as early as possible, shrinking the cold window.
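
The three changes can be sketched together as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: `CachedCollector`, `slow_fetch`, the `_warm` event, and the tuple-shaped metrics are all hypothetical, and the registry call is only indicated in a comment so the sketch runs with the standard library alone.

```python
import threading


class CachedCollector:
    """Hypothetical collector illustrating the three changes."""

    def __init__(self, fetch, cache_enabled=True, interval=30.0):
        self._fetch = fetch                    # stands in for the DB queries
        self._cache_enabled = cache_enabled
        self._interval = interval
        self._cache = None                     # None means the cache is cold
        self._lock = threading.Lock()
        self._warm = threading.Event()         # set after the first cycle
        self._stop_collection = threading.Event()
        self._thread = None

    def describe(self):
        # Change 1: nothing to describe, so registration never calls collect().
        return []

    def collect(self):
        if not self._cache_enabled:
            return list(self._fetch())         # cache disabled: collect fresh
        with self._lock:
            cached = self._cache
        if cached is None:
            return []                          # Change 2: cold cache, do not block
        return list(cached)                    # warm cache: serve cached values

    def _collect_loop(self):
        while not self._stop_collection.is_set():
            fresh = list(self._fetch())
            with self._lock:
                self._cache = fresh
            self._warm.set()
            self._stop_collection.wait(self._interval)

    def _start_background_collection(self):
        self._thread = threading.Thread(target=self._collect_loop, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop_collection.set()
        if self._thread is not None:
            self._thread.join()


release = threading.Event()


def slow_fetch():
    release.wait()                             # simulates a slow DB query
    return [("users_total", 42)]


collector = CachedCollector(slow_fetch)
collector._start_background_collection()      # Change 3: warm before registering
# A real exporter would call REGISTRY.register(collector) at this point.
cold_result = collector.collect()              # returns at once while the fetch is blocked
release.set()                                  # let the simulated query finish
collector._warm.wait(timeout=5)
warm_result = collector.collect()
collector.stop()
```

Here `cold_result` is an empty list and `warm_result` holds the fetched values. The trade-off is that a scrape arriving during the cold window reports no metrics from this collector rather than waiting for them.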

Test plan

  • Tested on a production server: metrics are reported as before
  • No errors in container logs; collection time is unchanged

Three coordinated changes to ensure Prometheus scrapes never block on a
synchronous DB collection:

- Override describe() to return [] so REGISTRY.register() skips the
  implicit collect() call that would fire full DB queries at startup.
- In collect(), return empty immediately when cache is enabled but not
  yet populated, instead of falling through to a synchronous collection
  that would block the scrape and race the background thread.
- Start _start_background_collection() before REGISTRY.register() so
  the cache begins warming as early as possible, shrinking the cold
  window.

Co-authored-by: Cursor <cursoragent@cursor.com>
Odrec merged commit 0b2e2ab into virtUOS:main on May 8, 2026
10 checks passed
Odrec added a commit that referenced this pull request on May 8, 2026
Two small follow-ups to the cold-cache fix in #62:

- Move the "Cache configuration" logger.info block out of
  _collect_loop. It sat after the `while not self._stop_collection.is_set()`
  loop, so it only printed during shutdown — which made the cache
  config invisible at startup. Moved to __init__ alongside the
  existing "Metric groups configuration" block so both configs log
  together when the collector is constructed.

- Update the collect() docstring to reflect the three-way behavior
  introduced by #62: cache-warm serves cached, cache-cold yields
  nothing, cache-disabled collects fresh.

Co-authored-by: Odrec <odrec@Odrecs-MacBook-Pro.local>
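
The logging follow-up comes down to control flow: a statement placed after a `while not stop.is_set()` loop is reached only once the stop event is set. A minimal sketch, assuming a hypothetical `run_loop` helper that records messages in a list instead of calling logger.info and that simulates a shutdown after two cycles:

```python
import threading


def run_loop(log, log_config_first):
    """Hypothetical loop; log is any callable that records a message."""
    stop = threading.Event()
    state = {"cycles": 0}
    if log_config_first:
        # Shape after the follow-up: logged up front (the commit uses __init__).
        log("Cache configuration")
    while not stop.is_set():
        state["cycles"] += 1
        log(f"cycle {state['cycles']}")
        if state["cycles"] == 2:
            stop.set()                # simulate a shutdown request
    if not log_config_first:
        # Shape before the follow-up: reached only after the loop exits.
        log("Cache configuration")


before = []
run_loop(before.append, log_config_first=False)

after = []
run_loop(after.append, log_config_first=True)
```

In `before`, the configuration message is the last entry; in `after`, it is the first, which matches the intent of logging both configuration blocks when the collector is constructed.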
