Improve scraping on recent releases by kekkokk · Pull Request #472 · g0ldyy/comet

kekkokk · 2026-01-11T14:52:57Z

Improve scraping of recent releases with a more aggressive TTL and a fallback to live search when no debrid-cached results are returned.

#466

Summary by CodeRabbit

Release Notes

New Features
- Added intelligent cache management that adapts Time-To-Live (TTL) based on content release date for recently released items.
- Introduced configurable fallback cache behavior to trigger more frequent searches when torrents lack debrid caching.
- Enhanced content availability through dynamic cache refresh logic.


coderabbitai · 2026-01-11T14:53:06Z

Walkthrough

This PR adds release-date-aware caching logic for torrent searches. It introduces three new configurable TTL parameters and implements conditional cache freshness checks based on content release recency and debrid availability, with fallback scraping when needed.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration & Settings: `.env-sample`, `comet/core/models.py` | Added three new optional configuration parameters: `LIVE_TORRENT_CACHE_TTL_RECENT_RELEASE` (TTL override for recent releases), `RECENT_RELEASE_DAYS` (threshold for classifying recent content), and `LIVE_TORRENT_CACHE_TTL_NO_DEBRID` (fallback TTL when no debrid cached torrents exist). |
| Stream Endpoint Logic: `comet/api/endpoints/stream.py` | Implemented release-date-aware TTL computation by fetching release dates from cache and calculating effective TTL based on recency. Added fallback logic triggered when no release date is available and no debrid cached torrents are found, using distributed locking to coordinate concurrent scrape attempts. Includes post-scrape cache re-validation and enhanced logging for scraping decisions. |
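
The TTL selection summarized above could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: the function name `effective_cache_ttl`, its parameters, and the `<=` comparison against the threshold are hypothetical. Only the three setting names come from the PR.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def effective_cache_ttl(
    base_ttl: int,
    release_date: Optional[float],
    recent_release_ttl: Optional[int],
    recent_release_days: Optional[int],
    now: Optional[float] = None,
) -> int:
    """Return the TTL (in seconds) to use for the cache freshness check.

    Hypothetical helper: it falls back to base_ttl unless both optional
    settings are configured and the release date is known.
    """
    if release_date is None or recent_release_ttl is None or recent_release_days is None:
        return base_ttl
    current = time.time() if now is None else now
    age_days = (current - release_date) / SECONDS_PER_DAY
    # Assumption: a title counts as "recent" when its age is within the
    # threshold. A future release date gives a negative age and also matches.
    if age_days <= recent_release_days:
        return recent_release_ttl
    return base_ttl
```

With a base TTL of 3600 seconds, a recent-release TTL of 300 seconds, and a 7-day threshold, a title released 3 days ago would get the 300-second TTL, while one released 30 days ago keeps the base value.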

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client/API
    participant Cache as Release<br/>Date Cache
    participant Debrid as Debrid<br/>Service
    participant Lock as Distributed<br/>Lock
    participant Scraper as Live<br/>Scraper
    participant TorrentCache as Torrent<br/>Cache

    Client->>Cache: Fetch release_date for media_id
    Cache-->>Client: Return release_date (if available)
    
    alt Release Date Available
        Client->>Client: Compute effective_cache_ttl<br/>based on RECENT_RELEASE_DAYS
    else No Release Date
        Client->>Client: Use standard LIVE_TORRENT_CACHE_TTL
    end
    
    Client->>TorrentCache: Check cache freshness<br/>using effective_cache_ttl
    TorrentCache-->>Client: Return cached_count
    
    alt Cache Stale & No Debrid Cached Torrents
        Client->>Lock: Acquire distributed lock
        Lock-->>Client: Lock acquired/waiting
        Client->>Scraper: Initiate live scrape
        Scraper->>Debrid: Search for torrents
        Debrid-->>Scraper: Return results
        Scraper->>TorrentCache: Update cache
        TorrentCache-->>Scraper: Confirm
        Scraper-->>Client: Scrape complete
        Client->>Lock: Release lock
        Client->>TorrentCache: Re-check cache<br/>after scrape
        TorrentCache-->>Client: Return updated counts
    else Cache Valid or Debrid Torrents Present
        Client-->>Client: Use existing cache
    end
    
    Client-->>Client: Return results with appropriate TTL
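
One plausible reading of the branch in the diagram above, written as a pure decision function. The name `needs_live_scrape`, the `cache_age` parameter, and the exact order of the checks are assumptions made for illustration; the PR's actual conditions in `stream.py` may differ.

```python
from typing import Optional


def needs_live_scrape(
    cache_age: Optional[float],
    effective_ttl: int,
    debrid_cached_count: int,
    no_debrid_ttl: Optional[int] = None,
) -> bool:
    """Decide whether a live scrape should be triggered (hypothetical helper).

    cache_age is the number of seconds since the torrent cache was last
    refreshed, or None when nothing is cached for this media item.
    """
    # Nothing cached at all: there is nothing to serve, so scrape.
    if cache_age is None:
        return True
    # Cache is stale under the normal (possibly release-aware) TTL.
    if cache_age > effective_ttl:
        return True
    # Fallback: the cache is fresh, but none of it is debrid-cached.
    # Assumption: the shorter no-debrid TTL is applied only in this case.
    if debrid_cached_count == 0 and no_debrid_ttl is not None:
        return cache_age > no_debrid_ttl
    return False
```

Keeping the decision separate from the locking and scraping side effects makes each branch of the diagram easy to test in isolation.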

Possibly related PRs

feat: add DEBRID_CACHE_CHECK_RATIO setting and update availability ch… #395: Modifies debrid-availability checks and cached_count logic in the same stream endpoint, introducing threshold-based caching decisions alongside this PR's release-aware TTL approach.
merge dev into main #386: Adjusts live-torrent cache TTL handling in stream.py and cache-related settings in models.py, directly overlapping with this PR's configuration and caching logic.
feat: refactor live torrent caching to differentiate between displaying existing results and triggering new scrapes, and update default cache TTLs #435: Modifies cache freshness validation and scrape-trigger behavior in stream.py, affecting the same control flow paths as this PR's TTL computation and fallback logic.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: introducing release-date aware TTL logic and no-debrid fallback scraping to improve scraping behavior for recent releases. |


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

comet/api/endpoints/stream.py (1)
540-616: Consider extracting the live scrape + debrid re-check pattern into a helper function.

This fallback block duplicates logic from earlier in the function (lock acquisition, scraping, debrid availability check). The complexity is manageable but could benefit from extraction for maintainability.

Additionally, the cached_count == 0 check on line 572 is redundant since it's already ensured by the outer condition on line 544.
♻️ Suggested simplification for line 572
-                if cached_count == 0 and not needs_scraping and not lock_acquired:
+                if not needs_scraping and not lock_acquired:
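
The equivalence behind this suggestion can be checked exhaustively. The sketch below assumes, as the comment states, that the outer condition guarantees `cached_count == 0` and that the variable is not reassigned before the inner check; the function names are made up for the demonstration.

```python
from itertools import product


def original(cached_count: int, needs_scraping: bool, lock_acquired: bool) -> bool:
    return cached_count == 0 and not needs_scraping and not lock_acquired


def simplified(needs_scraping: bool, lock_acquired: bool) -> bool:
    return not needs_scraping and not lock_acquired


def equivalent_inside_outer_block() -> bool:
    # Inside the outer block cached_count is known to be 0 (assumption above),
    # so only the two booleans can vary.
    cached_count = 0
    return all(
        original(cached_count, scraping, locked) == simplified(scraping, locked)
        for scraping, locked in product([False, True], repeat=2)
    )
```

If `cached_count` could change between the outer and inner checks, the two forms would no longer be interchangeable and the extra test would need to stay.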

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 73e383e and 1f0ebbb.

📒 Files selected for processing (3)

.env-sample
comet/api/endpoints/stream.py
comet/core/models.py

🧰 Additional context used

🧬 Code graph analysis (1)

comet/api/endpoints/stream.py (4)

- comet/core/db_router.py (1): fetch_val (84-87)
- comet/services/lock.py (3): DistributedLock (9-113), acquire (25-81), release (83-94)
- comet/services/orchestration.py (2): scrape_torrents (59-105), get_cached_torrents (107-156)
- comet/services/debrid.py (1): check_existing_availability (78-116)

🔇 Additional comments (4)

.env-sample (1)

74-89: LGTM!

The new environment variables are well-documented with clear explanations of their purpose and interaction with existing TTL settings. The "disabled by default" approach is appropriate for optional features that change caching behavior.

comet/core/models.py (1)

45-47: LGTM!

The new configuration fields are correctly typed as Optional[int] with None defaults, matching the .env-sample documentation. Logical placement alongside related TTL settings.
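
For illustration, optional integer settings with `None` defaults can be modeled as below. This sketch uses a standard-library dataclass and a hand-written environment parser so that it runs on its own; the class name `CacheSettings` and the helper `_optional_int` are hypothetical, and the real `comet/core/models.py` may declare and load these fields differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _optional_int(env: Mapping[str, str], name: str) -> Optional[int]:
    """Return the variable as an int, or None when it is unset or empty."""
    raw = env.get(name, "").strip()
    return int(raw) if raw else None


@dataclass(frozen=True)
class CacheSettings:
    # None means "feature disabled", matching the disabled-by-default approach.
    LIVE_TORRENT_CACHE_TTL_RECENT_RELEASE: Optional[int] = None
    RECENT_RELEASE_DAYS: Optional[int] = None
    LIVE_TORRENT_CACHE_TTL_NO_DEBRID: Optional[int] = None

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "CacheSettings":
        source = os.environ if env is None else env
        return cls(
            LIVE_TORRENT_CACHE_TTL_RECENT_RELEASE=_optional_int(
                source, "LIVE_TORRENT_CACHE_TTL_RECENT_RELEASE"
            ),
            RECENT_RELEASE_DAYS=_optional_int(source, "RECENT_RELEASE_DAYS"),
            LIVE_TORRENT_CACHE_TTL_NO_DEBRID=_optional_int(
                source, "LIVE_TORRENT_CACHE_TTL_NO_DEBRID"
            ),
        )
```

Calling `CacheSettings.from_env({"RECENT_RELEASE_DAYS": "14"})` sets only the threshold and leaves both TTL overrides as `None`.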

comet/api/endpoints/stream.py (2)

283-300: LGTM!

The effective TTL is correctly integrated into the fresh cache count query. The outer condition properly checks if the TTL feature is enabled (>= 0), while the inner query uses the computed effective_cache_ttl which accounts for recent releases.
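
The two-level check described here might be sketched as follows, using an in-memory SQLite database. The `torrents` table, its columns, and the function names are assumptions for the sake of a runnable example; they are not taken from comet's schema or query code.

```python
import sqlite3
import time
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway database with a hypothetical torrents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE torrents (media_id TEXT, timestamp INTEGER)")
    conn.executemany(
        "INSERT INTO torrents VALUES (?, ?)",
        [("tt001", 100), ("tt001", 900), ("tt002", 950)],
    )
    return conn


def fresh_cache_count(
    conn: sqlite3.Connection,
    media_id: str,
    base_ttl: int,
    effective_ttl: int,
    now: Optional[float] = None,
) -> Optional[int]:
    """Count cached rows still fresh under effective_ttl.

    Returns None when the TTL feature is disabled (base_ttl < 0).
    """
    # Outer condition: the feature is enabled only for a non-negative base TTL.
    if base_ttl < 0:
        return None
    current = time.time() if now is None else now
    # Inner query: freshness is judged with the effective (computed) TTL.
    row = conn.execute(
        "SELECT COUNT(*) FROM torrents WHERE media_id = ? AND timestamp + ? >= ?",
        (media_id, effective_ttl, current),
    ).fetchone()
    return row[0]
```

Separating the enable check (base TTL) from the freshness window (effective TTL) means a shorter TTL for recent releases never switches the feature on by itself.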

249-277: No issue found. The `release_date` is correctly stored as a Unix timestamp (BIGINT in the database). It's populated via `datetime.strptime(release_date_str, "%Y-%m-%d").timestamp()` in comet/metadata/filter.py (lines 76-77), which converts date strings to Unix timestamps in seconds. The calculation `(time.time() - release_date) / 86400` on line 268 is correct for converting the difference to days.
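
The two expressions quoted in this comment can be exercised in isolation, as in the sketch below. The function names `release_timestamp` and `age_in_days` and the injectable `now` parameter are hypothetical wrappers added for testing; only the `strptime` format string and the division by 86400 come from the review.

```python
import time
from datetime import datetime
from typing import Optional

SECONDS_PER_DAY = 86400


def release_timestamp(release_date_str: str) -> float:
    """Convert a YYYY-MM-DD string to a Unix timestamp in seconds.

    strptime returns a naive datetime, which timestamp() interprets in the
    local time zone of the machine running the code.
    """
    return datetime.strptime(release_date_str, "%Y-%m-%d").timestamp()


def age_in_days(release_date: float, now: Optional[float] = None) -> float:
    """Return how many days have passed since release_date (may be fractional)."""
    current = time.time() if now is None else now
    return (current - release_date) / SECONDS_PER_DAY
```

Because the parsed date is anchored to local midnight, hosts in different time zones can disagree on the timestamp by up to a day, which is small relative to a threshold measured in days.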

Improve scraping on recent releases

1f0ebbb

coderabbitai bot reviewed Jan 11, 2026

kekkokk wants to merge 1 commit into g0ldyy:main from kekkokk:feature/dynamic_ttl
