
feat: Per-user custom catalog addon support #550

Open
jemitoburt wants to merge 5 commits into g0ldyy:main from jemitoburt:main

Conversation


@jemitoburt jemitoburt commented Mar 1, 2026

Summary

Adds the ability for each Comet user to configure one or more external Stremio-compatible catalog addons directly on the /configure page. Custom catalog prefixes (e.g. csfd) are routed to the user-specified addon URL, while the tt prefix continues to be handled by Cinemeta as normal.

Motivation

Comet users running adjacent addons (e.g. Kinobox) want to browse those catalogs and play streams through Comet's debrid infrastructure. Previously this required server-side env var configuration and a restart. This PR makes it a zero-config per-user setting stored in the base64 config URL.

Changes

comet/templates/index.html — Configure page UI

A new "Custom Catalogs" collapsible section is added to the configure page. Users can:

  • Add any number of catalog entries with Catalog URL + ID Prefix fields
  • Remove entries with a trash icon
  • Have their entries saved into the b64 config and restored when re-opening configure

The customCatalogs array ([{url, prefix}, ...]) is included in all generated manifest URLs and is restored when the user reconfigures.


comet/core/models.py - ConfigModel

customCatalogs: Optional[List[dict]] = []
  • New field on ConfigModel so the per-user b64 config preserves custom catalog entries (previously they were silently dropped during validation)
  • Pydantic field_validator sanitizes each entry: strips whitespace, removes entries with empty URL or prefix, and blocks overriding built-in prefixes tt and kitsu
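The sanitizing rules in the bullets above can be sketched as a plain function that a Pydantic field_validator could delegate to. This is a minimal sketch, not the PR's actual validator: the function name sanitize_custom_catalogs is hypothetical, and the trailing-slash removal is taken from the review's description of the validator rather than from this summary.

```python
from typing import Any, List

# Built-in prefixes that a custom catalog must not override (named in the PR).
BUILTIN_PREFIXES = ("tt", "kitsu")


def sanitize_custom_catalogs(value: Any) -> List[dict]:
    """Keep only well-formed {url, prefix} entries; silently drop the rest.

    Hypothetical helper: the real validator lives on ConfigModel.
    """
    if not isinstance(value, list):
        return []
    cleaned = []
    for entry in value:
        if not isinstance(entry, dict):
            continue
        # Normalize: strip whitespace, drop a trailing slash from the URL.
        url = str(entry.get("url") or "").strip().rstrip("/")
        prefix = str(entry.get("prefix") or "").strip()
        if not url or not prefix:
            continue  # an empty URL or prefix makes the entry unusable
        if prefix in BUILTIN_PREFIXES:
            continue  # never let a user entry shadow tt / kitsu
        cleaned.append({"url": url, "prefix": prefix})
    return cleaned
```

Returning a fresh list of fresh dicts also means unknown keys in a user-supplied entry do not survive validation.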

comet/api/endpoints/manifest.py — Manifest endpoint

When the user has customCatalogs configured:

  • Adds catalog entries to the manifest (cstm{idx}_{prefix}_movie, cstm{idx}_{prefix}_series) so Stremio shows the catalogs
  • Adds custom idPrefixes to the stream resource so Stremio sends stream requests for those IDs to Comet (e.g. Stremio will call /stream/movie/csfd12345.json on Comet instead of ignoring it)
  • No changes to manifest output when customCatalogs is empty — fully backward-compatible
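As an illustration of the manifest changes listed above, the following sketch builds the extra catalog entries and idPrefixes from a customCatalogs list. It assumes the cstm{idx}_{prefix}_{type} naming from this description; the function name and the display name given to each catalog are assumptions, not the PR's code.

```python
from typing import List, Tuple

BUILTIN_PREFIXES = ("tt", "kitsu")


def build_custom_catalog_manifest(
    custom_catalogs: List[dict],
) -> Tuple[List[dict], List[str]]:
    """Return (catalog entries, extra idPrefixes) for the manifest."""
    catalogs: List[dict] = []
    extra_prefixes: List[str] = []
    seen = set()
    for idx, entry in enumerate(custom_catalogs):
        url = (entry.get("url") or "").strip()
        prefix = (entry.get("prefix") or "").strip()
        if not url or not prefix or prefix in BUILTIN_PREFIXES:
            continue
        # idx is the position in the user's config, so the proxy can map
        # the catalog ID back to the right entry later.
        for media_type in ("movie", "series"):
            catalogs.append({
                "type": media_type,
                "id": f"cstm{idx}_{prefix}_{media_type}",
                "name": f"{prefix} ({media_type})",  # assumed display name
            })
        if prefix not in seen:  # advertise each prefix only once
            seen.add(prefix)
            extra_prefixes.append(prefix)
    return catalogs, extra_prefixes
```

With an empty list the function returns two empty lists, which matches the backward-compatibility claim that the manifest is unchanged when nothing is configured.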

comet/api/endpoints/custom_catalog.py — New file

New FastAPI router with:

| Endpoint | Purpose |
| --- | --- |
| GET /{b64config}/catalog/{type}/{id}.json | Proxy catalog page requests to user addon |
| GET /{b64config}/catalog/{type}/{id}/{extra:path}.json | Proxy with extra params (search, skip, genre…) |

Catalog ID format: cstm{idx}_{prefix}_{type} → mapped back to {prefix}_{type} on the upstream addon.
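A small parser makes the mapping concrete. This is a sketch under two assumptions: the catalog type is always movie or series (the only two the manifest section mentions), and the helper names are hypothetical. Matching the index with \d+ also rejects negative values such as cstm-1_..., which would otherwise select the wrong config entry through Python's negative indexing.

```python
import re
from typing import Optional, Tuple

# cstm{idx}_{prefix}_{type}; the prefix itself may contain underscores.
_CATALOG_ID_RE = re.compile(r"cstm(\d+)_(.+)_(movie|series)")


def parse_catalog_id(catalog_id: str) -> Optional[Tuple[int, str, str]]:
    """Split a Comet catalog ID into (idx, prefix, type), or None if invalid."""
    match = _CATALOG_ID_RE.fullmatch(catalog_id)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)


def upstream_catalog_id(catalog_id: str) -> Optional[str]:
    """Map cstm{idx}_{prefix}_{type} back to {prefix}_{type}."""
    parsed = parse_catalog_id(catalog_id)
    if parsed is None:
        return None
    _idx, prefix, media_type = parsed
    return f"{prefix}_{media_type}"
```

The caller still has to check that idx is within the bounds of the user's customCatalogs list before indexing into it.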

Also exports resolve_custom_prefix_to_imdb(media_type, media_id, custom_catalogs) — calls /meta/{type}/{base_id}.json on the addon and tries several common response paths to extract the IMDB tt... ID. Note: Suffixes like :1:2 (used for episodes) are automatically stripped before calling the custom catalog's /meta endpoint to prevent 404s.

Uses the shared http_client_manager (consistent with rest of codebase).


comet/api/endpoints/stream.py — Stream endpoint

Before calling parse_media_id, the stream handler now checks if media_id starts with any of the user's custom prefixes. If so:

  1. Calls resolve_custom_prefix_to_imdb to get the IMDB ID
  2. If resolved → replaces media_id with tt... and continues the normal stream flow
  3. If not resolved → returns empty stream list (graceful failure, no crash)
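The three steps above can be sketched as a single async helper. It is an illustration only: the resolver is passed in as a callable with a simplified, hypothetical signature (the real resolve_custom_prefix_to_imdb also takes the catalog list), and demo_resolver is a stand-in that knows one made-up ID.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Optional

# Hypothetical resolver signature: (media_type, media_id) -> IMDb ID or None.
Resolver = Callable[[str, str], Awaitable[Optional[str]]]


async def resolve_stream_media_id(
    media_type: str,
    media_id: str,
    custom_prefixes: Iterable[str],
    resolver: Resolver,
) -> Optional[str]:
    """Return the ID to continue with, or None to signal an empty stream list."""
    if not any(p and media_id.startswith(p) for p in custom_prefixes):
        return media_id  # not a custom ID: the normal tt path is untouched
    imdb_id = await resolver(media_type, media_id)  # step 1: ask the addon
    if imdb_id:
        return imdb_id  # step 2: continue the normal flow with tt...
    return None  # step 3: caller responds with an empty stream list


async def demo_resolver(media_type: str, media_id: str) -> Optional[str]:
    """Stand-in for the addon lookup; knows exactly one made-up ID."""
    return "tt0111161" if media_id == "csfd123" else None
```

For example, asyncio.run(resolve_stream_media_id("movie", "csfd123", ["csfd"], demo_resolver)) yields "tt0111161", while an unknown csfd ID yields None.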

The tt prefix path is completely unchanged.


comet/api/app.py

Imports custom_catalog and registers its router in stremio_routers.


Backward Compatibility

| Scenario | Behavior |
| --- | --- |
| No customCatalogs in user config | Manifest unchanged, stream unchanged, no extra routes triggered |
| Old b64 configs without customCatalogs | Field defaults to [], no effect |
| tt / kitsu prefix in custom catalog | Validator silently drops the entry |
| Addon URL / Meta unreachable | Returns {"metas": []} / empty streams (logged warning, no crash) |

Notes

IMDB ID parsing

resolve_custom_prefix_to_imdb attempts these paths in the addon's /meta response:

meta.imdbId  ·  meta.imdb  ·  meta.tt
meta.externalIds.imdb  ·  meta.externalIds.imdbId
meta.filmOverviewOut.imdbId
meta.filmOverviewOut.externalIds.imdb

If your addon stores the IMDB ID elsewhere, update the candidate list in resolve_custom_prefix_to_imdb in custom_catalog.py.
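One way to keep that candidate list easy to extend is to store the paths as data and walk them with a defensive lookup. The sketch below assumes the seven paths listed above; IMDB_ID_PATHS, safe_get and extract_imdb_id are hypothetical names, not the identifiers used in custom_catalog.py.

```python
from typing import Any, Optional, Sequence, Tuple

# Candidate locations inside the addon's "meta" object, tried in order.
IMDB_ID_PATHS: Sequence[Tuple[str, ...]] = (
    ("imdbId",),
    ("imdb",),
    ("tt",),
    ("externalIds", "imdb"),
    ("externalIds", "imdbId"),
    ("filmOverviewOut", "imdbId"),
    ("filmOverviewOut", "externalIds", "imdb"),
)


def safe_get(data: Any, *keys: str) -> Any:
    """Follow keys through nested dicts; return None if any step is not a dict."""
    for key in keys:
        if not isinstance(data, dict):
            return None
        data = data.get(key)
    return data


def extract_imdb_id(meta: Any) -> Optional[str]:
    """Return the first candidate that looks like an IMDb ID (starts with tt)."""
    for path in IMDB_ID_PATHS:
        candidate = safe_get(meta, *path)
        if candidate and str(candidate).startswith("tt"):
            return str(candidate)
    return None
```

Supporting another addon then only means appending one more tuple to IMDB_ID_PATHS, and malformed upstream JSON (for example a string where a dict is expected) falls through to None instead of raising.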

Catalog ID proxy convention

The proxy constructs the upstream catalog ID as {prefix}_{catalog_type} (e.g. csfd_movie). If your addon uses a different catalog ID scheme, the proxy URL will need adjustment.
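As a rough illustration of that convention, the upstream request URL could be assembled as follows. The path shape follows the Stremio addon protocol (/catalog/{type}/{id}.json, with an optional extra segment); the function name is hypothetical, and it assumes the configured URL is the addon's base URL without a trailing /manifest.json.

```python
from typing import Optional


def build_upstream_catalog_url(
    base_url: str,
    prefix: str,
    catalog_type: str,
    extra: Optional[str] = None,
) -> str:
    """Build the addon URL for a {prefix}_{catalog_type} catalog request."""
    base = base_url.rstrip("/")
    catalog_id = f"{prefix}_{catalog_type}"  # e.g. csfd_movie
    if extra:
        # extra carries search / skip / genre parameters, passed through as-is
        return f"{base}/catalog/{catalog_type}/{catalog_id}/{extra}.json"
    return f"{base}/catalog/{catalog_type}/{catalog_id}.json"
```

An addon with a different catalog ID scheme would only need the catalog_id line changed.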

Files Changed

| File | Type |
| --- | --- |
| comet/templates/index.html | Modified |
| comet/core/models.py | Modified |
| comet/api/endpoints/manifest.py | Modified |
| comet/api/endpoints/custom_catalog.py | New |
| comet/api/endpoints/stream.py | Modified |
| comet/api/app.py | Modified |
… add new documentation files.

Summary by CodeRabbit

  • New Features

    • Custom Catalogs UI in Settings to add/configure/remove external catalogs (URL + ID prefix).
    • Public proxy endpoints to expose configured catalogs in app catalogs and generated manifest.
  • Improvements

    • Manifest now injects generated catalog entries and extra ID prefixes for custom catalogs.
    • Stream/metadata flow can resolve custom-prefix IDs to standard IDs and surface custom metadata.
    • Catalog proxy responses include permissive CORS.
  • Bug Fixes

    • Validation, safety checks (SSRF protections), index/config bounds checks, and safe fallbacks with improved logging.


coderabbitai bot commented Mar 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a Custom Catalogs feature: new FastAPI proxy endpoints for user-configured external catalogs, manifest injection of custom catalog entries and idPrefixes, ConfigModel support and validation for customCatalogs, UI to manage catalogs, and stream flow integration to resolve custom-prefix IDs to IMDB or metadata.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| App routing (comet/api/app.py) | Register new custom_catalog router into Stremio/public API routers. |
| Custom catalog API (comet/api/endpoints/custom_catalog.py) | New APIRouter exposing /catalog and /catalog_with_extra proxy endpoints, SSRF/DNS safety checks, upstream JSON fetch helper, catalog-id parsing/validation, and exported resolve_custom_prefix_to_imdb. |
| Manifest generation (comet/api/endpoints/manifest.py) | Added _build_custom_catalog_manifest and logic to inject generated custom catalog entries and extend idPrefixes in manifest when customCatalogs present. |
| Stream resolution (comet/api/endpoints/stream.py) | Integrated resolve_custom_prefix_to_imdb into stream flow to translate custom-prefix IDs to IMDB or metadata-only (title/year), adjust metadata-fetch choice, and preserve early-fail responses. |
| Config models & validation (comet/core/models.py) | Added BUILTIN_PREFIXES and customCatalogs: Optional[List[dict]] = [] on ConfigModel plus a @field_validator to sanitize entries and reject builtin prefixes. |
| Web UI (comet/templates/index.html) | Added UI/JS to add/remove custom catalogs, persist customCatalogs in settings, and include them in manifest installation payload. |

Sequence Diagram

sequenceDiagram
    autonumber
    participant User
    participant UI as Web UI
    participant Config as ConfigModel
    participant Manifest as Manifest Endpoint
    participant Stream as Stream Endpoint
    participant CustomAPI as Custom Catalog Endpoint
    participant Addon as External Addon

    User->>UI: Add custom catalog (URL, prefix)
    UI->>Config: Save customCatalogs
    User->>Manifest: Request manifest
    Manifest->>Config: Read customCatalogs
    Manifest->>Manifest: Build custom catalog entries & idPrefixes
    Manifest-->>User: Return manifest
    User->>Stream: Request stream for custom-prefixed media ID
    Stream->>Config: Read customCatalogs
    Stream->>CustomAPI: Resolve custom prefix -> IMDB or metadata
    CustomAPI->>Addon: Query addon's /meta endpoint
    Addon-->>CustomAPI: Return metadata
    CustomAPI-->>Stream: Return IMDB ID or title/year
    Stream->>Stream: Fetch metadata/streams using resolved info
    Stream-->>User: Return streams


🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.81% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: Per-user custom catalog addon support' accurately and specifically summarizes the main change: adding per-user support for external Stremio-compatible catalog addons configurable from the /configure page. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (6)
comet/core/models.py (1)

1012-1034: LGTM! Robust validation for customCatalogs.

The validator properly sanitizes input by:

  • Handling edge cases (None, non-list, non-dict entries)
  • Normalizing URL (strip whitespace, remove trailing slash) and prefix (strip whitespace)
  • Blocking built-in prefixes ("tt", "kitsu")

Note: The built-in prefix check (prefix not in ("tt", "kitsu")) duplicates _BUILTIN_PREFIXES in manifest.py. Consider extracting to a shared constant for maintainability, though this is minor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/core/models.py` around lines 1012 - 1034, The validator for
customCatalogs in validate_custom_catalogs currently hardcodes the built-in
prefixes tuple ("tt", "kitsu"), duplicating _BUILTIN_PREFIXES from manifest.py;
refactor to import and use the shared constant instead of the inline tuple:
replace the hardcoded check in validate_custom_catalogs with a reference to the
shared _BUILTIN_PREFIXES (or a new exported constant in manifest.py), update
imports accordingly, and ensure the behavior/validation remains identical for
the customCatalogs field and customCatalogs attribute.
comet/api/endpoints/stream.py (1)

478-485: Move import re to module level.

The re module is imported inline within the function body. This works but is unconventional and adds overhead on each invocation. Module-level imports are preferred for clarity and performance.

♻️ Move import to top of file

Add at the top of the file with other imports:

import re

Then remove the inline import:

                     # Parse year from "2026-2026" or "2026" or "2026-"
                     year_str = str(resolved_meta.get("year")
                                    or resolved_meta.get("releaseInfo") or "")
                     if year_str:
                         # try to get the first 4 digits
-                        import re
                         match = re.search(r'\d{4}', year_str)
                         if match:
                             custom_meta_year = int(match.group(0))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/stream.py` around lines 478 - 485, The inline import of
re inside the block that builds year_str should be moved to the module level for
performance and style: add import re alongside the other top-of-file imports,
then remove the inline "import re" from the code that computes
year_str/custom_meta_year (the block referencing year_str, resolved_meta, match,
and custom_meta_year) so the function uses the module-level re.
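With re imported at module level, the year parsing from the snippet above could be factored into a small helper. This is a sketch, not the PR's code: parse_first_year is a hypothetical name, and the pattern is precompiled once as a module constant.

```python
import re
from typing import Optional

# Compiled once at import time rather than on every request.
_YEAR_RE = re.compile(r"\d{4}")


def parse_first_year(meta: dict) -> Optional[int]:
    """Return the first 4-digit run from meta's year or releaseInfo, if any."""
    year_str = str(meta.get("year") or meta.get("releaseInfo") or "")
    match = _YEAR_RE.search(year_str)
    return int(match.group(0)) if match else None
```

This handles the shapes named in the original comment ("2026-2026", "2026", "2026-") as well as an integer year.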
comet/api/endpoints/custom_catalog.py (3)

36-42: Simplify aiohttp.ClientTimeout usage.

Line 38 uses __import__("aiohttp").ClientTimeout(total=timeout) despite aiohttp already being imported at the top of the file. This is unnecessarily obscure.

♻️ Use direct import reference
         async with session.get(
             url,
-            timeout=__import__("aiohttp").ClientTimeout(total=timeout),
+            timeout=aiohttp.ClientTimeout(total=timeout),
             headers={"Accept": "application/json"},
         ) as resp:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` around lines 36 - 42, The timeout
construction uses an indirect import; replace the obscure
__import__("aiohttp").ClientTimeout(total=timeout) with the direct
aiohttp.ClientTimeout(total=timeout) where the session.get call is made (inside
the async with session.get(...) block in whatever function contains that call)
so it uses the already-imported aiohttp symbol.

46-54: Consider logging the exception in the inner try-except block.

The static analyzer flags the try-except-pass pattern at lines 49-50. While the error body logging is best-effort, silently swallowing the exception makes debugging harder. Consider logging at debug level.

🔧 Optional: Log exception at debug level
             try:
                 text = await resp.text()
                 logger.warning(f"Custom catalog err body: {text}")
-            except Exception:
-                pass
+            except Exception as e:
+                logger.debug(f"Custom catalog: could not read error body: {e}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` around lines 46 - 54, The inner
try-except that reads the response body swallows exceptions; update the except
block (the one around await resp.text() in custom_catalog.py) to log the caught
exception at debug level instead of passing — e.g., call logger.debug with a
message like "failed reading response body" and include exception info
(exc_info=True or use logger.exception) so debugging async response parsing
issues is possible while keeping the earlier logger.warning lines unchanged.

104-104: Minor: EN DASH characters in comments.

Static analysis flagged EN DASH (–) instead of HYPHEN-MINUS (-) at lines 104 and 136. These are cosmetic issues that may cause problems in some editors or linters.

Also applies to: 136-136

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` at line 104, Replace the EN DASH
character with a standard hyphen-minus in the inline comments that contain "Try
common locations for IMDB ID – adjust to actual API response structure" and the
other similar comment around line 136 in custom_catalog.py; search for the exact
comment text ("IMDB ID –") and update it to use "IMDB ID -" so editors/linters
see a normal hyphen-minus rather than an EN DASH.
comet/api/endpoints/manifest.py (1)

16-55: LGTM! Well-structured helper for manifest generation.

The logic correctly:

  • Skips entries with missing URL/prefix or built-in prefixes
  • Generates unique catalog entries per custom addon with proper naming convention
  • Deduplicates prefixes using seen_prefixes set

Minor nit from static analysis: lines 21-22 use EN DASH (–) instead of HYPHEN-MINUS (-) in the docstring, which can cause encoding issues in some editors.

🔧 Optional: Replace EN DASH with HYPHEN-MINUS in docstring
-    stremio_catalogs – catalog entries to include in the manifest
-    extra_id_prefixes – additional idPrefixes to advertise so Stremio sends
+    stremio_catalogs - catalog entries to include in the manifest
+    extra_id_prefixes - additional idPrefixes to advertise so Stremio sends
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/manifest.py` around lines 16 - 55, The docstring for
_build_custom_catalog_manifest contains EN DASH characters (e.g., in the phrases
"stremio_catalogs – catalog entries..." and "extra_id_prefixes – additional
idPrefixes..."); update those EN DASH characters to standard hyphen-minus (ASCII
'-') in the docstring so editors and tooling won't misinterpret the encoding,
leaving the rest of the function (stremio_catalogs, extra_prefixes,
seen_prefixes logic and return) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@comet/api/endpoints/custom_catalog.py`:
- Around line 36-42: The timeout construction uses an indirect import; replace
the obscure __import__("aiohttp").ClientTimeout(total=timeout) with the direct
aiohttp.ClientTimeout(total=timeout) where the session.get call is made (inside
the async with session.get(...) block in whatever function contains that call)
so it uses the already-imported aiohttp symbol.
- Around line 46-54: The inner try-except that reads the response body swallows
exceptions; update the except block (the one around await resp.text() in
custom_catalog.py) to log the caught exception at debug level instead of passing
— e.g., call logger.debug with a message like "failed reading response body" and
include exception info (exc_info=True or use logger.exception) so debugging
async response parsing issues is possible while keeping the earlier
logger.warning lines unchanged.
- Line 104: Replace the EN DASH character with a standard hyphen-minus in the
inline comments that contain "Try common locations for IMDB ID – adjust to
actual API response structure" and the other similar comment around line 136 in
custom_catalog.py; search for the exact comment text ("IMDB ID –") and update it
to use "IMDB ID -" so editors/linters see a normal hyphen-minus rather than an
EN DASH.

In `@comet/api/endpoints/manifest.py`:
- Around line 16-55: The docstring for _build_custom_catalog_manifest contains
EN DASH characters (e.g., in the phrases "stremio_catalogs – catalog entries..."
and "extra_id_prefixes – additional idPrefixes..."); update those EN DASH
characters to standard hyphen-minus (ASCII '-') in the docstring so editors and
tooling won't misinterpret the encoding, leaving the rest of the function
(stremio_catalogs, extra_prefixes, seen_prefixes logic and return) unchanged.

In `@comet/api/endpoints/stream.py`:
- Around line 478-485: The inline import of re inside the block that builds
year_str should be moved to the module level for performance and style: add
import re alongside the other top-of-file imports, then remove the inline
"import re" from the code that computes year_str/custom_meta_year (the block
referencing year_str, resolved_meta, match, and custom_meta_year) so the
function uses the module-level re.

In `@comet/core/models.py`:
- Around line 1012-1034: The validator for customCatalogs in
validate_custom_catalogs currently hardcodes the built-in prefixes tuple ("tt",
"kitsu"), duplicating _BUILTIN_PREFIXES from manifest.py; refactor to import and
use the shared constant instead of the inline tuple: replace the hardcoded check
in validate_custom_catalogs with a reference to the shared _BUILTIN_PREFIXES (or
a new exported constant in manifest.py), update imports accordingly, and ensure
the behavior/validation remains identical for the customCatalogs field and
customCatalogs attribute.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 63484ce and d9786f3.

📒 Files selected for processing (6)
  • comet/api/app.py
  • comet/api/endpoints/custom_catalog.py
  • comet/api/endpoints/manifest.py
  • comet/api/endpoints/stream.py
  • comet/core/models.py
  • comet/templates/index.html


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comet/api/endpoints/custom_catalog.py`:
- Around line 42-44: When parsing external JSON (the await resp.json(...) result
assigned to data), defensively validate that data is a mapping before using dict
methods like .get() or .keys(): check isinstance(data, dict) (or
collections.abc.Mapping), and if not, log a warning including the URL and
returned type, then return an empty dict or raise a controlled error instead of
proceeding; apply the same guard to the other parsing sites referenced around
the 102-114 and 199-200 ranges so all uses of data.get()/data.keys() are
protected.
- Around line 43-49: The logs currently print full upstream URLs and response
bodies (see logger.info, logger.warning with url and the resp.text() result),
which can leak secrets; update the error logging to redact or sanitize sensitive
data by not logging the full url or response body—log a sanitized identifier
(e.g., hostname or a hashed/trimmed version of url) and a truncated/sanitized
error summary instead, and include exception info via e or structured metadata
rather than the raw body; apply the same change to the other occurrences around
resp.status and the subsequent resp.text() handling so no full URLs or full
response bodies are emitted.
- Around line 135-136: The code parses idx = int(rest[:underscore_pos]) and then
uses idx to index customCatalogs but doesn't reject negative values, enabling
Python negative indexing (e.g., "cstm-1_...") which selects the wrong entry;
update the parsing/validation in the functions that compute idx (look for the
idx, rest, underscore_pos usage around the two spots in custom_catalog.py,
including the block at the shown lines and the one around 169-175) to check that
the parsed idx is >= 0 and within the bounds of customCatalogs before using it;
if the value is negative or out of range, return/raise an appropriate error
(e.g., HTTP 400 / ValueError) instead of proceeding to index the list.
- Around line 33-38: The _fetch_json function is making server-side requests to
user-controlled URLs, enabling SSRF; before calling session.get in _fetch_json
(and the other similar call sites), validate and reject targets that resolve to
private/internal addresses or localhost: perform DNS resolution of the hostname,
check the resolved IPv4/IPv6 addresses against private ranges (10.0.0.0/8,
172.16.0.0/12, 192.168.0.0/16), loopback (127.0.0.0/8, ::1), link-local
(169.254.0.0/16, fe80::/10), multicast, and unspecified (0.0.0.0/8, ::/128), and
also reject requests with hostnames that are raw IP literals in those ranges;
only allow requests that pass the network check (or match an explicit allowlist)
and return an error/None otherwise, doing this check synchronously before
calling session.get.
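For the log-redaction item in the list above (lines 43-49), one possible approach is to log only the scheme and host of the upstream URL. The sketch below uses the standard library's urlsplit; redact_url and the placeholder strings are hypothetical, and whether the host alone is acceptable to log is a judgment call for the maintainers.

```python
from urllib.parse import urlsplit


def redact_url(url: str) -> str:
    """Return scheme://host only, dropping userinfo, port, path and query."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return "<unparseable url>"
    if not parts.scheme or not parts.hostname:
        return "<redacted url>"
    # hostname excludes any user:password@ prefix and the port
    return f"{parts.scheme}://{parts.hostname}"
```

A warning could then read logger.warning(f"Custom catalog: HTTP {resp.status} from {redact_url(url)}") without echoing tokens embedded in the path or query string.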

In `@comet/api/endpoints/stream.py`:
- Around line 466-473: When resolve_custom_prefix_to_imdb returns a resolved_id,
the code currently replaces media_id with the bare IMDb id and drops any
season/episode suffix (e.g., ":1:2"), which later skews parse_media_id() and
searches; fix by extracting the original suffix from the incoming media_id (the
part after the custom-prefix mapping, e.g., everything after the first colon(s)
like ":1:2") and append that suffix to resolved_id before assigning media_id.
Update the replacement logic in the block that calls
resolve_custom_prefix_to_imdb (and the similar spot around the second call) so
that when resolved_id is truthy you set media_id = f"{resolved_id}{suffix}"
(preserving resolved_meta behavior unchanged).

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d9786f3 and 9c238b6.

📒 Files selected for processing (4)
  • comet/api/endpoints/custom_catalog.py
  • comet/api/endpoints/manifest.py
  • comet/api/endpoints/stream.py
  • comet/core/models.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • comet/core/models.py

Comment on lines +466 to +473
    resolved_id, resolved_meta = await resolve_custom_prefix_to_imdb(
        media_type, media_id, custom_catalogs
    )
    if resolved_id:
        logger.log(
            "SCRAPER", f"Custom prefix: resolved {media_id} → {resolved_id}")
        media_id = resolved_id
    elif resolved_meta and resolved_meta.get("name"):

⚠️ Potential issue | 🟠 Major

Preserve series episode suffix when replacing custom IDs with IMDb IDs.

For inputs like prefix123:1:2, replacing media_id with bare tt... before parse_media_id() can lose season/episode and skew search/results.

🎯 Suggested fix
-    if custom_catalogs:
+    original_episode_suffix = ""
+    if media_type == "series" and ":" in media_id:
+        original_episode_suffix = media_id[media_id.find(":"):]
+
+    if custom_catalogs:
         # Check if media_id starts with any user-configured custom prefix
         for _entry in custom_catalogs:
             _prefix = (_entry.get("prefix") or "").strip()
             if _prefix and media_id.startswith(_prefix):
                 resolved_id, resolved_meta = await resolve_custom_prefix_to_imdb(
                     media_type, media_id, custom_catalogs
                 )
                 if resolved_id:
+                    if media_type == "series" and original_episode_suffix and ":" not in resolved_id:
+                        resolved_id = f"{resolved_id}{original_episode_suffix}"
                     logger.log(
                         "SCRAPER", f"Custom prefix: resolved {media_id} → {resolved_id}")
                     media_id = resolved_id

Also applies to: 492-493

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/stream.py` around lines 466 - 473, When
resolve_custom_prefix_to_imdb returns a resolved_id, the code currently replaces
media_id with the bare IMDb id and drops any season/episode suffix (e.g.,
":1:2"), which later skews parse_media_id() and searches; fix by extracting the
original suffix from the incoming media_id (the part after the custom-prefix
mapping, e.g., everything after the first colon(s) like ":1:2") and append that
suffix to resolved_id before assigning media_id. Update the replacement logic in
the block that calls resolve_custom_prefix_to_imdb (and the similar spot around
the second call) so that when resolved_id is truthy you set media_id =
f"{resolved_id}{suffix}" (preserving resolved_meta behavior unchanged).
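The suffix handling described in this comment can be isolated into a small pure function, which makes it easy to test separately from the stream handler. This is a sketch mirroring the suggested fix; with_episode_suffix is a hypothetical name.

```python
def with_episode_suffix(original_id: str, resolved_id: str, media_type: str) -> str:
    """Re-attach the :season:episode suffix from original_id to resolved_id."""
    if media_type != "series" or ":" not in original_id:
        return resolved_id  # movies and bare series IDs carry no suffix
    if ":" in resolved_id:
        return resolved_id  # resolver already returned a suffixed ID
    suffix = original_id[original_id.find(":"):]  # e.g. ":1:2"
    return f"{resolved_id}{suffix}"
```

For an input such as csfd123:1:2 resolved to tt0903747, the result is tt0903747:1:2, so parse_media_id() still sees the season and episode.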

… Stremio catalogs, ID resolution logic, and documentation, while preserving series suffixes during ID resolution.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
comet/api/endpoints/custom_catalog.py (2)

49-83: ⚠️ Potential issue | 🔴 Critical

SSRF validation still allows mixed-DNS bypasses by checking only one resolved address.

At Line [73], only the first getaddrinfo result is inspected. A hostname with both public and private answers can still slip through this guard path. Validate scheme and inspect all resolved addresses before allowing outbound requests.

🔒 Suggested hardening
 def _is_safe_url(url: str) -> bool:
@@
-        hostname = parsed.hostname
+        if parsed.scheme not in {"http", "https"}:
+            return False
+        hostname = parsed.hostname
         if not hostname:
             return False
+        port = parsed.port or (443 if parsed.scheme == "https" else 80)
@@
-        # Resolve hostname to IP and check
-        resolved_ip = socket.getaddrinfo(hostname, None)[0][4][0]
-        addr = ipaddress.ip_address(resolved_ip)
-        for net in _PRIVATE_NETWORKS:
-            if addr in net:
-                logger.warning(
-                    f"Custom catalog: SSRF block - {hostname!r} resolved to private range"
-                )
-                return False
-        if addr.is_multicast or addr.is_unspecified or addr.is_reserved:
-            return False
-        return True
+        # Resolve hostname to all IPs and require every resolved target to be public.
+        addrinfo = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
+        for *_, sockaddr in addrinfo:
+            resolved_ip = sockaddr[0]
+            addr = ipaddress.ip_address(resolved_ip)
+            for net in _PRIVATE_NETWORKS:
+                if addr in net:
+                    logger.warning(
+                        f"Custom catalog: SSRF block - {hostname!r} resolved to private range"
+                    )
+                    return False
+            if addr.is_multicast or addr.is_unspecified or addr.is_reserved:
+                return False
+        return True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` around lines 49 - 83, The _is_safe_url
function currently inspects only the first address returned from
socket.getaddrinfo, allowing mixed-DNS hostnames to bypass SSRF checks; update
_is_safe_url to validate the URL scheme first (e.g., only http/https allowed)
and iterate over all addresses returned by socket.getaddrinfo(hostname, None)
(check each entry's [4][0] IP), converting each to ipaddress.ip_address and
rejecting if any address is in _PRIVATE_NETWORKS or has
is_multicast/is_unspecified/is_reserved set; keep the existing raw-IP literal
check path (ip_address(hostname)) but mirror the same reject logic when any
resolved address is unsafe and log a warning mentioning hostname when blocking.
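The "require every resolved address to be public" rule can be expressed as a pure check over a list of IP strings, for example the addresses collected from socket.getaddrinfo. Note the swap: this sketch relies on the ipaddress module's built-in classification properties instead of the PR's _PRIVATE_NETWORKS list, and all_addresses_public is a hypothetical name.

```python
import ipaddress
from typing import Iterable


def all_addresses_public(addresses: Iterable[str]) -> bool:
    """True only if there is at least one address and none is internal."""
    seen_any = False
    for raw in addresses:
        seen_any = True
        try:
            addr = ipaddress.ip_address(raw)
        except ValueError:
            return False  # unparseable address: fail closed
        if (
            addr.is_private
            or addr.is_loopback
            or addr.is_link_local
            or addr.is_multicast
            or addr.is_unspecified
            or addr.is_reserved
        ):
            return False  # a single unsafe answer rejects the hostname
    return seen_any  # an empty resolution result is also rejected
```

Because one unsafe answer rejects the whole hostname, a DNS name that returns both a public and a private record no longer passes the check.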

200-209: ⚠️ Potential issue | 🟠 Major

Guard nested payload fields before chained .get() access.

Lines [205]-[208] assume externalIds / filmOverviewOut are dicts. Upstream malformed JSON can raise AttributeError here and fail the request path unexpectedly.

🛡️ Defensive parsing fix
-    # Try common locations for IMDB ID - adjust to actual API response structure
-    for candidate in [
-        meta.get("imdbId"),
-        meta.get("imdb"),
-        meta.get("tt"),
-        (meta.get("externalIds") or {}).get("imdb"),
-        (meta.get("externalIds") or {}).get("imdbId"),
-        (meta.get("filmOverviewOut") or {}).get("imdbId"),
-        ((meta.get("filmOverviewOut") or {}).get("externalIds") or {}).get("imdb"),
-    ]:
+    external_ids = meta.get("externalIds")
+    if not isinstance(external_ids, dict):
+        external_ids = {}
+
+    film_overview = meta.get("filmOverviewOut")
+    if not isinstance(film_overview, dict):
+        film_overview = {}
+
+    film_external_ids = film_overview.get("externalIds")
+    if not isinstance(film_external_ids, dict):
+        film_external_ids = {}
+
+    # Try common locations for IMDB ID - adjust to actual API response structure
+    for candidate in [
+        meta.get("imdbId"),
+        meta.get("imdb"),
+        meta.get("tt"),
+        external_ids.get("imdb"),
+        external_ids.get("imdbId"),
+        film_overview.get("imdbId"),
+        film_external_ids.get("imdb"),
+    ]:
         if candidate and str(candidate).startswith("tt"):
             return str(candidate), meta
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` around lines 200 - 209, The code
assumes `meta` and nested fields are dicts and can raise AttributeError when
calling chained .get(); update the candidate extraction around `meta` to
defensively parse: ensure `meta` is a dict (e.g., `if not isinstance(meta,
dict): meta = {}`), and replace chained lookups with safe-access calls or a
small helper like `safe_get(d, *keys)` that returns None if any step is missing
or not a dict; then build the candidates list using `meta.get("imdbId")`,
`meta.get("imdb")`, `meta.get("tt")`, `safe_get(meta, "externalIds", "imdb")`,
`safe_get(meta, "externalIds", "imdbId")`, `safe_get(meta, "filmOverviewOut",
"imdbId")`, `safe_get(meta, "filmOverviewOut", "externalIds", "imdb")` to avoid
AttributeError in the loop that iterates over `candidate`.
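
One possible shape for the `safe_get` helper mentioned above, as a sketch rather than the PR's code: `extract_imdb_id` is a hypothetical wrapper name, and the candidate keys are copied from the diff in this comment.

```python
from typing import Any, Optional


def safe_get(d: Any, *keys: str) -> Any:
    """Walk nested dicts; return None if any step is missing or not a dict."""
    current = d
    for key in keys:
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


def extract_imdb_id(meta: Any) -> Optional[str]:
    """Return the first candidate that looks like an IMDb ID, else None."""
    candidates = [
        safe_get(meta, "imdbId"),
        safe_get(meta, "imdb"),
        safe_get(meta, "tt"),
        safe_get(meta, "externalIds", "imdb"),
        safe_get(meta, "externalIds", "imdbId"),
        safe_get(meta, "filmOverviewOut", "imdbId"),
        safe_get(meta, "filmOverviewOut", "externalIds", "imdb"),
    ]
    for candidate in candidates:
        if candidate and str(candidate).startswith("tt"):
            return str(candidate)
    return None
```

Because `safe_get` also tolerates a non-dict `meta`, the separate `isinstance(meta, dict)` guard becomes optional with this approach.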
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comet/api/endpoints/custom_catalog.py`:
- Around line 258-311: Several early-return branches in the custom catalog
handler return JSONResponse({"metas": []}) without the CORS header, causing
inconsistent responses; update every early JSONResponse return (the ones after
_parse_catalog_id failing, after config_check failing, after index out-of-range,
after missing base_url/entry_prefix, and after prefix mismatch) to include
headers={"Access-Control-Allow-Origin":"*"} so they match the success/fallback
responses; search for the JSONResponse({"metas": []}) occurrences in this
function (around the parsed check, config = config_check(...), the idx bounds
check, the base_url/entry_prefix check, and the entry_prefix != prefix check)
and add the same headers argument to each.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9c238b6 and e1f4003.

📒 Files selected for processing (2)
  • comet/api/endpoints/custom_catalog.py
  • comet/api/endpoints/stream.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • comet/api/endpoints/stream.py

…dpoints, UI, and enhanced URL safety validation.
@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (3)
comet/api/endpoints/custom_catalog.py (3)

168-173: Type hint inconsistency: parameter can be None but typed as list.

Line 183 handles the None case with custom_catalogs or [], indicating the parameter can be None. The type hint should reflect this for accurate static analysis.

✨ Suggested fix
 async def resolve_custom_prefix_to_imdb(
     media_type: str,
     media_id: str,
-    custom_catalogs: list,
+    custom_catalogs: Optional[list],
     timeout: int = 15,
 ) -> tuple[Optional[str], Optional[dict]]:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` around lines 168 - 173, The parameter
custom_catalogs of resolve_custom_prefix_to_imdb can be None (the function uses
custom_catalogs or []), so update the type hint to reflect that (e.g.,
custom_catalogs: Optional[list] or custom_catalogs: list | None) and ensure any
required typing imports (Optional) are present; adjust any callers/tests if they
relied on the stricter list-only annotation.

259-266: Misleading comment: prefix CAN contain underscores.

The comment states "prefix itself may not contain underscores" but the code uses rfind("_") which specifically allows the prefix to contain underscores while keeping type as the rightmost segment.

📝 Suggested comment fix
         # remainder is "{prefix}_{type}" - split at the *last* underscore
-        # because prefix itself may not contain underscores and type is
-        # always the rightmost segment.
+        # because prefix MAY contain underscores, while type is
+        # always a simple identifier (movie/series) as the rightmost segment.
         last_underscore = remainder.rfind("_")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` around lines 259 - 266, Update the
misleading comment above the split logic to correctly state that the code splits
at the last underscore so the rightmost segment is the type while the prefix may
contain underscores; specifically adjust the comment near variables remainder,
last_underscore, prefix, and cat_type to remove "prefix itself may not contain
underscores" and clarify that rfind("_") is used so the prefix can include
underscores and only the final segment is treated as the type.
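
To illustrate the split-at-last-underscore behavior, here is a hedged sketch of a parser for IDs shaped like `cstm{idx}_{prefix}_{type}`. The function name `parse_catalog_id` and its validation rules are assumptions for illustration; the PR's `_parse_catalog_id` may differ.

```python
from typing import Optional, Tuple


def parse_catalog_id(catalog_id: str) -> Optional[Tuple[int, str, str]]:
    """Parse 'cstm{idx}_{prefix}_{type}' into (idx, prefix, type)."""
    marker = "cstm"
    if not catalog_id.startswith(marker):
        return None
    # The index ends at the first underscore after the marker.
    idx_part, sep, remainder = catalog_id[len(marker):].partition("_")
    if not sep or not (idx_part.isascii() and idx_part.isdigit()):
        return None
    # remainder is "{prefix}_{type}": split at the *last* underscore so the
    # prefix MAY contain underscores while type stays the rightmost segment.
    last_underscore = remainder.rfind("_")
    if last_underscore <= 0:
        return None
    prefix = remainder[:last_underscore]
    cat_type = remainder[last_underscore + 1:]
    if not cat_type:
        return None
    return int(idx_part), prefix, cat_type
```

With this shape, `cstm2_my_addon_series` yields the prefix `my_addon`, which is exactly the case the original comment claimed could not happen.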

78-89: Make _is_safe_url() async to avoid blocking the event loop during DNS resolution.

socket.getaddrinfo() is synchronous and blocks the event loop when called from the async _fetch_json() function. This stalls all concurrent requests during DNS resolution.

The function should be converted to async and use the event loop's getaddrinfo(), which by default runs the lookup in the loop's executor instead of blocking the loop:

♻️ Convert to async DNS resolution
-def _is_safe_url(url: str) -> bool:
+async def _is_safe_url(url: str) -> bool:
     """
     Return True if the URL resolves only to public, non-private addresses.
     ...
     """
     ...
-        all_addrs = socket.getaddrinfo(hostname, None)
+        loop = asyncio.get_running_loop()
+        all_addrs = await loop.getaddrinfo(hostname, None)

And update the caller:

-    if not _is_safe_url(url):
+    if not await _is_safe_url(url):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` around lines 78 - 89, Convert
_is_safe_url to an async function and replace the blocking socket.getaddrinfo
call with the event loop's async resolver (await
asyncio.get_running_loop().getaddrinfo(hostname, None)) to avoid blocking the
event loop; update any callers (notably _fetch_json) to await _is_safe_url and
propagate async changes, and add the necessary asyncio import. Ensure the same
logic for iterating all_addrs and calling _addr_is_blocked remains intact after
the change.
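
A small self-contained sketch of non-blocking resolution with `loop.getaddrinfo`. The names `resolve_all` and `host_is_public` are hypothetical, and `ipaddress`'s `is_global` is used here only as a stand-in for the PR's `_addr_is_blocked` check, which may apply different rules.

```python
import asyncio
import ipaddress


async def resolve_all(hostname: str) -> set[str]:
    """Resolve a hostname without blocking the event loop."""
    loop = asyncio.get_running_loop()
    # loop.getaddrinfo is awaitable; by default the lookup runs in the
    # loop's executor, so other coroutines keep running meanwhile.
    infos = await loop.getaddrinfo(hostname, None)
    return {info[4][0] for info in infos}


async def host_is_public(hostname: str) -> bool:
    """True only if every resolved address is globally routable."""
    try:
        addrs = await resolve_all(hostname)
    except OSError:
        return False  # resolution failed: treat as unsafe
    return bool(addrs) and all(
        # Drop an IPv6 scope id such as "%eth0" before parsing.
        ipaddress.ip_address(addr.split("%", 1)[0]).is_global
        for addr in addrs
    )
```

Callers then use `await host_is_public(hostname)`, mirroring the `await _is_safe_url(url)` change suggested above.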
ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between e1f4003 and 03e3088.

📒 Files selected for processing (1)
  • comet/api/endpoints/custom_catalog.py

…atalog integration and pull request documentation.
@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
comet/api/endpoints/custom_catalog.py (2)

184-189: Prefix matching uses first-match semantics; consider longest-match.

The current loop breaks on the first matching prefix via startswith(). If a user configures overlapping prefixes (e.g., csfd and csfdb), a media ID like csfdb12345 would match whichever appears first in the list, potentially routing to the wrong addon.

Consider sorting by prefix length descending or using longest-match semantics for more predictable behavior.

♻️ Optional fix for longest-match semantics
     matched_url: Optional[str] = None
+    matched_prefix_len = 0
     for entry in custom_catalogs or []:
         prefix = (entry.get("prefix") or "").strip()
         url = (entry.get("url") or "").strip().rstrip("/")
-        if prefix and url and media_id.startswith(prefix):
+        if prefix and url and media_id.startswith(prefix) and len(prefix) > matched_prefix_len:
             matched_url = url
-            break
+            matched_prefix_len = len(prefix)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` around lines 184 - 189, The loop over
custom_catalogs uses first-match semantics causing incorrect routing for
overlapping prefixes; update the logic that finds matched_url to use
longest-match semantics by selecting the entry with the longest non-empty prefix
that satisfies media_id.startswith(prefix) (e.g., inspect custom_catalogs and
pick the match with the max len(prefix) rather than breaking on the first
match). Ensure you still strip/normalize prefix and url (the code around prefix
= (entry.get("prefix") or "").strip() and url = (entry.get("url") or
"").strip().rstrip("/")) and set matched_url to the longest matching entry’s
url; keep behavior when custom_catalogs is empty unchanged.
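
The longest-match selection can be sketched as a standalone function. `match_catalog_url` is a hypothetical name; the normalization of `prefix` and `url` follows the diff above, and the `isinstance` guard is an added assumption about possibly malformed entries.

```python
from typing import Optional


def match_catalog_url(media_id: str, custom_catalogs: Optional[list]) -> Optional[str]:
    """Return the URL of the entry with the longest prefix matching media_id."""
    best_url: Optional[str] = None
    best_len = 0
    for entry in custom_catalogs or []:
        if not isinstance(entry, dict):
            continue  # skip malformed entries instead of raising
        prefix = (entry.get("prefix") or "").strip()
        url = (entry.get("url") or "").strip().rstrip("/")
        # Keep scanning: a later, longer prefix overrides an earlier match.
        if prefix and url and media_id.startswith(prefix) and len(prefix) > best_len:
            best_url, best_len = url, len(prefix)
    return best_url
```

With entries for both `csfd` and `csfdb`, an ID such as `csfdb12345` routes to the `csfdb` addon regardless of list order; on equal-length prefixes the first entry still wins.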

284-284: Unused variable _declared_type extracted from catalog ID.

The _declared_type is parsed from the catalog ID but never used. The code uses catalog_type from the URL path instead, which is likely intentional for security (trusting the path over embedded data).

If intentional, the underscore prefix convention is correct. If not, consider adding a consistency check:

♻️ Optional: Add consistency check between path and catalog ID type
     idx, prefix, _declared_type = parsed
+    
+    # Optionally verify catalog_type matches what's declared in the ID
+    if _declared_type != catalog_type:
+        logger.warning(
+            f"Custom catalog: type mismatch: path has {catalog_type!r}, "
+            f"catalog_id declares {_declared_type!r}"
+        )
+        return JSONResponse({"metas": []}, headers={"Access-Control-Allow-Origin": "*"})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comet/api/endpoints/custom_catalog.py` at line 284, The parsed tuple unpacks
to idx, prefix, _declared_type but _declared_type is unused; if the intention is
to ignore the ID-embedded type then keep the underscore name but add a brief
comment clarifying that the URL path's catalog_type is authoritative, otherwise
add a consistency check that compares _declared_type to catalog_type (from the
request path) and raise or log an error when they differ to prevent mismatched
types; locate the unpacking (parsed -> idx, prefix, _declared_type) and
implement the chosen approach (comment-or-compare) in the same function handling
catalog parsing.
ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 03e3088 and 0e65681.

📒 Files selected for processing (1)
  • comet/api/endpoints/custom_catalog.py
