Features: #1
base: main
Conversation
4 issues found across 16 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="app/audio.py">
<violation number="1" location="app/audio.py:17">
P1: If model loading fails, `WHISPER` and `EMBEDDER` will be undefined, causing `NameError` when `transcribe_chunk()` is called. Consider either re-raising the exception to fail fast, or setting the variables to `None` and checking before use.</violation>
</file>
<file name="app/server.py">
<violation number="1" location="app/server.py:53">
P1: Missing `qdrant = None` in exception handler. If Qdrant connection fails, `qdrant` will be undefined, causing a `NameError` when endpoints try to use it. This differs from the Kafka producer pattern above which correctly sets `producer = None` on failure.</violation>
</file>
<file name="Dockerfile">
<violation number="1" location="Dockerfile:13">
P2: Pin the `uv` image to a specific version instead of using `:latest` for reproducible and secure builds.</violation>
<violation number="2" location="Dockerfile:23">
P1: Builds are not reproducible without a lock file. Consider keeping `uv.lock` committed and using `uv sync --frozen` to ensure consistent dependency versions across builds.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
# Use 'tiny' or 'base' for speed during debugging
WHISPER = WhisperModel("base", device="cpu", compute_type="int8")
EMBEDDER = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
except Exception as e:
P1: If model loading fails, WHISPER and EMBEDDER will be undefined, causing NameError when transcribe_chunk() is called. Consider either re-raising the exception to fail fast, or setting the variables to None and checking before use.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/audio.py, line 17:
<comment>If model loading fails, `WHISPER` and `EMBEDDER` will be undefined, causing `NameError` when `transcribe_chunk()` is called. Consider either re-raising the exception to fail fast, or setting the variables to `None` and checking before use.</comment>
<file context>
@@ -0,0 +1,80 @@
+ # Use 'tiny' or 'base' for speed during debugging
+ WHISPER = WhisperModel("base", device="cpu", compute_type="int8")
+ EMBEDDER = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
+except Exception as e:
+ logger.critical(f"Failed to load Audio models: {e}")
+
</file context>
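A minimal sketch of the fallback-to-None option described above (not the repository's actual module; the `transcribe_chunk` body is a hypothetical illustration of the guard):

```python
import logging

from faster_whisper import WhisperModel
from fastembed import TextEmbedding

logger = logging.getLogger(__name__)

try:
    # Use 'tiny' or 'base' for speed during debugging
    WHISPER = WhisperModel("base", device="cpu", compute_type="int8")
    EMBEDDER = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
except Exception as e:
    logger.critical(f"Failed to load Audio models: {e}")
    WHISPER = None   # explicit fallback so callers can detect the failure
    EMBEDDER = None


def transcribe_chunk(path: str) -> str:
    # Guard against the failed-load case instead of hitting a NameError.
    if WHISPER is None or EMBEDDER is None:
        raise RuntimeError("Audio models are not loaded; check startup logs.")
    segments, _info = WHISPER.transcribe(path)
    return " ".join(segment.text for segment in segments)
```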
qdrant = QdrantClient(host=QDRANT_HOST, port=6333)
logger.info(f"Qdrant connected at {QDRANT_HOST}")
except Exception as e:
    logger.critical(f"Qdrant Failed: {e}")
P1: Missing qdrant = None in exception handler. If Qdrant connection fails, qdrant will be undefined, causing a NameError when endpoints try to use it. This differs from the Kafka producer pattern above which correctly sets producer = None on failure.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/server.py, line 53:
<comment>Missing `qdrant = None` in exception handler. If Qdrant connection fails, `qdrant` will be undefined, causing a `NameError` when endpoints try to use it. This differs from the Kafka producer pattern above which correctly sets `producer = None` on failure.</comment>
<file context>
@@ -2,307 +2,184 @@
+ qdrant = QdrantClient(host=QDRANT_HOST, port=6333)
+ logger.info(f"Qdrant connected at {QDRANT_HOST}")
+except Exception as e:
+ logger.critical(f"Qdrant Failed: {e}")
+
+redis_client = redis.Redis(host=REDIS_HOST, port=6379, decode_responses=True)
</file context>
Suggested change:
  logger.critical(f"Qdrant Failed: {e}")
+ qdrant = None
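The other half of that fix is guarding the use sites; a hedged FastAPI-style sketch, with a hypothetical health endpoint standing in for the real handlers:

```python
from fastapi import FastAPI, HTTPException
from qdrant_client import QdrantClient

app = FastAPI()

try:
    qdrant = QdrantClient(host="qdrant", port=6333)  # host name is a placeholder
except Exception:
    qdrant = None  # connection failed; endpoints must check before use


@app.get("/health/qdrant")
async def qdrant_health():
    # Return a clear 503 instead of a NameError when the client never came up.
    if qdrant is None:
        raise HTTPException(status_code=503, detail="Vector store unavailable")
    return {"collections": [c.name for c in qdrant.get_collections().collections]}
```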
- # Copy requirements first to leverage Docker caching
- COPY requirements.txt .
+ # 2. Install uv
+ COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
P2: Pin the uv image to a specific version instead of using :latest for reproducible and secure builds.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At Dockerfile, line 13:
<comment>Pin the `uv` image to a specific version instead of using `:latest` for reproducible and secure builds.</comment>
<file context>
@@ -1,25 +1,32 @@
-# Copy requirements first to leverage Docker caching
-COPY requirements.txt .
+# 2. Install uv
+COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
+
+# 3. Set up the application
</file context>
Suggested change:
- COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
+ COPY --from=ghcr.io/astral-sh/uv:0.5.11 /uv /bin/uv
COPY pyproject.toml ./

# Run sync WITHOUT --frozen (this resolves dependencies fresh)
RUN uv sync
P1: Builds are not reproducible without a lock file. Consider keeping uv.lock committed and using uv sync --frozen to ensure consistent dependency versions across builds.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At Dockerfile, line 23:
<comment>Builds are not reproducible without a lock file. Consider keeping `uv.lock` committed and using `uv sync --frozen` to ensure consistent dependency versions across builds.</comment>
<file context>
@@ -1,25 +1,32 @@
+COPY pyproject.toml ./
+
+# Run sync WITHOUT --frozen (this resolves dependencies fresh)
+RUN uv sync
-# Install Python dependencies
</file context>
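A short sketch of the locked variant the comment describes, assuming `uv.lock` is committed alongside `pyproject.toml` (surrounding Dockerfile stages omitted):

```dockerfile
# Copy the manifest plus the committed lock file so every build sees the same pins
COPY pyproject.toml uv.lock ./

# --frozen refuses to re-resolve, so the build installs exactly the locked versions
RUN uv sync --frozen
```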
4 issues found across 12 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="app/manager.py">
<violation number="1" location="app/manager.py:94">
P0: The Redis key `job:{job_id}:pending for the user {user_id}` contains spaces and appears to be a typo where descriptive text was accidentally included. This key format doesn't match the key used in `minion.py` (`job:{job_id}:pending`), which will cause job completion tracking to break entirely - chunks will never be marked as processed.</violation>
</file>
<file name="app/minion.py">
<violation number="1" location="app/minion.py:58">
P2: Exception is silently swallowed without logging. This will make debugging embedding API failures very difficult in production. Consider logging the error before returning None.</violation>
<violation number="2" location="app/minion.py:58">
P1: Silently ignoring Redis exceptions with `pass` hides critical job completion tracking failures. If Redis is down or the key doesn't exist, you'll have no visibility into why jobs appear incomplete.</violation>
</file>
<file name="app/database.py">
<violation number="1" location="app/database.py:66">
P2: Exception is caught and printed but not re-raised. This silently swallows database errors, preventing callers from knowing if the operation succeeded. Consider either re-raising the exception or using proper logging with error propagation.</violation>
</file>
Reply to cubic to teach it or ask questions. Tag @cubic-dev-ai to re-run a review.
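For the manager.py key mismatch flagged above, one way to keep both sides in agreement is a single shared helper for the counter key. A hedged sketch; the module and function names are hypothetical, and only the `job:{job_id}:pending` format comes from the review:

```python
# app/keys.py (hypothetical shared module)

def pending_key(job_id: str) -> str:
    """Single source of truth for the per-job pending-chunk counter key."""
    return f"job:{job_id}:pending"


# Manager side: initialise the counter when fanning chunks out.
#   redis_client.set(pending_key(job_id), num_chunks)
#
# Minion side: decrement after each processed chunk.
#   remaining = redis_client.decr(pending_key(job_id))
#   if remaining == 0:
#       mark_job_complete(job_id)  # hypothetical completion hook
```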
except Exception:
    return None
P2: Exception is silently swallowed without logging. This will make debugging embedding API failures very difficult in production. Consider logging the error before returning None.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/minion.py, line 58:
<comment>Exception is silently swallowed without logging. This will make debugging embedding API failures very difficult in production. Consider logging the error before returning None.</comment>
<file context>
@@ -48,78 +44,65 @@
- except Exception as e:
- logger.error(f"Embedding API Error: {e}")
+ return res.json()["vector"] if res.status_code == 200 else None
+ except Exception:
return None
</file context>
Suggested change:
- except Exception:
-     return None
+ except Exception as e:
+     logger.error(f"Embedding API Error: {e}")
+     return None
video.status = status
db.commit()
except Exception as e:
    print(f"DB Error: {e}")
P2: Exception is caught and printed but not re-raised. This silently swallows database errors, preventing callers from knowing if the operation succeeded. Consider either re-raising the exception or using proper logging with error propagation.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/database.py, line 66:
<comment>Exception is caught and printed but not re-raised. This silently swallows database errors, preventing callers from knowing if the operation succeeded. Consider either re-raising the exception or using proper logging with error propagation.</comment>
<file context>
@@ -1,60 +1,108 @@
+ video.status = status
+ db.commit()
+ except Exception as e:
+ print(f"DB Error: {e}")
+ finally:
+ db.close()
</file context>
| print(f"DB Error: {e}") | |
| print(f"DB Error: {e}") | |
| raise |
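If `print` is swapped for real logging, the propagating version could look roughly like this (`SessionLocal`, `Video`, and the function name are assumptions based on the excerpt, not the project's actual code):

```python
import logging

from app.database import SessionLocal, Video  # assumed names from the excerpt

logger = logging.getLogger(__name__)


def update_video_status(video_id: int, status: str) -> None:
    db = SessionLocal()
    try:
        video = db.get(Video, video_id)
        video.status = status
        db.commit()
    except Exception:
        db.rollback()
        logger.exception("DB Error while updating video status")
        raise  # propagate so callers know the update failed
    finally:
        db.close()
```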
3 issues found across 14 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="app/utils/scene_detect.py">
<violation number="1" location="app/utils/scene_detect.py:2">
P2: Typo in deprecation notice: 'StructualSceneDetector' should be 'StructuralSceneDetector' (missing 'r' in 'Structural').</violation>
</file>
<file name="app/worker.py">
<violation number="1" location="app/worker.py:131">
P1: Job will never complete if any chunk fails. When `process_chunk` raises an exception, the reduce step that decrements `job:{job_id}:pending` is skipped. The error handler in `start()` commits the message but doesn't decrement the counter, so jobs with failed chunks hang indefinitely. Consider moving the reduce step to a `finally` block, or decrementing the counter in the error handling path.</violation>
</file>
<file name="app/server.py">
<violation number="1" location="app/server.py:116">
P1: Exposing raw exception details in HTTP responses is a security risk. The full exception message could leak sensitive database information (table/column names, SQL errors, connection strings). The error is already logged for debugging - return a generic message to clients instead.</violation>
</file>
Reply to cubic to teach it or ask questions. Tag @cubic-dev-ai to re-run a review.
@@ -1,3 +1,7 @@
"""
Deprecated: Use StructualSceneDetector instead.
P2: Typo in deprecation notice: 'StructualSceneDetector' should be 'StructuralSceneDetector' (missing 'r' in 'Structural').
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/utils/scene_detect.py, line 2:
<comment>Typo in deprecation notice: 'StructualSceneDetector' should be 'StructuralSceneDetector' (missing 'r' in 'Structural').</comment>
<file context>
@@ -1,3 +1,7 @@
+"""
+Deprecated: Use StructualSceneDetector instead.
+"""
+
</file context>
Suggested change:
- Deprecated: Use StructualSceneDetector instead.
+ Deprecated: Use StructuralSceneDetector instead.
except Exception as e:
    logger.error(f"Visual processing failed: {e}")
    raise e  # Re-raise to signal failure to the thread wrapper
P1: Job will never complete if any chunk fails. When process_chunk raises an exception, the reduce step that decrements job:{job_id}:pending is skipped. The error handler in start() commits the message but doesn't decrement the counter, so jobs with failed chunks hang indefinitely. Consider moving the reduce step to a finally block, or decrementing the counter in the error handling path.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/worker.py, line 131:
<comment>Job will never complete if any chunk fails. When `process_chunk` raises an exception, the reduce step that decrements `job:{job_id}:pending` is skipped. The error handler in `start()` commits the message but doesn't decrement the counter, so jobs with failed chunks hang indefinitely. Consider moving the reduce step to a `finally` block, or decrementing the counter in the error handling path.</comment>
<file context>
@@ -131,8 +128,9 @@ def process_chunk(task):
except Exception as e:
logger.error(f"Visual processing failed: {e}")
+ raise e # Re-raise to signal failure to the thread wrapper
- # 3. AUDIO PROCESSING (Standard)
</file context>
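A minimal sketch of the finally-based option (`process_chunk`, the task shape, and `finalize_job` are assumptions; only the `job:{job_id}:pending` counter comes from the review):

```python
import logging

logger = logging.getLogger(__name__)


def handle_message(msg, redis_client):
    task = msg.value            # assumed Kafka message payload shape
    job_id = task["job_id"]
    try:
        process_chunk(task)     # existing per-chunk work from app/worker.py
    except Exception:
        logger.exception("Chunk processing failed")
        # Whether to re-queue is a separate decision; the key point is below.
    finally:
        # The reduce step always runs, so a failed chunk can no longer hang the job.
        remaining = redis_client.decr(f"job:{job_id}:pending")
        if remaining == 0:
            finalize_job(job_id)  # hypothetical completion hook
```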
| return {"error": str(e)} | ||
| db.rollback() | ||
| logger.error(f"Database error: {e}") | ||
| raise HTTPException(status_code=500, detail=f"Database error: {e}") |
P1: Exposing raw exception details in HTTP responses is a security risk. The full exception message could leak sensitive database information (table/column names, SQL errors, connection strings). The error is already logged for debugging - return a generic message to clients instead.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/server.py, line 116:
<comment>Exposing raw exception details in HTTP responses is a security risk. The full exception message could leak sensitive database information (table/column names, SQL errors, connection strings). The error is already logged for debugging - return a generic message to clients instead.</comment>
<file context>
@@ -95,8 +98,24 @@ async def queue_video(
+ except Exception as e:
+ db.rollback()
+ logger.error(f"Database error: {e}")
+ raise HTTPException(status_code=500, detail=f"Database error: {e}")
+ finally:
+ db.close()
</file context>
Suggested change:
- raise HTTPException(status_code=500, detail=f"Database error: {e}")
+ raise HTTPException(status_code=500, detail="Failed to process video request")
Summary by cubic
Adds audio transcription and search, and dockerizes the API, orchestrator, and worker with a shared /app/assets volume. Migrates to uv and Python 3.12; adds a Postgres-backed video library with URL-hash dedupe and per-user linking.
New Features
Refactors
Written for commit 5acf1b2. Summary will update automatically on new commits.