docs(roadmap): video content vectorization brainstorm and roadmap by Kneesal · Pull Request #638 · JesusFilm/forge

Kneesal · 2026-04-02T02:16:50Z

Summary

Adds the complete planning artifacts for scene-level video content vectorization — enabling cross-film recommendations based on what's shown, not just what's said or tagged.

- Requirements doc with phased rollout (English-first prototype at ~$100-300, full catalog as a Phase 2 funding decision), pgvector storage schema, cost model, and technology research validating our approach against Spotify's RecSys 2025 published pattern
- 10 roadmap tickets (feat-037 through feat-046) breaking the work into: data audit → scene boundaries → multimodal descriptions → embeddings table → backfill worker → visual fusion (P2) → recommendation API → pipeline integration → demo experience frontend
- feat-009 updated to reflect feat-037 as a downstream dependency

Architecture: Gemini 2.5 Flash describes each scene from Mux thumbnail frames plus the transcript. The descriptions are then embedded via the existing text-embedding-3-small pipeline and stored in a separate scene_embeddings pgvector table with an HNSW index.
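To make the storage and lookup side of this architecture concrete, here is a minimal sketch, not the schema or query from the requirements doc. The column names, the `Scene` class, and the `recommend` helper are hypothetical. The only details taken as given are the `scene_embeddings` table name, the HNSW index, and the 1536-dimensional default output of text-embedding-3-small. The in-memory ranking is an exact brute-force stand-in for what pgvector would do approximately with `ORDER BY embedding <=> $1`.

```python
import math
from dataclasses import dataclass

# Hypothetical DDL: column names are illustrative, not taken from the PR.
# 1536 is the default output dimension of text-embedding-3-small.
SCENE_EMBEDDINGS_DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE scene_embeddings (
    id          bigserial PRIMARY KEY,
    video_id    text NOT NULL,
    start_sec   real NOT NULL,
    end_sec     real NOT NULL,
    description text NOT NULL,
    embedding   vector(1536) NOT NULL
);
CREATE INDEX ON scene_embeddings USING hnsw (embedding vector_cosine_ops);
"""


@dataclass
class Scene:
    video_id: str
    description: str
    embedding: list[float]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(query: Scene, catalog: list[Scene], k: int = 3) -> list[Scene]:
    """Rank scenes from *other* films by similarity to the query scene."""
    candidates = [s for s in catalog if s.video_id != query.video_id]
    candidates.sort(
        key=lambda s: cosine_similarity(query.embedding, s.embedding),
        reverse=True,
    )
    return candidates[:k]


# Toy 3-dimensional vectors stand in for real 1536-dimensional embeddings.
query = Scene("film-a", "storm on a lake", [1.0, 0.0, 0.0])
catalog = [
    query,
    Scene("film-a", "boat at night", [0.9, 0.1, 0.0]),
    Scene("film-b", "waves hit a ship", [0.8, 0.2, 0.0]),
    Scene("film-c", "market crowd", [0.0, 1.0, 0.0]),
]
top = recommend(query, catalog, k=2)
print([s.video_id for s in top])
```

Filtering out the query's own `video_id` is one way to express the "cross-film" requirement. Whether the real API filters this way, or at what granularity, is an assumption here.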

Roadmap Tickets

ID	Feature	Days	Start
feat-037	Parent: Video Content Vectorization	42	Apr 21
feat-038	Data Audit	3	Apr 21
feat-039	Chapter-Based Scene Boundaries	7	Apr 24
feat-040	Multimodal Scene Descriptions	10	May 1
feat-041	Scene Embeddings Table + Indexing	7	May 11
feat-042	English Backfill Worker	10	May 18
feat-043	Visual Shot Detection Fusion (P2)	10	May 28
feat-044	Recommendation Query API	7	May 28
feat-045	Pipeline Integration	7	Jun 4
feat-046	Recommendations Demo Experience	7	Jun 4
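As a quick sanity check on the schedule, the sketch below derives each sub-ticket's end date from its start date and duration. It assumes that "Days" means calendar days and that all dates fall in 2026 (the year of this PR). Neither is stated in the table.

```python
from datetime import date, timedelta

# (ticket, duration in days, start date) copied from the Roadmap Tickets table.
# Assumption: durations are calendar days and every date is in 2026.
SUB_TICKETS = [
    ("feat-038", 3, date(2026, 4, 21)),
    ("feat-039", 7, date(2026, 4, 24)),
    ("feat-040", 10, date(2026, 5, 1)),
    ("feat-041", 7, date(2026, 5, 11)),
    ("feat-042", 10, date(2026, 5, 18)),
    ("feat-043", 10, date(2026, 5, 28)),
    ("feat-044", 7, date(2026, 5, 28)),
    ("feat-045", 7, date(2026, 6, 4)),
    ("feat-046", 7, date(2026, 6, 4)),
]


def end_dates(tickets):
    """Map each ticket ID to start + duration (the first day after the work ends)."""
    return {tid: start + timedelta(days=days) for tid, days, start in tickets}


ends = end_dates(SUB_TICKETS)
first_start = min(start for _, _, start in SUB_TICKETS)
last_end = max(ends.values())
span_days = (last_end - first_start).days

for tid, days, start in SUB_TICKETS:
    print(f"{tid}: {start:%b %d} + {days}d -> {ends[tid]:%b %d}")
print(f"overall: {first_start:%b %d} -> {last_end:%b %d} ({span_days} days)")
```

Under these assumptions, each ticket on the main chain ends on the day the next one starts, and the sub-tickets run from Apr 21 to Jun 11, a span of 51 calendar days. That is longer than the 42 days listed for the parent feat-037. The table does not say how the parent figure is counted.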

🤖 Generated with Claude Code

… tickets

Scene-level video embeddings for cross-film recommendations, starting with English-only prototype. Uses Gemini 2.5 Flash to describe scenes from extracted frames + transcript, then embeds descriptions via existing text-embedding-3-small pipeline into a separate pgvector scene_embeddings table.

Adds:
- Requirements doc with phased rollout, storage schema, cost model, and technology research (Spotify RecSys 2025 validates this approach)
- Parent feature feat-037 plus 9 sub-tickets (feat-038 through feat-046) covering data audit, scene boundaries, descriptions, embeddings table, backfill worker, visual fusion, recommendation API, pipeline integration, and demo experience frontend
- Updates feat-009 blocks to include feat-037 dependency

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

railway-app · 2026-04-02T02:16:58Z

🚅 Deployed to the forge-pr-638 environment in forge

4 services not affected by this PR

@forge/web
@forge/cms/db
@forge/cms
@forge/manager

Kneesal merged commit 5b36e43 into main Apr 2, 2026
13 checks passed

Kneesal deleted the docs/video-content-vectorization-roadmap branch April 2, 2026 02:20

Kneesal merged 1 commit into main from docs/video-content-vectorization-roadmap
