Since we cannot use Docker containers, we will build a robust "Native" local development environment. This matches a production architecture logic-wise but uses tools compatible with Windows and local python processes.
| Component | Production (Cloud) | Local Native (No Docker) | Library / Tool |
|---|---|---|---|
| Vector DB | Pincone / Chroma Server | ChromaDB (Persistent) | pip install chromadb |
| Relational DB | PostgreSQL (Supabase) | PostgreSQL (Windows) or SQLite | pip install sqlalchemy |
| Queue/Cache | Redis (Upstash) | Memurai (Redis for Windows) or In-Process | pip install redis celery |
- Vector Database: We will replace
.pklfiles with ChromaDB. In local mode, Chroma saves to a./chroma_dbfolder but behaves exactly like a server. This solves the "scale" and "search" issues. - Relational Database: We will use SQLAlchemy (ORM).
- Recommendation: Start with SQLite (
filmsuma.db) for local dev. It requires zero installation and works perfectly with SQLAlchemy. When you go to production, you just change the connection string to PostgreSQL.
- Recommendation: Start with SQLite (
- Task Queue: Running Redis on Windows natively is tricky.
- Option A (Recommended for simplicity): Stick to
FastAPI.BackgroundTasksfor now. It works well for single-server setups. - Option B (True Production Sim): Install Memurai (Redis for Windows) and set up Celery. This is complex to configure on Windows.
- Decision: We will implement ChromaDB and SQLAlchemy (SQLite) first. We will keep
BackgroundTasksbut refactor the code so it's ready for Celery later.
- Option A (Recommended for simplicity): Stick to
Instead of saving movie_name.pkl, we will upsert vectors into a Chroma collection.
- Action: Install
chromadb. - Action: Rewrite
src/core/embeddings.pyto initialize a Chroma Client. - Action: Create
add_movie_vectorsandquery_similar_dialoguefunctions.
We need to track if a movie is "processed", "pending", or "failed".
- Action: Install
sqlalchemy. - Action: Create
src/db/database.pyfor connection logic. - Action: Create
src/models/sql_models.py(Movie, JobStatus tables). - Action: Update endpoints to check this DB before starting work.
I will start by installing the necessary libraries and refactoring the Embeddings system to use ChromaDB. This yields the biggest immediate benefit (searchability).