Add MongoDB database integration #13

Description

@ericksonlopes

Objective

Integrate MongoDB into the WhatYouSaid project as the primary document database for storing and managing application data with flexible schema design.

Description

Add MongoDB support with a robust connection management system, enabling:

  • Document storage with flexible schema
  • Collection and database management
  • Context manager support for safe connection handling
  • Connection pooling and error handling
  • Support for both standalone and replica set configurations
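The difference between a standalone and a replica set deployment shows up mainly in the connection string. The sketch below shows one possible way to build both forms. The helper `build_mongo_uri`, the host names `mongo1` to `mongo3`, and the replica set name `rs0` are illustrative assumptions and are not part of this issue; only the `mongodb://` URI layout and the `replicaSet` option come from the standard connection string format.

```python
from urllib.parse import quote_plus


def build_mongo_uri(hosts, username=None, password=None, replica_set=None):
    """Build a MongoDB connection string (hypothetical helper for illustration)."""
    credentials = ""
    if username is not None and password is not None:
        # Percent-encode credentials so characters like '@' or ':' do not break the URI.
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    # A replica set URI lists several host:port pairs separated by commas.
    uri = f"mongodb://{credentials}{','.join(hosts)}/"
    if replica_set:
        uri += f"?replicaSet={replica_set}"
    return uri


# Standalone: a single host, matching the local Docker service.
standalone_uri = build_mongo_uri(["localhost:27017"], "root", "example")

# Replica set: assumed host names and set name, password with a reserved character.
replica_set_uri = build_mongo_uri(
    ["mongo1:27017", "mongo2:27017", "mongo3:27017"],
    "root",
    "p@ss",
    replica_set="rs0",
)
```

Either string could then be passed as the `uri` argument of `MongoConnector` without changing the connector itself.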

Tasks

1. Docker Configuration

Add MongoDB service to docker-compose.yml:

mongodb:
  image: mongo:latest
  container_name: mongodb
  restart: unless-stopped
  ports:
    - "27017:27017"
  volumes:
    - mongodb_data:/data/db
  environment:
    MONGO_INITDB_ROOT_USERNAME: root
    MONGO_INITDB_ROOT_PASSWORD: example

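Ensure the mongodb_data named volume is declared under the top-level volumes key of the compose file; otherwise docker-compose rejects the service's volume reference as undefined.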

2. MongoDB Connector Repository

Create src/infrastructure/repositories/mongodb/mongo_connector.py with flexible connection patterns:

from pymongo import MongoClient
from pymongo.collection import Collection
from pymongo.database import Database

from src.config.logger import Logger

logger = Logger()


class MongoConnector:
    def __init__(self, uri: str, db_name: str, collection_name: str | None = None):
        self.uri = uri
        self.db_name = db_name
        self.collection_name = collection_name
        # Populated only while the connector is used as a context manager
        self._client: MongoClient | None = None
        self._db: Database | None = None

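    def _create_client(self) -> MongoClient:
        """Creates a client and verifies that the server is reachable."""
        client = None
        try:
            client = MongoClient(self.uri)
            # MongoClient connects lazily, so force a round trip to test the connection
            client.admin.command('ping')
            return client
        except Exception as e:
            logger.error(f"Error creating MongoDB connection: {e}")
            # Release the client so a failed ping does not leak its resources
            if client is not None:
                client.close()
            raise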

    def get_collection(self, collection_name: str | None = None) -> Collection:
        """Creates a new connection and returns the specified collection.

        The caller owns the connection and must close it when done, for example
        with collection.database.client.close().
        """
        # Validate first so an invalid call never opens a connection
        target_collection = collection_name or self.collection_name
        if not target_collection:
            raise ValueError("collection_name must be provided")

        client = self._create_client()
        return client[self.db_name][target_collection]

    def get_database(self) -> Database:
        """Creates a new connection and returns the database.

        The caller owns the connection and must close it when done, for example
        with db.client.close().
        """
        client = self._create_client()
        return client[self.db_name]

    def __enter__(self):
        """Context manager entry - creates a new connection and returns collection or database."""
        self._client = self._create_client()
        self._db = self._client[self.db_name]

        if self.collection_name:
            return self._db[self.collection_name]
        return self._db

    def __exit__(self, exc_type, exc_val, exc_tb):
        """Context manager exit - closes the created connection."""
        if self._client is not None:
            try:
                self._client.close()
            except Exception as e:
                logger.warning(f"Error closing MongoDB connection: {e}")
            finally:
                self._client = None
                self._db = None

        # Log only if there was an error during the operation
        if exc_type is not None:
            logger.error(f"Error during MongoDB operation: {exc_val}")

3. Configuration and Environment

Update src/config/settings.py to include MongoDB settings:

MONGO_URI: str = Field(default="mongodb://root:example@localhost:27017/")
MONGO_DATABASE: str = Field(default="whatyousaid")
MONGO_CONNECTION_TIMEOUT: int = Field(default=5000)
MONGO_MAX_POOL_SIZE: int = Field(default=10)
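As written, MongoConnector passes only the URI to MongoClient, so the timeout and pool size settings above would have no effect. A minimal sketch of how they could be forwarded follows. The helper `mongo_client_options` is hypothetical, and it assumes that MONGO_CONNECTION_TIMEOUT is expressed in milliseconds and is meant to bound both server selection and socket connection; `serverSelectionTimeoutMS`, `connectTimeoutMS`, and `maxPoolSize` are existing MongoClient keyword arguments.

```python
def mongo_client_options(connection_timeout_ms: int, max_pool_size: int) -> dict:
    """Map the settings values to MongoClient keyword arguments (hypothetical helper)."""
    return {
        # How long the driver waits to find a usable server before raising
        "serverSelectionTimeoutMS": connection_timeout_ms,
        # How long the driver waits when opening a socket to a server
        "connectTimeoutMS": connection_timeout_ms,
        # Upper bound on pooled connections per server
        "maxPoolSize": max_pool_size,
    }


# Values mirror the defaults of MONGO_CONNECTION_TIMEOUT and MONGO_MAX_POOL_SIZE.
options = mongo_client_options(5000, 10)

# Inside the connector this could then become:
#     client = MongoClient(self.uri, **options)
```

Because each MongoClient keeps its own pool, the pool size only helps if a client is reused across operations rather than created per call.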

4. Dependencies

Add to requirements.txt:

pymongo>=4.6.0

5. Usage Examples

Create documentation with usage examples:

Using context manager with collection:

from src.infrastructure.repositories.mongodb.mongo_connector import MongoConnector
from src.config.settings import settings

# Initialize connector
connector = MongoConnector(
    uri=settings.MONGO_URI,
    db_name=settings.MONGO_DATABASE,
    collection_name="videos"
)

# Use as context manager
with connector as collection:
    # Insert document
    result = collection.insert_one({"title": "Example Video", "duration": 120})
    
    # Find documents
    videos = collection.find({"duration": {"$gt": 60}})
    for video in videos:
        print(video)

Using context manager with database:

connector = MongoConnector(
    uri=settings.MONGO_URI,
    db_name=settings.MONGO_DATABASE
)

with connector as db:
    # Access multiple collections; video_id is the _id of a previously stored video
    video = db["videos"].find_one({"_id": video_id})
    metadata = db["metadata"].find_one({"video_id": video_id})

Direct collection access:

connector = MongoConnector(
    uri=settings.MONGO_URI,
    db_name=settings.MONGO_DATABASE
)

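collection = connector.get_collection("videos")
try:
    video = collection.find_one({"title": "Example Video"})
finally:
    # get_collection opens a dedicated client, so close it manually when done
    collection.database.client.close()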

Acceptance Criteria

  • MongoDB service runs successfully with docker-compose up
  • MongoConnector can establish connection to local MongoDB instance
  • Context manager properly manages connection lifecycle (open/close)
  • Connection health check (ping) runs each time a connection is created
  • Both collection and database access patterns work correctly
  • Error handling and logging implemented for connection failures
  • Environment configuration is properly set up
  • Unit tests cover connection establishment, context manager, and error scenarios
  • Documentation includes usage examples and best practices
  • Dependencies declared in requirements.txt with an explicit version constraint
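The unit test criterion can be met without a running MongoDB by injecting a fake client. The sketch below illustrates the idea with unittest.mock. `SketchConnector` is a simplified stand-in written for this example, not the project's MongoConnector, and its `client_factory` parameter is an assumption made so that a fake can be injected; with the real class, patching MongoClient in the connector module would serve the same purpose.

```python
from unittest.mock import MagicMock


class SketchConnector:
    """Simplified stand-in for MongoConnector that accepts an injectable client factory."""

    def __init__(self, client_factory, db_name, collection_name=None):
        self._client_factory = client_factory
        self.db_name = db_name
        self.collection_name = collection_name
        self._client = None

    def __enter__(self):
        self._client = self._client_factory()
        try:
            self._client.admin.command("ping")
        except Exception:
            # Close the client if the health check fails, then re-raise
            self._client.close()
            self._client = None
            raise
        db = self._client[self.db_name]
        return db[self.collection_name] if self.collection_name else db

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self._client is not None:
            self._client.close()
            self._client = None
        return False  # never swallow exceptions from the with body


def test_closes_client_on_normal_exit():
    fake_client = MagicMock()
    with SketchConnector(lambda: fake_client, "whatyousaid", "videos") as collection:
        collection.insert_one({"title": "Example Video"})
    fake_client.admin.command.assert_called_once_with("ping")
    fake_client.close.assert_called_once()


def test_closes_client_when_body_raises():
    fake_client = MagicMock()
    raised = False
    try:
        with SketchConnector(lambda: fake_client, "whatyousaid"):
            raise RuntimeError("boom")
    except RuntimeError:
        raised = True
    assert raised
    fake_client.close.assert_called_once()


def test_closes_client_when_ping_fails():
    fake_client = MagicMock()
    fake_client.admin.command.side_effect = ConnectionError("unreachable")
    raised = False
    try:
        with SketchConnector(lambda: fake_client, "whatyousaid"):
            pass
    except ConnectionError:
        raised = True
    assert raised
    fake_client.close.assert_called_once()
```

The three functions follow the pytest naming convention, so a test runner would collect them as written; they cover the normal lifecycle, an error inside the with body, and a failed connection check.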

Related

This is part of the database infrastructure for storing video metadata, transcriptions, and application data in WhatYouSaid.

Labels

feature: New feature or request