Add Docker Compose with Ollama (CPU and GPU) support for LangChain integration #10

Description

@ericksonlopes

Problem

Currently, the project has no standard way to run Ollama via Docker Compose. This makes it hard to reproduce locally the examples that use ChatOllama with base_url="http://localhost:11434" for LangChain integration.

Example Usage

Here's the expected workflow:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama

# Pull the model first: docker exec -it ollama ollama pull llama3.2
# It is stored locally as llama3.2:latest

llm = ChatOllama(model="llama3.2", base_url="http://localhost:11434")
prompt = ChatPromptTemplate.from_template("Question: {input}\nAnswer in English:")
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"input": "What are the advantages of GPU in Ollama?"}))
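
Before invoking the chain, it can help to confirm that the model has actually been pulled. The following is a minimal sketch rather than part of the proposal. It assumes Ollama's `GET /api/tags` endpoint, which lists locally available models, and the helper names `fetch_tags` and `model_available` are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_tags(base_url="http://localhost:11434"):
    """Return the parsed JSON body of Ollama's /api/tags endpoint."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def model_available(tags, model):
    """True if `model` matches a local model name, with or without a tag."""
    names = [m.get("name", "") for m in tags.get("models", [])]
    if ":" in model:
        # An explicit tag such as "llama3.2:latest" must match exactly.
        return model in names
    # A bare name such as "llama3.2" matches any tag of that model.
    return any(name.split(":", 1)[0] == model for name in names)
```

With the container running, `model_available(fetch_tags(), "llama3.2")` should return True once the pull has finished.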

Proposed Solution

1. Create docker-compose.cpu.yml

A lightweight configuration for CPU-only execution:

services:
  ollama:
    image: ollama/ollama:latest
    # Fixed name so that `docker exec -it ollama ...` finds the container
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    restart: unless-stopped

volumes:
  ollama_data:
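
After `docker compose up -d` returns, the server may need a moment before it accepts requests. The following is a rough readiness check, assuming the Ollama root URL answers with HTTP 200 once the server is up. The function names `ollama_is_up` and `wait_until_ready` and the retry defaults are illustrative choices.

```python
import time
from urllib.error import URLError
from urllib.request import urlopen


def ollama_is_up(base_url="http://localhost:11434"):
    """Probe the server root; an HTTP 200 response counts as ready."""
    try:
        with urlopen(base_url, timeout=2) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_ready(probe=ollama_is_up, attempts=30, delay=1.0):
    """Call `probe` until it returns True or `attempts` runs out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the probe as a parameter keeps the retry loop easy to exercise without a running container.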

2. Create docker-compose.gpu.yml

GPU-accelerated configuration for NVIDIA GPUs:

services:
  ollama:
    image: ollama/ollama:latest
    # Fixed name so that `docker exec -it ollama ...` finds the container
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    restart: unless-stopped

volumes:
  ollama_data:
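
One way to check that the GPU configuration is taking effect is to look at how much of a loaded model sits in VRAM. The sketch below assumes Ollama's `GET /api/ps` endpoint and its `size` and `size_vram` fields for running models. The helper names are hypothetical, and a model only appears in the listing after it has been loaded, for example by one chain invocation.

```python
import json
from urllib.request import urlopen


def fetch_running(base_url="http://localhost:11434"):
    """Return the parsed JSON body of /api/ps (currently loaded models)."""
    with urlopen(f"{base_url}/api/ps", timeout=5) as resp:
        return json.load(resp)


def vram_fractions(ps):
    """Map each loaded model name to the share of its size held in VRAM."""
    fractions = {}
    for m in ps.get("models", []):
        size = m.get("size", 0)
        if size > 0:
            fractions[m.get("name", "?")] = m.get("size_vram", 0) / size
    return fractions
```

A fraction near 1.0 suggests the model is fully offloaded to the GPU, while 0.0 suggests it is running on the CPU.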

3. Update README.md

Add a new section explaining how to start each configuration and pull a model:

CPU Setup:

docker compose -f docker-compose.cpu.yml up -d
docker exec -it ollama ollama pull llama3.2

GPU Setup:

docker compose -f docker-compose.gpu.yml up -d
docker exec -it ollama ollama pull llama3.2

Note: Requires the NVIDIA Container Toolkit to be installed and configured on the host.
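
The two setups differ only in the compose file, so a small helper could pick one automatically. This is a sketch under stated assumptions: treating `nvidia-smi` on the PATH as a sign of a usable GPU is only a rough heuristic, the function names are made up for illustration, and the container is assumed to be reachable under the name `ollama`, as in the commands above. The `-it` flags are dropped because the commands are meant to run without a terminal.

```python
import shutil


def compose_file(has_gpu):
    """Return the compose file name for the chosen hardware."""
    return "docker-compose.gpu.yml" if has_gpu else "docker-compose.cpu.yml"


def setup_commands(has_gpu, model="llama3.2"):
    """Return the two setup commands as argument lists for subprocess.run."""
    return [
        ["docker", "compose", "-f", compose_file(has_gpu), "up", "-d"],
        ["docker", "exec", "ollama", "ollama", "pull", model],
    ]


def detect_nvidia():
    """Rough heuristic: treat a host with nvidia-smi on PATH as GPU-capable."""
    return shutil.which("nvidia-smi") is not None
```

Each list returned by `setup_commands(detect_nvidia())` can be passed to `subprocess.run(cmd, check=True)` in order.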

Benefits

  • ✅ Streamlines onboarding for local development
  • ✅ Ensures consistent environment for Ollama + LangChain integration
  • ✅ Leverages GPU when available for improved latency and throughput
  • ✅ Simplifies model management via docker exec commands
  • ✅ Persists pulled models across restarts with Docker volumes

Prerequisites for GPU

  • NVIDIA GPU
  • NVIDIA Container Toolkit installed and configured
  • Docker with the NVIDIA runtime configured

Additional Notes

  • Models will be persisted in the ollama_data volume
  • The Ollama API will be accessible at http://localhost:11434
  • Multiple models can be pulled and managed independently

Metadata

Labels

  • feature: New feature or request
  • langchain: Issues related to LangChain framework integration and usage
  • llm: Large Language Model related features and implementations
