Problem
Currently, the project lacks a standard way to run Ollama via Docker Compose, which makes it difficult to reproduce examples locally that use ChatOllama with base_url="http://localhost:11434" for LangChain integration.
Example Usage
Here's the expected workflow:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
# Pull the model first: docker exec -it ollama ollama pull llama3.2
# The bare name "llama3.2" resolves to the llama3.2:latest tag.
llm = ChatOllama(model="llama3.2", base_url="http://localhost:11434")
prompt = ChatPromptTemplate.from_template("Question: {input}\nAnswer in English:")
chain = prompt | llm | StrOutputParser()
print(chain.invoke({"input": "What are the advantages of GPU in Ollama?"}))
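Before invoking the chain, it can help to confirm that the model has actually been pulled. The following is a minimal sketch, not part of the proposal itself. It assumes that Ollama's GET /api/tags endpoint returns a JSON object with a "models" list whose entries carry a "name" such as "llama3.2:latest". The helper names has_model and fetch_tags are hypothetical.

```python
import json
from urllib.request import urlopen


def has_model(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in a /api/tags response payload."""
    # A bare name such as "llama3.2" refers to the ":latest" tag.
    wanted = model if ":" in model else f"{model}:latest"
    names = {entry.get("name") for entry in tags_payload.get("models", [])}
    return wanted in names


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of locally available models (assumed endpoint: /api/tags)."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)
```

With the container running, has_model(fetch_tags(), "llama3.2") should return False until the pull command has completed.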
Proposed Solution
1. Create docker-compose.cpu.yml
A lightweight configuration for CPU-only execution:
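services:
  ollama:
    image: ollama/ollama:latest
    # Fixed name so that "docker exec -it ollama ..." finds the container;
    # without it, Compose generates a project-prefixed name.
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    restart: unless-stopped

volumes:
  ollama_data: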
2. Create docker-compose.gpu.yml
GPU-accelerated configuration for NVIDIA GPUs:
services:
  ollama:
    image: ollama/ollama:latest
    # Fixed name so that "docker exec -it ollama ..." finds the container;
    # without it, Compose generates a project-prefixed name.
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    restart: unless-stopped

volumes:
  ollama_data:
3. Update README.md
Add a new section explaining how to start each configuration and pull a model:
CPU Setup:
docker compose -f docker-compose.cpu.yml up -d
docker exec -it ollama ollama pull llama3.2
GPU Setup:
docker compose -f docker-compose.gpu.yml up -d
docker exec -it ollama ollama pull llama3.2
Note: Requires NVIDIA Container Toolkit to be configured on the host.
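After docker compose up -d returns, the API may need a moment before it accepts requests. The following is a minimal sketch of a readiness check that a script could run before pulling a model or invoking a chain. It assumes that the server answers a plain GET on its base URL with HTTP 200 once it is up. The names is_up and wait_for_ollama, and the default timeout values, are hypothetical.

```python
import time
from urllib.request import urlopen


def is_up(base_url: str) -> bool:
    """Return True if the server answers a GET on base_url with HTTP 200."""
    try:
        with urlopen(base_url, timeout=2) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP errors (URLError).
        return False


def wait_for_ollama(
    base_url: str = "http://localhost:11434",
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    probe=is_up,
) -> bool:
    """Poll `probe` until it succeeds or `timeout_s` seconds have elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(base_url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

The probe parameter is injectable so that the polling logic can be exercised without a running container.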
Benefits
- ✅ Streamlines onboarding for local development
- ✅ Ensures consistent environment for Ollama + LangChain integration
- ✅ Leverages GPU when available for improved latency and throughput
- ✅ Simplifies model management via docker exec commands
- ✅ Persists models across container restarts with Docker volumes
Prerequisites for GPU
- NVIDIA GPU
- NVIDIA Container Toolkit installed and configured
- Docker configured with NVIDIA runtime support
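Once these prerequisites are met, it is worth checking that a loaded model actually resides in GPU memory. The following is a minimal sketch under stated assumptions: that Ollama's GET /api/ps endpoint lists currently loaded models, and that each entry carries "name", "size", and "size_vram" fields in bytes. The helper names vram_fraction and fetch_running are hypothetical.

```python
import json
from urllib.request import urlopen


def vram_fraction(ps_payload: dict) -> dict:
    """Map each loaded model name to the fraction of its size held in GPU memory."""
    fractions = {}
    for entry in ps_payload.get("models", []):
        size = entry.get("size", 0)
        in_vram = entry.get("size_vram", 0)
        # Guard against a missing or zero size to avoid division by zero.
        fractions[entry.get("name", "?")] = in_vram / size if size else 0.0
    return fractions


def fetch_running(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the currently loaded models (assumed endpoint: /api/ps)."""
    with urlopen(f"{base_url}/api/ps", timeout=5) as response:
        return json.load(response)
```

A model only appears in this list after it has been loaded by a request, so vram_fraction(fetch_running()) is most informative right after a chain invocation. A value near 1.0 suggests the model is fully on the GPU, and 0.0 suggests CPU-only execution.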
Additional Notes
- Models are persisted in the ollama_data volume
- The Ollama API is accessible at http://localhost:11434
- Multiple models can be pulled and managed independently