AI investment advisory agent for the Vietnamese stock market - an intelligent stock market analysis system built with LangGraph, the VnStock API, and LLMs.
- Quick Start
- System Requirements
- Installation Guide
- Database Setup
- Environment Configuration
- Running the Application
- Features
- Architecture
- Troubleshooting
- Development
# 1. Clone and navigate to project
git clone <repo-url>
cd financial_agent_fork
# 2. Create Python virtual environment
python -m venv venv
.\venv\Scripts\activate # Windows
source venv/bin/activate # macOS/Linux
# 3. Install dependencies
pip install -r requirements.txt
# 4. Copy environment template
cp .env.example .env
# Edit .env with your settings
# 5. Setup database (PostgreSQL required)
# See Database Setup section below
# 6. Run the API server
python main.py

Visit http://localhost:8000/docs to test the API.
- Python: 3.9 or higher
- RAM: 8GB minimum (16GB recommended)
- Disk Space: 5GB free space
- OS: Windows 10+, macOS 10.14+, or Linux
- PostgreSQL Database (v12 or higher)
  - Local installation or cloud service (AWS RDS, Azure Database, etc.)
  - At least 2GB storage recommended
- LLM Provider (choose one)
  - Google Gemini: Free API key from Google AI Studio
  - Ollama: Local LLM server (free, no API key needed)
- Qdrant Vector Database (choose one)
  - Qdrant Cloud: Free tier available at cloud.qdrant.io
  - Qdrant Local: Docker container or local installation
- Optional: Tesseract OCR
  - Required only for processing scanned PDF documents
Installation Guide
git clone <repository-url>
cd financial_agent_fork

# Windows
python -m venv venv
.\venv\Scripts\activate
# macOS/Linux
python -m venv venv
source venv/bin/activate

Verify Python version:

python --version  # Should be 3.9 or higher

# Upgrade pip
python -m pip install --upgrade pip
# Install all required packages
pip install -r requirements.txt

Installation may take 5-10 minutes due to native dependencies.
For processing scanned PDFs and images:
Windows:
# Download installer from:
# https://github.com/UB-Mannheim/tesseract/wiki/Downloads
# Then run setup and add to your .env:
TESSERACT_PATH=C:\Program Files\Tesseract-OCR\tesseract.exe

macOS:
brew install tesseract

Linux (Ubuntu/Debian):
sudo apt-get install tesseract-ocr

Verify the core Python packages are installed:

python -c "import langchain; print('✓ LangChain installed')"
python -c "import fastapi; print('✓ FastAPI installed')"
python -c "import vnstock; print('✓ VnStock installed')"
python -c "import qdrant_client; print('✓ Qdrant client installed')"This project uses PostgreSQL as the primary relational database, with Qdrant as the vector database for RAG features.
Windows:
- Download PostgreSQL from postgresql.org
- Run the installer and follow the installation wizard
- Remember the superuser password
- Verify installation:
  psql --version
- Connect to PostgreSQL:
  psql -U postgres
macOS:
# Using Homebrew
brew install postgresql@15
# Start PostgreSQL service
brew services start postgresql@15
# Connect to PostgreSQL
psql postgres

Linux (Ubuntu/Debian):
# Update package list
sudo apt-get update
# Install PostgreSQL
sudo apt-get install postgresql postgresql-contrib
# Start PostgreSQL service
sudo systemctl start postgresql
sudo systemctl enable postgresql
# Connect to PostgreSQL
sudo -u postgres psql

Docker:
# Run PostgreSQL container
docker run --name financial-db \
-e POSTGRES_USER=financial_user \
-e POSTGRES_PASSWORD=financial_password \
-e POSTGRES_DB=financial_agent \
-p 5432:5432 \
-v postgres_data:/var/lib/postgresql/data \
-d postgres:15
# Verify container is running
docker ps

# Connect to PostgreSQL
psql -U postgres
# Inside psql shell:
CREATE USER financial_user WITH PASSWORD 'financial_password';
CREATE DATABASE financial_agent OWNER financial_user;
# Grant privileges
GRANT ALL PRIVILEGES ON DATABASE financial_agent TO financial_user;
# Connect to the new database
\c financial_agent
# Verify connection
\dt

Connection String:
postgresql://financial_user:financial_password@localhost:5432/financial_agent
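To confirm these credentials work from Python before running migrations, here is a quick check using SQLAlchemy (a minimal sketch; the project already relies on SQLAlchemy for its ORM layer):

from sqlalchemy import create_engine, text

# Same URL that goes into DATABASE_URL in .env
DATABASE_URL = "postgresql://financial_user:financial_password@localhost:5432/financial_agent"

engine = create_engine(DATABASE_URL)
with engine.connect() as conn:
    version = conn.execute(text("SELECT version()")).scalar()
    print("✓ Connected:", version)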
AWS RDS:
- Go to AWS RDS Console
- Click "Create Database"
- Select PostgreSQL engine
- Configure settings and note the endpoint
- Add connection string to .env:
  DATABASE_URL=postgresql://username:password@endpoint:5432/financial_agent
Azure Database for PostgreSQL:
- Go to Azure Portal
- Create new "Azure Database for PostgreSQL"
- Configure and get connection details
- Add to .env
Supabase (PostgreSQL as a Service):
- Sign up at supabase.com
- Create new project
- Copy connection string from project settings
- Add to .env:
  DATABASE_URL=postgresql://[user]:[password]@[host]:[port]/[database]
After PostgreSQL is ready, initialize the application database:
# Navigate to project root
cd financial_agent_fork
# Run migrations using Alembic
alembic upgrade head

Expected Output:
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
INFO [alembic.runtime.migration] Running upgrade -> xxxxx, Initial migration
# Connect to database
psql -U financial_user -d financial_agent -h localhost
# List all tables
\dt
# Expected tables:
# - users
# - chat_sessions
# - chat_messages
# - audit_logs
# - document_uploads
# Exit psql
\q

Qdrant stores vector embeddings for RAG (Retrieval Augmented Generation) features.
- Sign Up: Go to cloud.qdrant.io
- Create Cluster:
  - Click "Create Cluster"
  - Select region (choose closest to your location)
  - Name: financial-agent or similar
  - Free tier available for testing
- Get Credentials:
  - Copy the API Key and Cluster URL
  - Add to .env:
    QDRANT_MODE=cloud
    QDRANT_CLOUD_URL=https://your-cluster.qdrant.io
    QDRANT_CLOUD_API_KEY=your-api-key
- Verify Connection:
python -c "from qdrant_client import QdrantClient; c = QdrantClient(url='YOUR_URL', api_key='YOUR_KEY'); print('✓ Qdrant connected')"
# Run Qdrant container
docker run --name qdrant \
-p 6333:6333 \
-p 6334:6334 \
-v qdrant_storage:/qdrant/storage \
-d qdrant/qdrant
# Verify container
docker ps
# Check web interface
# Visit http://localhost:6333/dashboardAdd to .env:
QDRANT_MODE=local
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=
# Download and run Qdrant locally
# Visit https://qdrant.tech/documentation/quick-start/ for platform-specific instructions
# macOS:
brew install qdrant
# Linux:
docker run -p 6333:6333 qdrant/qdrant

# Copy the template
cp .env.example .env

Edit .env with all required values:
# ==========================================
# DATABASE CONFIGURATION
# ==========================================
DATABASE_URL=postgresql://financial_user:financial_password@localhost:5432/financial_agent
JWT_SECRET_KEY=your-super-secret-key-change-this-in-production
ADMIN_USERNAME=admin
ADMIN_PASSWORD=your_secure_password_here
# ==========================================
# LLM PROVIDER CONFIGURATION
# ==========================================
LLM_PROVIDER=gemini # Options: 'gemini' or 'ollama'
GOOGLE_API_KEY=your_api_key # Required if using Gemini
LLM_MODEL=gemini-2.5-flash # Google Gemini model
# OR for Ollama:
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=qwen3:8b
# LLM Settings
LLM_TEMPERATURE=0.3
LLM_MAX_TOKENS=2048
# ==========================================
# QDRANT VECTOR DATABASE
# ==========================================
QDRANT_MODE=cloud # 'cloud' or 'local'
# Cloud Settings:
QDRANT_CLOUD_URL=https://your-instance.qdrant.io
QDRANT_CLOUD_API_KEY=your-qdrant-api-key
# OR Local Settings:
# QDRANT_URL=http://localhost:6333
# QDRANT_API_KEY=
# Timeout settings
QDRANT_TIMEOUT_SECONDS=120
QDRANT_RETRY_ATTEMPTS=3
QDRANT_RETRY_DELAY_SECONDS=2.0
# ==========================================
# EMBEDDING CONFIGURATION
# ==========================================
EMBEDDING_MODEL_FINANCIAL=fin-e5-small
EMBEDDING_MODEL_GENERAL=sentence-transformers/all-MiniLM-L6-v2
CHUNK_SIZE_TOKENS=512
CHUNK_OVERLAP_TOKENS=50
# ==========================================
# RAG CONFIGURATION
# ==========================================
ENABLE_RAG=True
RAG_PRIORITY_MODE=personal-first
RAG_SIMILARITY_THRESHOLD=0.1
RAG_TOP_K_RESULTS=20
RAG_MIN_RELEVANCE=0.3
RAG_MAX_DOCUMENTS=5
# ==========================================
# FEATURE FLAGS
# ==========================================
DEBUG=False
ENABLE_TOOLS=True
ENABLE_SUMMARIZATION=True
ENABLE_QUERY_REWRITING=True
# ==========================================
# API CONFIGURATION
# ==========================================
API_HOST=0.0.0.0
API_PORT=8000
CORS_ORIGINS=http://localhost:5173,http://localhost:3000,http://localhost:8000
# ==========================================
# RATE LIMITING
# ==========================================
RATE_LIMIT_REQUESTS=100
RATE_LIMIT_PERIOD_MINUTES=60

Verify the configuration loads correctly:

python -c "from src.core.config import settings; print('✓ Configuration loaded'); print(f'DB: {settings.DATABASE_URL}'); print(f'LLM: {settings.LLM_PROVIDER}')"

- ✅ Company information: company name, industry, charter capital, history
- ✅ Major shareholders: top shareholders with detailed ownership percentages
- ✅ Leadership: list of executives and their ownership stakes
- ✅ Subsidiaries: subsidiaries/affiliates with holding percentages
- ✅ Corporate events: dividends, AGMs, capital increases, and more
- ✅ Historical prices (OHLCV): Open, High, Low, Close, Volume
  - By specific dates: start_date and end_date
  - By period: 3M, 6M, 1Y
  - Detailed display as a table
- ✅ SMA (Simple Moving Average): price trend analysis
  - Compute SMA with a custom window (SMA-9, SMA-20, SMA-50, ...)
  - Compare price against the SMA to determine the trend
  - Detailed day-by-day table
- ✅ RSI (Relative Strength Index): overbought/oversold assessment (see the indicator sketch after this list)
  - RSI > 70: overbought (downside warning)
  - RSI < 30: oversold (potential upside)
  - Detailed table with status per day
- ✅ Financial report analysis (images):
  - OCR from PDF/PNG/JPG images
  - Report classification: balance sheet (BCDN), income statement (KQKD), cash flow, ratios
  - Data extraction + Markdown table generation
  - Detailed analysis with Gemini AI
- ✅ PDF file processing:
  - Text extraction from native PDFs
  - Automatic OCR for scanned PDFs
  - Tables and structured data
  - Intelligent analysis with Gemini
- ✅ Excel file analysis:
  - Conversion to Markdown tables
  - Multi-sheet support
  - Vietnamese number formatting
  - Detailed financial analysis
- 📋 Markdown tables with detailed, readable data
- 📊 Summary statistics after each table
- 💡 Professional analysis and conclusions
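For readers who want to see what the SMA and RSI features compute, below is a minimal pandas sketch (illustrative only; the project's own implementations live in src/tools/technical_tools.py and may differ, e.g. by using TA-Lib):

import pandas as pd

def sma(close: pd.Series, window: int = 20) -> pd.Series:
    """Simple Moving Average over `window` sessions."""
    return close.rolling(window=window).mean()

def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """Relative Strength Index: >70 overbought, <30 oversold."""
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(window=window).mean()
    loss = (-delta.clip(upper=0)).rolling(window=window).mean()
    rs = gain / loss
    return 100 - 100 / (1 + rs)

# Example with a small synthetic closing-price series
close = pd.Series([65.0, 65.4, 64.8, 66.1, 66.5, 67.0, 66.2, 67.8,
                   68.1, 67.5, 68.9, 69.2, 68.7, 69.8, 70.1, 69.5,
                   70.4, 71.0, 70.6, 71.3, 72.0])
print(sma(close, 20).iloc[-1])  # SMA-20 of the latest session
print(rsi(close, 14).iloc[-1])  # RSI-14 of the latest session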
- Backend: FastAPI (REST API)
- Agent Framework: LangChain + LangGraph (ReAct Pattern)
- LLM Providers:
- ☁️ Google Gemini (Cloud) - for financial analysis & OCR
- 🖥️ Ollama (Local) - for chat & analysis
- Data Source: VnStock3 API (Free)
- Technical Analysis: TA-Lib
- Document Processing:
- pytesseract + OpenCV (OCR)
- pdfplumber (PDF text extraction)
- pdf2image (PDF to image conversion)
- Excel Processing: openpyxl + pandas
- Frontend: React + Vite + TailwindCSS
financial_agent/
├── src/
│ ├── agent/ # LangGraph Agent
│ │ ├── financial_agent.py
│ │ ├── state.py
│ │ └── prompts/
│ │ ├── system_prompt.txt
│ │ ├── financial_report_prompt.txt
│ │ └── excel_analysis_prompt.txt
│ ├── tools/ # 11+ Tools
│ │ ├── vnstock_tools.py # 5 VnStock tools
│ │ ├── technical_tools.py # 2 Technical analysis tools
│ │ ├── financial_report_tools.py # Financial report analysis (OCR + Gemini)
│ │ ├── pdf_tools.py # PDF document processing
│ │ └── excel_tools.py # Excel analysis tools
│ ├── llm/ # LLM Factory
│ │ ├── llm_factory.py
│ │ └── config.py
│ └── api/ # FastAPI
│ └── app.py
├── frontend/ # React UI
│ ├── src/
│ │ ├── components/
│ │ └── App.jsx
│ └── package.json
├── tests/ # Unit Tests
├── test_auto.py # Automated Test Script
└── requirements.txt
# Clone or cd into the project directory
cd financial_agent
# Create a virtual environment
python -m venv venv
# Activate the venv
# Windows
venv\Scripts\activate
# Linux/Mac
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt

You can choose one of two LLM providers:
Option 1: Google Gemini (Cloud)

Advantages: fast, powerful, no GPU required
- Get a free API key at: https://aistudio.google.com/apikey
- Update the .env file:

# Google Gemini
GOOGLE_API_KEY=your_api_key_here
LLM_PROVIDER=gemini
LLM_MODEL=gemini-2.5-flash

Option 2: Ollama (Local)

Advantages: runs offline, private, completely free
Requirements: RAM >= 8GB (16GB recommended), GPU with VRAM >= 4GB (optional)

Step 1: Download and install Ollama

- Windows:
  - Download from: https://ollama.com/download/windows
  - Run OllamaSetup.exe
  - Follow the installer (Next → Next → Install)
- macOS:
  brew install ollama
- Linux:
  curl -fsSL https://ollama.com/install.sh | sh
Step 2: Start Ollama

# Run the Ollama server (starts automatically in the background on Windows)
ollama serve

Step 3: Pull a model

Choose one of the following models (depending on your hardware):

# Small model (4-8GB RAM) - fast
ollama pull qwen3:8b
# Medium model (8-16GB RAM) - balanced
ollama pull llama3.1:8b
ollama pull qwen2.5:7b
# Large model (16GB+ RAM, 8GB+ GPU) - high quality
ollama pull qwen3:14b
ollama pull llama3.1:70b

Step 4: Check the installed models

ollama list

Step 5: Update .env
# Ollama Local
LLM_PROVIDER=ollama
OLLAMA_MODEL=qwen3:8b  # Replace with the model you pulled
OLLAMA_BASE_URL=http://localhost:11434

Ollama notes:
- The qwen3:8b model (8B parameters) needs ~8GB RAM
- The llama3.1:8b model (8B parameters) needs ~8GB RAM
- If you hit an "out of memory" error, try a smaller model or switch to Gemini
- Check that Ollama is running: ollama list
Tesseract is used to OCR financial report images. You can skip it if you only use Gemini Vision or native PDFs.

Windows:
- Download the installer: https://github.com/UB-Mannheim/tesseract/wiki
- Run tesseract-ocr-w64-setup-v5.x.exe
- Follow the installer (default path: C:\Program Files\Tesseract-OCR)
- Update .env:

# Optional: only needed if installed in a custom location
TESSERACT_PATH=C:\Program Files\Tesseract-OCR\tesseract.exe

Linux (Ubuntu/Debian):
sudo apt-get install tesseract-ocr libtesseract-dev

macOS:
brew install tesseract

Verify the installation:
tesseract --version

# Activate venv (if not already active)
venv\Scripts\activate # Windows
source venv/bin/activate # Linux/Mac
# Run the FastAPI server (main.py loads the config automatically)
python main.py
# Server runs at: http://localhost:8000

# New terminal, cd into the frontend
cd frontend
# Install dependencies (first time only)
npm install
# Run the dev server
npm run dev
# Frontend runs at: http://localhost:5173

# Test endpoint
curl -X POST http://localhost:8000/api/chat \
-H "Content-Type: application/json" \
-d "{\"question\": \"Thông tin về VNM\"}"

# Create a sample Excel file with test questions
python create_sample_test.py
# Run the automated tests (make sure the backend is running)
python test_auto.py test_questions_sample.xlsx
# Results are saved to test_results_[timestamp].xlsx

See the tests/ and ps_test/ directories for details.
Company information:
- "Tell me about VNM"
- "What industry is VCB in?"
Shareholders & Leadership:
- "Who are the major shareholders of VCB?"
- "Who is on HPG's leadership team?"
- "What subsidiaries does VNM have?"
Events:
- "Recent events at FPT"
- "Does VCB pay dividends?"
Price data:
- "VCB price over the last 3 months"
- "OHLCV of HPG since the start of 2024"
Technical analysis:
- "Calculate SMA-20 for HPG"
- "Calculate SMA-9 and SMA-20 for TCB since the start of November"
- "Current RSI of VIC"
- "Is HPG overbought?"
Combined analysis:
- "Give a comprehensive analysis of VNM"
- "Compare VCB and TCB prices over 6 months"
Financial reports (images):
Send an image of a financial report (balance sheet, income statement, cash flow):
- Upload a PNG/JPG of the report
- The agent will OCR, analyze, and build Markdown tables
PDF files:
Send a financial report PDF:
- Upload a PDF (native text or scanned)
- The agent extracts text and tables
- Detailed AI analysis
Excel files:
Send an Excel file of financial data:
- Upload an .xlsx/.xls file
- The agent converts it to Markdown
- Analyzes the financial data
# Activate virtual environment
.\venv\Scripts\activate # Windows
source venv/bin/activate # macOS/Linux
# Start the FastAPI server
python main.py

Expected Output:
╔══════════════════════════════════════════════════════════════╗
║ Financial Agent API ║
║ Vietnamese Stock Market Investment Assistant ║
╚══════════════════════════════════════════════════════════════╝
🚀 Starting server...
📍 API Server: http://0.0.0.0:8000
📚 API Documentation (Swagger UI): http://0.0.0.0:8000/docs
...
Press CTRL+C to quit
In a new terminal:
# Test health check
curl http://localhost:8000/health
# Test chat endpoint
curl -X POST "http://localhost:8000/api/chat" \
-H "Content-Type: application/json" \
-d '{"question": "Thông tin về VNM"}'Open browser and visit: http://localhost:8000/docs
You can test all API endpoints interactively here.
# In a new terminal
cd frontend
# Install dependencies if not already done
npm install
# Start development server
npm run dev

Frontend will be available at: http://localhost:5173
# In a new terminal
cd desktop_app
# Setup (only first time)
npm install
# Start Electron app
npm start

Edit .env to change which LLM is used:
# Google Gemini (Cloud)
LLM_PROVIDER=gemini
LLM_MODEL=gemini-2.5-flash
GOOGLE_API_KEY=your_api_key_here
# Ollama (Local)
LLM_PROVIDER=ollama
OLLAMA_MODEL=qwen3:8b
OLLAMA_BASE_URL=http://localhost:11434

Important: Restart the server after changing .env
# Download and install from https://ollama.com/
# Start Ollama server
ollama serve
# In another terminal, pull a model
ollama pull qwen2.5:7b
# Verify installation
ollama list

Error: "Connection refused"
# Check if Ollama is running
ollama list
# If not running, start it
ollama serve

Error: "Out of memory"
- Use a smaller model: ollama pull qwen3:4b
- Switch to Gemini (cloud-based)
Error: "Model not found"
# List available models
ollama list
# Pull a new model
ollama pull qwen2.5:7b

Recommended Models for Financial Analysis:
- qwen2.5:7b - Best balance of quality and speed
- llama2:13b - High quality but slower
- qwen3:4b - Fast but lower quality (~4GB RAM)
- Go to Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Copy the key and add to .env:
  GOOGLE_API_KEY=your_key_here
  LLM_PROVIDER=gemini
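To sanity-check the key outside the app, here is a minimal sketch using LangChain's Gemini integration (assuming the langchain-google-genai package is available; the model name is taken from the configuration above):

import os
from langchain_google_genai import ChatGoogleGenerativeAI

os.environ.setdefault("GOOGLE_API_KEY", "your_key_here")  # normally loaded from .env

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.3)
print(llm.invoke("Xin chào").content)  # should print a short greeting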
Edit these files to customize agent behavior:
- src/agent/prompts/system_prompt.txt - Main agent prompt
- src/agent/prompts/financial_report_prompt.txt - Financial report analysis
- src/agent/prompts/excel_analysis_prompt.txt - Excel data analysis
Restart server to apply changes.
# Temperature (0.0-1.0): Higher = more creative, Lower = more focused
LLM_TEMPERATURE=0.3
# Maximum length of response
LLM_MAX_TOKENS=2048
# RAG Threshold (0.0-1.0): How relevant documents must be
RAG_SIMILARITY_THRESHOLD=0.1
# Number of documents to retrieve
RAG_TOP_K_RESULTS=20

Only needed for processing scanned PDFs:
Windows:
# Download from: https://github.com/UB-Mannheim/tesseract/wiki/Downloads
# Run installer
# Add to .env:
TESSERACT_PATH=C:\Program Files\Tesseract-OCR\tesseract.exe
macOS:
brew install tesseract

Linux:
sudo apt-get install tesseract-ocr

TA-Lib:
# Already installed via requirements.txt
# Verify installation
python -c "import talib; print('✓ TA-Lib installed')"

GET /health
curl http://localhost:8000/health

Response:
{
"status": "healthy",
"timestamp": "2025-01-11T10:30:00Z"
}

POST /api/chat
Ask the financial agent any question about Vietnamese stocks.
curl -X POST "http://localhost:8000/api/chat" \
-H "Content-Type: application/json" \
-d '{
"question": "What is the latest price of VNM stock?"
}'

Request Body:
{
"question": "Your question here",
"use_rag": true, // Optional: use RAG for document analysis
"session_id": "optional_session_id"
}

Response:
{
"answer": "VNM (Vinamilk) stock information...\n\n| Date | Close | Volume |\n...",
"sources": ["VnStock API", "Company data"],
"processing_time_seconds": 2.5
}

POST /api/upload/financial-report
Analyze financial reports from images (PNG, JPG, PDF).
curl -X POST "http://localhost:8000/api/upload/financial-report" \
-F "file=@financial_report.jpg"Response:
{
"success": true,
"report_type": "Balance Sheet",
"company": "ABC Corporation",
"period": "Q3/2024",
"extracted_text": "...",
"markdown_table": "| Item | Value |\n...",
"analysis": "Financial analysis from AI..."
}

POST /api/upload/pdf
Analyze PDF financial documents.
curl -X POST "http://localhost:8000/api/upload/pdf" \
-F "[email protected]"Response:
{
"success": true,
"file_name": "report.pdf",
"total_pages": 5,
"extracted_text": "...",
"tables_markdown": "| Table | Data |\n...",
"analysis": "Detailed financial analysis...",
"processing_method": "native"
}

POST /api/upload/excel
Analyze Excel financial data files.
curl -X POST "http://localhost:8000/api/upload/excel" \
-F "file=@financial_data.xlsx"Response:
{
"success": true,
"file_name": "financial_data.xlsx",
"sheet_count": 3,
"markdown": "# Financial Data Analysis\n\n## Sheet 1: Revenue\n| Month | Amount |\n...",
"message": "Excel file analysis successful"
}

Visit http://localhost:8000/docs (Swagger UI) to:
- View all available endpoints
- Test endpoints with example data
- See response schemas
- Download API specification
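For scripted access instead of Swagger UI, here is a minimal Python client sketch for the /api/chat endpoint documented above (assumes the requests package; field names follow the request/response examples):

import requests

API_URL = "http://localhost:8000/api/chat"

def ask_agent(question: str, use_rag: bool = True) -> str:
    # Field names follow the request/response schema shown above
    payload = {"question": question, "use_rag": use_rag}
    resp = requests.post(API_URL, json=payload, timeout=120)
    resp.raise_for_status()
    data = resp.json()
    print(f"Answered in {data.get('processing_time_seconds', '?')}s")
    return data["answer"]

print(ask_agent("Thông tin về VNM"))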
Error: "Python version too old"
# Check your Python version
python --version
# Should be 3.9 or higher. If not, download from python.org

Error: "pip install failed"
# Clear pip cache
pip cache purge
# Upgrade pip
python -m pip install --upgrade pip
# Try installing again
pip install -r requirements.txt

Error: "ModuleNotFoundError: No module named 'xxx'"
# Reinstall with force-reinstall
pip install -r requirements.txt --force-reinstall
# Or reinstall specific package
pip install langchain --upgrade

Error: "Connection refused" for PostgreSQL
# Check if PostgreSQL is running
# Windows: Services app → PostgreSQL → Should show "Running"
# macOS: brew services list | grep postgres
# Linux: sudo systemctl status postgresql
# If not running, start it:
# Windows: Services app → PostgreSQL → Start
# macOS: brew services start postgresql@15
# Linux: sudo systemctl start postgresql

Error: "database does not exist"
# Recreate the database
psql -U postgres
CREATE DATABASE financial_agent;
GRANT ALL PRIVILEGES ON DATABASE financial_agent TO financial_user;
\q
# Run migrations
alembic upgrade head

Error: "Database URL is empty"
# Check .env file has DATABASE_URL
cat .env | grep DATABASE_URL
# Should see something like:
# DATABASE_URL=postgresql://financial_user:financial_password@localhost:5432/financial_agent

Error: "GOOGLE_API_KEY not configured"
# 1. Get API key from: https://aistudio.google.com/apikey
# 2. Add to .env:
GOOGLE_API_KEY=your_actual_key_here
LLM_PROVIDER=gemini
# 3. Restart server

Error: "Ollama connection failed"
# Check if Ollama is running
ollama list
# Start Ollama if not running
ollama serve
# Update .env to point to correct URL
OLLAMA_BASE_URL=http://localhost:11434
LLM_PROVIDER=ollama

Error: "Model not found"
# List available models
ollama list
# Pull a model
ollama pull qwen2.5:7b
# Set model in .env
OLLAMA_MODEL=qwen2.5:7b

Error: "Port 8000 already in use"
# Use different port
API_PORT=8001 python main.py
# Or find process using port 8000 and kill it
# Windows:
netstat -ano | findstr :8000
taskkill /PID <PID> /F
# macOS/Linux:
lsof -i :8000
kill -9 <PID>

Error: "CORS error" from frontend
# Update .env with correct origins
CORS_ORIGINS=http://localhost:5173,http://localhost:3000,http://localhost:8000
# Restart server

Error: "No module named 'src'"
# Make sure running from project root directory
cd financial_agent_fork
# Verify directory structure
ls -la src/ # Should show src/ folder exists
# Run server from root
python main.py

Error: "Qdrant connection failed"
# Check Qdrant is running
curl http://localhost:6333/health
# If not running, start with Docker
docker run --name qdrant -p 6333:6333 qdrant/qdrant
# Or for Qdrant Cloud, update .env
QDRANT_MODE=cloud
QDRANT_CLOUD_URL=https://your-instance.qdrant.io
QDRANT_CLOUD_API_KEY=your-api-key

Error: "Collection not found"
This is normal for first run. Collections are created automatically when first document is uploaded.
Error: "Timeout connecting to Qdrant"
# Increase timeout in .env
QDRANT_TIMEOUT_SECONDS=300
QDRANT_RETRY_ATTEMPTS=5
# Restart server

Error: "File size too large"
- Default limit: 50MB per file
- For larger files, split into multiple smaller files
- Or adjust FastAPI settings
Error: "Unsupported file type"
- Financial Reports: PNG, JPG, PDF
- Data Files: XLSX, XLS
- PDF: PDF only
Error: "OCR failed" or "Tesseract not found"
# Option 1: Install Tesseract (see Installation Guide above)
# Option 2: Use Google Gemini Vision API instead (recommended)
# Set in .env:
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key_here

Error: "PDF extraction failed"
- Try with a different PDF file
- Ensure PDF is not password-protected
- Scanned PDFs may need OCR (slower)
Error: "Excel file cannot be read"
- Verify file is not corrupted
- Save file in .xlsx format (not .xls)
- Check file has proper Excel structure
- Remove unusual blank rows/columns
API is slow to respond
# 1. Check if it's LLM latency
# - Switch to a faster model (qwen3:4b)
# - Or use Gemini instead
# 2. Check if it's database query
# - Add database indexes
# - Check database server is running properly
# 3. Check RAM usage
# - Monitor memory with: Task Manager (Windows), Activity Monitor (macOS), htop (Linux)
# - If low on RAM, reduce model size
# 4. Enable debug mode to see timings
DEBUG=True

High memory usage
# Use smaller LLM model
OLLAMA_MODEL=qwen3:4b # Instead of larger models
# Or switch to API-based (cloud) providers
LLM_PROVIDER=gemini

Check logs for detailed error messages:
# Windows: Logs are printed in terminal
# Look for error messages starting with [ERROR]
# Enable verbose logging
DEBUG=True

Test individual components:
# Test VnStock API
python -c "from vnstock3 import Vnstock; v = Vnstock(); print(v.listing_companies())"
# Test PostgreSQL
python -c "from src.database.database import SessionLocal; db = SessionLocal(); print('✓ Database connected')"
# Test Qdrant
python -c "from qdrant_client import QdrantClient; c = QdrantClient(':memory:'); print('✓ Qdrant OK')"
# Test LLM
python -c "from src.llm.llm_factory import LLMFactory; llm = LLMFactory.get_llm(); print(llm.invoke('Hello'))"The financial agent has access to these tools for stock market analysis:
Get company overview and profile information
- Input: ticker (e.g., VNM, VCB, HPG)
- Output: Company name, industry, charter capital, history
Retrieve major shareholders information
- Input: ticker
- Output: Top 10 shareholders with ownership percentages
Get company leadership and management team
- Input: ticker
- Output: Executives, positions, shareholding percentage
Find subsidiary and affiliated companies
- Input: ticker
- Output: List of subsidiaries with ownership percentage
Get company events and announcements
- Input: ticker
- Output: Recent corporate events (dividends, AGM, capital increases)
Retrieve historical price data (OHLCV)
- Input: ticker, start_date, end_date or period (3M, 6M, 1Y)
- Output: Detailed OHLCV table with statistics
- Example: get_historical_data("VNM", period="3M")
Calculate Simple Moving Average
- Input: ticker, window (default: 20)
- Output: SMA values with trend analysis
- Example: calculate_sma("VNM", window=20)
Calculate Relative Strength Index
- Input: ticker, window (default: 14)
- Output: RSI values with overbought/oversold signals
- Example: calculate_rsi("HPG", window=14)
Simply ask the agent questions, and it will automatically use the appropriate tools:
Q: "What is the latest price of VNM?"
→ Uses get_historical_data
Q: "Who are the major shareholders of VCB?"
→ Uses get_shareholders
Q: "Calculate SMA-20 for HPG"
→ Uses calculate_sma with window=20
Q: "Is FPT stock overbought right now?"
→ Uses calculate_rsi to check signal
- Backend: FastAPI (REST API)
- Agent Framework: LangChain + LangGraph (ReAct Pattern)
- LLM Providers:
- ☁️ Google Gemini (Cloud) - AI analysis & OCR
- 🖥️ Ollama (Local) - for chat & analysis
- Data Source: VnStock3 API (Free)
- Vector Database: Qdrant (RAG)
- Relational Database: PostgreSQL
- Technical Analysis: TA-Lib
- Document Processing:
- pytesseract + OpenCV (OCR for scanned documents)
- pdfplumber (PDF text extraction)
- pdf2image (PDF to image conversion)
- Excel Processing: openpyxl + pandas
- Frontend: React + Vite + TailwindCSS
- Desktop App: Electron
financial_agent/
├── src/
│ ├── agent/ # LangGraph Agent
│ │ ├── financial_agent.py
│ │ ├── state.py
│ │ └── prompts/
│ │ ├── system_prompt.txt
│ │ ├── financial_report_prompt.txt
│ │ └── excel_analysis_prompt.txt
│ ├── tools/ # 8+ Analysis Tools
│ │ ├── vnstock_tools.py # Company & stock data
│ │ ├── technical_tools.py # SMA, RSI indicators
│ │ ├── financial_report_tools.py # OCR + Gemini analysis
│ │ ├── pdf_tools.py # PDF processing
│ │ └── excel_tools.py # Excel analysis
│ ├── llm/ # LLM Factory
│ │ ├── llm_factory.py
│ │ └── config.py
│ ├── database/ # Database Models
│ │ ├── database.py
│ │ └── models.py
│ ├── api/ # REST API Endpoints
│ │ └── app.py # Main FastAPI application
│ ├── services/ # Business Logic
│ │ ├── message_generation_service.py
│ │ ├── document_service.py
│ │ ├── admin_service.py
│ │ └── rag_service.py
│ ├── core/ # Configuration & Workflow
│ │ ├── config.py
│ │ ├── langgraph_workflow.py # Main workflow
│ │ └── tool_selector.py
│ └── utils/ # Utilities
│ ├── validators.py
│ └── helpers.py
├── migrations/ # Alembic DB migrations
├── frontend/ # React Frontend
│ ├── src/
│ │ ├── components/
│ │ ├── pages/
│ │ └── services/
│ └── package.json
├── desktop_app/ # Electron Desktop App
│ ├── main.js
│ ├── preload.js
│ └── package.json
├── requirements.txt # Python dependencies
├── alembic.ini # Database migration config
├── main.py # Application entry point
└── README.md # This file
User Input
↓
FastAPI Endpoint
↓
LangGraph Agent
├→ Tool Router
│ ├→ VnStock Tools (Stock data)
│ ├→ Technical Tools (SMA, RSI)
│ ├→ Financial Report Tools (OCR + AI)
│ ├→ PDF Tools (Document parsing)
│ └→ Excel Tools (Data analysis)
├→ LLM Provider
│ ├→ Google Gemini (Cloud)
│ └→ Ollama (Local)
└→ Qdrant Vector DB (RAG retrieval)
↓
Markdown Response
↓
Frontend Display
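A highly simplified sketch of how this flow maps onto a FastAPI endpoint (illustrative only; the real implementation lives in src/api/app.py and src/core/langgraph_workflow.py, and run_agent here is a hypothetical placeholder for the LangGraph workflow):

import time
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    question: str
    use_rag: bool = True

class ChatResponse(BaseModel):
    answer: str
    processing_time_seconds: float

async def run_agent(question: str, use_rag: bool) -> str:
    # Hypothetical placeholder: the real workflow routes the question to
    # VnStock/technical/document tools, optionally retrieves context from
    # Qdrant, and lets the LLM compose a Markdown answer.
    return f"(agent answer for: {question})"

@app.post("/api/chat", response_model=ChatResponse)
async def chat(req: ChatRequest) -> ChatResponse:
    start = time.perf_counter()
    answer = await run_agent(req.question, req.use_rag)
    return ChatResponse(answer=answer, processing_time_seconds=time.perf_counter() - start)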
-- Users table
CREATE TABLE users (
    id UUID PRIMARY KEY,
    username VARCHAR UNIQUE,
    email VARCHAR UNIQUE,
    hashed_password VARCHAR,
    is_admin BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP
);

-- Chat sessions
CREATE TABLE chat_sessions (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    title VARCHAR,
    use_rag BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP
);

-- Chat messages
CREATE TABLE chat_messages (
    id UUID PRIMARY KEY,
    session_id UUID REFERENCES chat_sessions(id),
    role VARCHAR,
    content TEXT,
    created_at TIMESTAMP
);

-- Document uploads
CREATE TABLE document_uploads (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    file_name VARCHAR,
    file_type VARCHAR,
    file_size INTEGER,
    created_at TIMESTAMP
);

-- Audit logs
CREATE TABLE audit_logs (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    action VARCHAR,
    timestamp TIMESTAMP
);

PostgreSQL ↔ FastAPI
- SQLAlchemy ORM for data modeling
- Alembic for schema migrations
- Connection pooling for performance
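A minimal sketch of what this integration typically looks like (illustrative; the project's actual setup lives in src/database/database.py):

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, declarative_base

DATABASE_URL = "postgresql://financial_user:financial_password@localhost:5432/financial_agent"

# pool_size / max_overflow control the connection pool mentioned above
engine = create_engine(DATABASE_URL, pool_size=5, max_overflow=10, pool_pre_ping=True)
SessionLocal = sessionmaker(bind=engine, autoflush=False, autocommit=False)
Base = declarative_base()

def get_db():
    """FastAPI dependency that yields a pooled database session."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()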
VnStock API ↔ Tools
- Real-time stock prices
- Historical OHLCV data
- Company fundamentals
- Shareholder information
LLM Providers ↔ Agent
- Gemini: For analysis and OCR
- Ollama: For local chat
- Tool calling and function execution
Qdrant ↔ RAG System
- Vector embeddings storage
- Semantic document retrieval
- Collection management
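A simplified sketch of the retrieval step (assumes qdrant-client and the all-MiniLM embedding model named in the embedding configuration; the collection name here is hypothetical):

from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

client = QdrantClient(url="http://localhost:6333")  # or the cloud URL plus api_key
encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def retrieve(query: str, collection: str = "documents", top_k: int = 20):
    """Embed the query and return the most similar document chunks."""
    vector = encoder.encode(query).tolist()
    hits = client.search(collection_name=collection, query_vector=vector, limit=top_k)
    # Each hit carries a similarity score and the stored payload (chunk text, metadata)
    return [(hit.score, hit.payload) for hit in hits]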
- VnStock: https://vnstocks.com/docs/vnstock
- LangChain: https://python.langchain.com/
- LangGraph: https://langchain-ai.github.io/langgraph/
- FastAPI: https://fastapi.tiangolo.com/
- PostgreSQL: https://www.postgresql.org/docs/
- Qdrant: https://qdrant.tech/documentation/
- TA-Lib Documentation: https://ta-lib.org/
- Investopedia: https://www.investopedia.com/
- Moving Averages: https://investopedia.com/terms/m/movingaverage.asp
- RSI Indicator: https://investopedia.com/terms/r/rsi.asp
- Ollama: https://ollama.com/
- Ollama Models: https://ollama.com/library
- Ollama GitHub: https://github.com/ollama/ollama
- Google Gemini: https://ai.google.dev/
- LangChain: https://www.langchain.com/
- Hugging Face: https://huggingface.co/
- Push code to GitHub
- Connect GitHub repository to Railway
- Set environment variables in Railway dashboard
- Railway automatically detects Python and deploys
- Push frontend code to GitHub
- Connect GitHub to Vercel
- Configure build settings
- Vercel auto-deploys on push
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "main.py"]Build and run:
docker build -t financial-agent .
docker run -p 8000:8000 --env-file .env financial-agent

User: "Tell me everything about VNM stock"
Agent:
1. Uses get_company_info("VNM")
2. Uses get_shareholders("VNM")
3. Uses get_company_events("VNM")
4. Uses get_historical_data("VNM", period="6M")
5. Uses calculate_sma("VNM", window=20)
6. Uses calculate_rsi("VNM")
7. LLM synthesizes all data
8. Returns comprehensive analysis with tables
User: Upload financial report image
Agent:
1. OCR image → Extract text
2. Classify report type (Balance Sheet, Income Statement, etc.)
3. Extract financial tables → Markdown
4. Use Gemini to analyze data
5. Return formatted analysis with insights
User: "I own VNM, VCB, and HPG. How are they doing?"
Agent:
1. Gets latest data for each stock
2. Calculates technical indicators
3. Analyzes trends and momentum
4. Compares to market benchmarks
5. Provides investment insights
- Real-time price updates (WebSocket)
- Financial ratio calculations (P/E, ROE, ROA)
- News scraping and sentiment analysis
- Portfolio tracking and alerts
- Mobile app (React Native)
- Advanced charting and visualization
- Machine learning price predictions
- Multi-language support
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request
# Create virtual environment
python -m venv venv_dev
source venv_dev/bin/activate
# Install dev dependencies
pip install -r requirements.txt
pip install pytest pytest-asyncio black flake8
# Run tests
pytest
# Format code
black src/
# Lint code
flake8 src/

MIT License
Financial Agent - AI Stock Market Assistant for Vietnam
Built with ❤️ using modern AI and financial technologies
Maintained by: Cleans3
Project Status: Active Development
Last Updated: January 2025
Special thanks to:
- VnStock team for the amazing free API
- LangChain team for the powerful framework
- Ollama team for local LLM support
- Google for Gemini API
- Open-source community
Happy Trading! 📈🚀
If you find this project helpful, please ⭐ star it on GitHub!