Advanced automated trading and arbitrage bot with statistical arbitrage strategies and NLP- and LLM-powered sentiment analysis.
This is a crypto trading bot that combines statistical arbitrage with NLP- and LLM-powered sentiment analysis to identify and execute profitable trading opportunities. The bot supports multiple exchanges, real-time data collection, advanced risk management, and cloud-ready deployment.
- Multi-Exchange Support: Binance, Kraken, and more
- Statistical Arbitrage: Cointegration analysis, Z-score signals, mean reversion
- Risk Management: Advanced position sizing, stop-loss, take-profit
- Real-time Monitoring: Web dashboard, Prometheus metrics, Grafana
- Backtesting: Historical strategy validation and optimization
- Statistical Arbitrage: Cointegration-based pair trading with OLS hedge ratios (see the sketch after this feature list)
- Sentiment Analysis: Modular multi-model sentiment pipeline that blends VADER, FinBERT, and optional batched LLM refinement, with caching, confidence scoring, and entity-aware asset mapping via configurable aliases
- Signal Combination: Multi-strategy signal fusion with consensus, weighted, and hybrid methods
- Portfolio Rebalancing: Dynamic position management and correlation analysis
- Mean Reversion: RSI, Bollinger Bands, MACD indicators
- Enhanced ML: Machine learning signal filtering and optimization
- Real-time Market Data: Price, volume, order book from multiple exchanges
- Sentiment Sources: Reddit, news APIs, social media, CryptoPanic
- Historical Data: Backtesting and strategy validation with customizable timeframes
- Multi-Asset Support: ETH, SOL, BTC, and 20+ cryptocurrencies
- WebSocket Streaming: High-frequency data collection for real-time analysis
- Position Sizing: Dynamic allocation based on volatility and correlation
- Stop-Loss/Take-Profit: Automated risk controls with trailing stops
- Portfolio Limits: Maximum exposure and drawdown controls
- Correlation Analysis: Diversification and risk mitigation
- Volatility Monitoring: Real-time risk assessment and circuit breakers
- Advanced Risk Manager: Comprehensive risk event tracking and database logging
- Async Architecture: High-performance concurrent processing with asyncio
- Modular Design: Pluggable strategies and components
- Comprehensive Testing: 90%+ test coverage across all modules
- Production Ready: Monitoring, logging, error handling, health checks
- Scalable: Horizontal and vertical scaling support
- Database Integration: SQLite, PostgreSQL, Redis caching
- Sentiment Engine: Context-aware model blending, Redis/in-memory caching, and batched LLM refinement hooks
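
To make the statistical-arbitrage feature above concrete, here is a minimal, illustrative sketch of an OLS hedge ratio plus z-score entry rule, assuming a pandas/NumPy stack. The function name and parameters are hypothetical; the project's own implementation lives in src/strategy/stat_arb.py and is more complete.

import numpy as np
import pandas as pd

def zscore_spread_signal(price_a: pd.Series, price_b: pd.Series,
                         lookback: int = 50, z_threshold: float = 1.0) -> int:
    """Illustrative only: OLS hedge ratio + z-score mean-reversion entry."""
    a = price_a.tail(lookback).to_numpy()
    b = price_b.tail(lookback).to_numpy()
    # OLS hedge ratio of asset A on asset B over the lookback window
    hedge_ratio = np.polyfit(b, a, 1)[0]
    # Spread and its z-score over the same window
    spread = a - hedge_ratio * b
    z = (spread[-1] - spread.mean()) / spread.std()
    # Short the spread when it is rich, long when it is cheap
    if z > z_threshold:
        return -1   # short A / long B
    if z < -z_threshold:
        return 1    # long A / short B
    return 0        # no trade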
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Data Sources │ │ Strategy │ │ Execution │
│ │ │ Engine │ │ Engine │
│ • Exchanges │───▶│ • Stat Arbitrage│───▶│ • Order Manager │
│ • News APIs │ │ • Sentiment │ │ • Risk Manager │
│ • Social Media │ │ • Signal Gen │ │ • Position Mgr │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Data Store │ │ Monitoring │ │ Dashboard │
│ │ │ │ │ │
│ • PostgreSQL │ │ • Prometheus │ │ • Web UI │
│ • Redis Cache │ │ • Grafana │ │ • Real-time │
│ • CSV Files │ │ • Health Checks │ │ • Performance │
└─────────────────┘ └─────────────────┘ └─────────────────┘
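
The diagram above can be read as an asyncio pipeline: collectors push market snapshots onto a queue, the strategy engine turns them into signals, and the execution engine consumes those signals. The sketch below is illustrative only; the queue names and coroutines are hypothetical, not the project's actual classes.

import asyncio

async def collect(market_q: asyncio.Queue):
    # Data sources: exchanges, news APIs, social media
    while True:
        snapshot = {"symbol": "ETH/USDT", "price": 3000.0}  # placeholder tick
        await market_q.put(snapshot)
        await asyncio.sleep(1)

async def strategize(market_q: asyncio.Queue, signal_q: asyncio.Queue):
    # Strategy engine: stat arb + sentiment -> signals
    while True:
        snapshot = await market_q.get()
        await signal_q.put({"symbol": snapshot["symbol"], "side": "buy", "confidence": 0.6})

async def execute(signal_q: asyncio.Queue):
    # Execution engine: order, risk, and position management
    while True:
        signal = await signal_q.get()
        print("executing", signal)

async def main():
    market_q, signal_q = asyncio.Queue(), asyncio.Queue()
    await asyncio.gather(collect(market_q), strategize(market_q, signal_q), execute(signal_q))

# asyncio.run(main())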
- src/data_collector.py: Multi-source data ingestion with WebSocket streaming
- src/strategy/stat_arb.py: Statistical arbitrage with cointegration analysis
- src/strategy/sentiment/: Modular sentiment engine with model blending, caching backends, refinement orchestration, and entity extraction
- src/strategy/signal_generator.py: Advanced signal combination and portfolio optimization
- src/execution/order_manager.py: Order execution with retry logic and slippage handling
- src/execution/risk_manager.py: Comprehensive risk controls and event tracking
- src/execution/position_manager.py: Position tracking and PnL calculation
- src/backtesting/: Complete backtesting engine with performance analysis
- src/ml/: Machine learning signal filtering and optimization
- src/utils/monitoring.py: Metrics collection and health checks
- Python 3.9+
- Docker (optional)
- API keys for exchanges and services
# Clone the repository
git clone https://github.com/yourusername/crypto-trading-bot.git
cd crypto-trading-bot
# Setup development environment
./setup_dev.sh
# Or manually:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt

# Copy and edit configuration
cp config/config.yaml config/local.yaml
# Add your API keys to secrets file
nano config/secrets.env

Required API Keys:
# Exchange APIs
BINANCE_API_KEY=your_binance_api_key
BINANCE_SECRET_KEY=your_binance_secret_key
KRAKEN_API_KEY=your_kraken_api_key
KRAKEN_SECRET_KEY=your_kraken_secret_key
# Sentiment Analysis
OPENAI_API_KEY=your_openai_api_key
NEWSAPI_KEY=your_newsapi_key
REDDIT_CLIENT_ID=your_reddit_client_id
REDDIT_CLIENT_SECRET=your_reddit_client_secret
CRYPTOPANIC_API_KEY=your_cryptopanic_api_key

# Collect historical data for backtesting
python get_historical_data.py --symbols ETH,BTC,SOL --exchanges binance,kraken --limit 5000

# Simulation mode (default)
python run_bot.py --simulation
# Live trading mode
python run_bot.py --live
# Test mode
python run_bot.py --test

# Web dashboard
python dashboard.py
# Access at: http://localhost:5001

# Run backtest with default parameters
python run_backtest.py --days 180 --capital 100000
# Run with custom parameters
python run_backtest.py --days 90 --capital 50000 --z-threshold 1.5 --sentiment --plots

# config/config.yaml
strategy:
  statistical_arbitrage:
    enabled: true
    z_score_threshold: 1.0
    cointegration_lookback: 50
    correlation_threshold: 0.3
    spread_model: 'ols'  # or 'kalman', 'rolling_ols'
    slippage: 0.0005
  sentiment_analysis:
    enabled: true
    model: "gpt-3.5-turbo"
    confidence_threshold: 0.5
  signal_generator:
    combination_method: "consensus"  # consensus, weighted, filter, hybrid
    stat_weight: 0.7
    sentiment_weight: 0.3
    min_confidence: 0.2
    sentiment_assets:
      - BTC
      - ETH
      - SOL
    sentiment_aliases:
      bitcoin: BTC
      ether: ETH
      solana: SOL
    sentiment_normalize_by_source: true

risk:
  max_position_size: 10000
  risk_per_trade: 0.10
  stop_loss_percentage: 0.08
  take_profit_percentage: 0.15
  max_total_exposure: 0.7
  max_daily_drawdown: 0.05
  max_total_drawdown: 0.15

sentiment_assets defines the tickers that receive dedicated sentiment aggregation, while
sentiment_aliases maps free-form tokens, hashtags, or project names back to those assets.
Set sentiment_normalize_by_source to balance scores from news, Reddit, Twitter (not used in testing due to extremely high API costs), and other
feeds before the signal generator blends them with statistical signals.
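
As a rough illustration of what sentiment_normalize_by_source and the stat_weight/sentiment_weight settings above imply, here is a sketch with made-up scores; it is not the project's actual signal generator logic.

from statistics import mean

# Hypothetical per-source sentiment scores in [-1, 1] for one asset
scores_by_source = {
    "news":   [0.4, 0.1, 0.3],
    "reddit": [0.9, 0.8, 0.7, 0.95],   # noisier and more numerous
}

# Normalize by source first so a high-volume feed cannot dominate the blend
sentiment_score = mean(mean(scores) for scores in scores_by_source.values())

# Weighted combination with the statistical signal (stat_weight / sentiment_weight above)
stat_signal = -0.5            # e.g. a short signal from the z-score spread
combined = 0.7 * stat_signal + 0.3 * sentiment_score
print(round(combined, 3))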
- Position Sizing: Dynamic allocation based on volatility and correlation (sketched after this list)
- Stop-Loss: 8% default, configurable per strategy
- Take-Profit: 15% default, trailing stops available
- Portfolio Limits: 70% max exposure, 30% per asset
- Drawdown Controls: 5% daily, 15% total maximum
- Circuit Breakers: Automatic trading pause on risk events
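
A simplified sketch of volatility-aware position sizing under the risk settings above. It is illustrative only; the bot's risk manager in src/execution/risk_manager.py applies additional correlation, exposure, and drawdown checks, and the 2-ATR stop distance is an assumption.

def position_size(equity: float, price: float, atr: float,
                  risk_per_trade: float = 0.10, max_position_size: float = 10000) -> float:
    """Risk a fixed fraction of equity per trade, scaled by volatility (ATR)."""
    # Dollar risk budget for this trade
    risk_budget = equity * risk_per_trade
    # Assume the stop sits roughly 2 ATRs away from entry
    stop_distance = 2 * atr
    qty = risk_budget / stop_distance
    # Cap notional exposure at the configured per-position limit
    return min(qty, max_position_size / price)

# Example: $100k equity, ETH at $3,000 with a $90 ATR
print(position_size(100_000, 3_000, 90))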
from src.main import TradingBot
# Initialize bot
bot = TradingBot()
# Start in simulation mode
await bot.start()

from src.strategy.stat_arb import StatisticalArbitrage
# Initialize strategy
stat_arb = StatisticalArbitrage({
'z_score_threshold': 1.0,
'correlation_threshold': 0.3,
'spread_model': 'ols'
})
# Generate signals
signals = await stat_arb.generate_signals()

from src.strategy.signal_generator import SignalGenerator
# Initialize signal generator
signal_gen = SignalGenerator(
stat_arb=stat_arb,
sentiment_analyzer=sentiment,
config={'combination_method': 'weighted'}
)
# Generate combined signals
signals = signal_gen.generate_signals(market_data, sentiment_data)

The generator now ingests per-asset sentiment payloads (score, confidence, direction, and
timestamp) derived from entity-aware keyword extraction. Configure tracked assets and alias
maps to ensure hashtags, project names, or $TICKERS in sentiment feeds are routed to the
right instruments before blending with statistical arbitrage signals.
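
A minimal sketch of this alias-driven routing, assuming sentiment items arrive as plain text. The helper below is hypothetical; the entity extractor in src/strategy/sentiment/ is more involved, handling confidence scoring and richer tokenization.

ALIASES = {"bitcoin": "BTC", "btc": "BTC", "$btc": "BTC",
           "ether": "ETH", "ethereum": "ETH", "$eth": "ETH",
           "solana": "SOL", "$sol": "SOL"}
TRACKED = {"BTC", "ETH", "SOL"}

def assets_mentioned(text: str) -> set[str]:
    """Map free-form tokens, hashtags, and $TICKERS back to tracked assets."""
    tokens = (t.strip("#.,!?").lower() for t in text.split())
    return {ALIASES[t] for t in tokens if t in ALIASES} & TRACKED

print(assets_mentioned("Ether looks strong vs #bitcoin today, $SOL lagging"))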
# Statistical arbitrage demo
python examples/stat_arb_demo.py
# Sentiment analysis demo
python examples/sentiment_demo.py
# Order management demo
python examples/order_manager_demo.py
# Risk management demo
python examples/risk_manager_demo.py
# Backtesting demo
python examples/backtest_demo.py
# Enhanced signal generator demo
python examples/signal_generator_enhanced_demo.py

- GET /health - Health check
- GET /metrics - Prometheus metrics
- GET /status - Bot status and performance
- GET /positions - Current positions
- GET /trades - Recent trades
- GET /signals - Generated signals
- GET /risk/events - Risk events
- GET /risk/metrics - Risk metrics
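
For example, assuming the API is served on port 5001 alongside the dashboard (an assumption; the exact response schemas are not documented here), the endpoints can be polled with requests:

import requests

BASE = "http://localhost:5001"

health = requests.get(f"{BASE}/health", timeout=5).json()
positions = requests.get(f"{BASE}/positions", timeout=5).json()

print(health)
print(positions)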
- market_data - Real-time price updates
- signal_generated - New trading signals
- position_update - Position changes
- trade_executed - Trade confirmations
- risk_event - Risk management events
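
A hedged sketch of consuming these events with the websockets library, assuming the bot publishes JSON messages with an "event" field on a local WebSocket endpoint; the actual transport, URL, and message schema may differ.

import asyncio, json
import websockets

async def listen(url: str = "ws://localhost:5001/ws"):
    async with websockets.connect(url) as ws:
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("event") == "trade_executed":
                print("fill:", msg)
            elif msg.get("event") == "risk_event":
                print("risk alert:", msg)

# asyncio.run(listen())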
from src.utils.config_loader import config
# Get configuration values
trading_enabled = config.get('TRADING_ENABLED', 'false')
z_threshold = config.get('strategy.statistical_arbitrage.z_score_threshold', 2.0)
# Get exchange config
binance_config = config.get_exchange_config('binance')

# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html
# Run specific test suites
python -m pytest tests/test_stat_arb.py -v
python -m pytest tests/test_sentiment.py -v
python -m pytest tests/test_risk_manager.py -v
python -m pytest tests/test_order_manager.py -v
python -m pytest tests/test_backtesting.py -v

- Unit Tests: 90%+ coverage across all modules
- Integration Tests: End-to-end workflows
- Performance Tests: Load and stress testing
- Security Tests: API key validation, input sanitization
- Backtesting Tests: Strategy validation and optimization
# Build image
docker build -t crypto-trading-bot .
# Run with Docker Compose
docker-compose -f docker/docker-compose.yml up -d
# Check status
docker-compose ps
# View logs
docker-compose logs -f trading-bot

# Create namespace
kubectl apply -f k8s/namespace.yaml
# Deploy to cluster
kubectl apply -f k8s/
# Check deployment
kubectl get pods -n crypto-trading
# View logs
kubectl logs -f deployment/trading-bot -n crypto-trading

# Deploy to ECS
aws ecs create-service --cluster crypto-trading --service-name trading-bot

# Deploy to Cloud Run
gcloud run deploy crypto-trading-bot --source .

Access Grafana at http://localhost:3000 (admin/admin)
Key Metrics:
- Trading performance (PnL, win rate, Sharpe ratio)
- System resources (CPU, memory, network)
- API response times and error rates
- Portfolio exposure and risk metrics
- Signal generation and execution rates
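
These metrics can be exported with prometheus_client; the sketch below is illustrative, and the metric names are hypothetical rather than the exact series scraped by the bundled Grafana dashboards.

from prometheus_client import Counter, Gauge, start_http_server

portfolio_pnl = Gauge("bot_portfolio_pnl_usd", "Realized plus unrealized PnL in USD")
signals_total = Counter("bot_signals_generated_total", "Trading signals generated")
order_latency = Gauge("bot_order_latency_ms", "Most recent order round-trip latency in ms")

start_http_server(8000)          # expose /metrics for Prometheus to scrape
portfolio_pnl.set(1234.56)       # updated by the position manager
signals_total.inc()              # incremented by the signal generator
order_latency.set(42.0)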
- High Drawdown: >15% portfolio loss
- API Errors: >5% error rate
- System Resources: >80% CPU/memory usage
- Trading Alerts: Large position changes
- Risk Events: Circuit breaker triggers
- Structured Logging: JSON format with correlation IDs
- Log Levels: DEBUG, INFO, WARNING, ERROR
- Log Rotation: Daily rotation with compression
- Centralized Logging: ELK stack integration
- Risk Event Logging: Database storage for audit trails
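
A minimal, stdlib-only sketch of structured JSON logging with a correlation ID; the project's actual logging setup may use a dedicated library instead.

import json, logging, uuid

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Emit one JSON object per log line
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "msg": record.getMessage(),
            "correlation_id": getattr(record, "correlation_id", None),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("trading_bot")
log.addHandler(handler)
log.setLevel(logging.INFO)

# Tag every log line from one trade cycle with the same correlation ID
log.info("order submitted", extra={"correlation_id": str(uuid.uuid4())})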
crypto-trading-bot/
├── src/ # Main source code
│ ├── strategy/ # Trading strategies
│ │ ├── stat_arb.py # Statistical arbitrage
│ │ ├── sentiment/ # Sentiment engine (models, cache, extractor, analyzer)
│ │ └── signal_generator.py # Signal combination
│ ├── execution/ # Order execution
│ │ ├── order_manager.py # Order management
│ │ ├── risk_manager.py # Risk management
│ │ └── position_manager.py # Position tracking
│ ├── backtesting/ # Backtesting engine
│ ├── ml/ # Machine learning
│ ├── optimization/ # Strategy optimization
│ ├── utils/ # Utilities and helpers
│ └── main.py # Main application
├── tests/ # Test suite
├── examples/ # Demo applications
├── config/ # Configuration files
├── docker/ # Docker configurations
├── k8s/ # Kubernetes manifests
├── data/ # Data storage
├── logs/ # Log files
└── scripts/ # Utility scripts
# Setup development environment
./setup_dev.sh
# Install pre-commit hooks
pre-commit install
# Run linting
flake8 src/ tests/
black src/ tests/
# Run type checking
mypy src/

- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
All live trading and stress testing were performed on a dedicated low-latency compute node hosted in Amsterdam, directly connected to Tier-1 exchange gateways.
This setup replicates production-grade trading infrastructure, ensuring realistic benchmarking for market-data ingestion, signal latency, and execution performance.
| Component | Specification |
|---|---|
| CPU | Dual Intel Xeon Gold 6140 — 36 Cores @ 2.3 GHz |
| RAM | 256 GB DDR4 ECC |
| Storage | 2 × 1.92 TB NVMe SSD |
| Network | 1 Gbps uplink / 50 TB traffic |
| Region | Amsterdam (EU Node) — direct peering with Binance EU and Kraken servers |
This configuration enables:
- Real-time multi-exchange streaming with sub-100 ms latency
- Concurrent execution of statistical-arbitrage and sentiment modules
- Full-scale backtesting on millions of data points without memory constraints
- Parallel ML inference (FinBERT + LLM refinement) with async orchestration and Redis caching
All benchmarks and results presented in this repository were recorded under this configuration.
- Data Processing: 1000+ market data points/second
- Signal Generation: <100ms latency
- Order Execution: <50ms average
- Memory Usage: <2GB typical
- CPU Usage: <30% average
- Backtesting Speed: 1000x faster than real-time
- Horizontal Scaling: Multiple bot instances
- Vertical Scaling: Resource limits and requests
- Database Scaling: Read replicas, sharding
- Cache Scaling: Redis cluster, CDN
- Load Balancing: Nginx reverse proxy
- API Key Encryption: Secure storage and rotation
- Network Security: TLS/SSL encryption
- Input Validation: Sanitized user inputs
- Rate Limiting: API abuse prevention
- Audit Logging: Complete activity tracking
- Risk Event Tracking: Database logging for compliance
- Never commit API keys to version control
- Use environment variables for secrets (see the sketch after this list)
- Regularly rotate API keys
- Monitor for suspicious activity
- Keep dependencies updated
- Use non-root containers in production
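
For instance, secrets can be read from the environment, optionally loading config/secrets.env via python-dotenv; whether the bundled config loader does exactly this is an assumption.

import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv("config/secrets.env")   # never commit this file

binance_key = os.environ.get("BINANCE_API_KEY", "")
binance_secret = os.environ.get("BINANCE_SECRET_KEY", "")

# Fail fast rather than falling back to hard-coded credentials
if not binance_key or not binance_secret:
    raise RuntimeError("Missing Binance API credentials")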
This project is licensed under a custom proprietary license — see the LICENSE file for details.
This software, along with all associated files, data, and documentation (collectively, the “Software”), is provided strictly for educational and research purposes only. It is not intended to provide financial, investment, trading, or legal advice. Trading or investing in cryptocurrencies or any other financial instruments involves significant risk and can result in the complete loss of capital. You are solely responsible for your own decisions, actions, and results, and you should consult with licensed financial professionals before engaging in any trading activities.
The Software is provided “as is,” “with all faults,” and “as available,” without any express or implied warranties of any kind, including but not limited to warranties of merchantability, fitness for a particular purpose, accuracy, reliability, data integrity, or non‑infringement. No oral or written information or advice given by the author, contributors, or any affiliates shall create a warranty.
To the maximum extent permitted by applicable law, the author, contributors, and affiliated parties disclaim all liability for any direct, indirect, incidental, consequential, punitive, exemplary, or special damages (including but not limited to loss of profits, lost data, business interruption, or loss of goodwill) arising out of or in any way connected with the use, misuse, or inability to use the Software, even if the author or any party has been advised of the possibility of such damages.
By downloading, installing, or using this Software, you acknowledge that you have read and understood this disclaimer, agree to be bound by its terms, and accept full legal responsibility for all outcomes resulting from its use. If you do not agree with these terms, you must not use, copy, modify, or distribute the Software.
Past performance does not guarantee future results. Any historical or simulated results are provided for informational purposes only and should not be construed as a promise or guarantee of future performance. Always conduct independent backtesting and evaluation of any strategy before using it in live trading.