Crypto Trading & Arbitrage Bot (Work In Progress)

Advanced automated trading and arbitrage bot with statistical arbitrage strategies and NLP- and LLM-powered sentiment analysis.

Overview

This crypto trading bot combines statistical arbitrage with NLP- and LLM-powered sentiment analysis to identify and execute trading opportunities. It supports multiple exchanges, real-time data collection, advanced risk management, and cloud-ready deployment.

Key Capabilities

Multi-Exchange Support: Binance, Kraken, and more
Statistical Arbitrage: Cointegration analysis, Z-score signals, mean reversion
Risk Management: Advanced position sizing, stop-loss, take-profit
Real-time Monitoring: Web dashboard, Prometheus metrics, Grafana
Backtesting: Historical strategy validation and optimization

Features

Trading Strategies

Statistical Arbitrage: Cointegration-based pair trading with OLS hedge ratios
Sentiment Analysis: Modular multi-model sentiment pipeline blending VADER, FinBERT, and optional batched LLM refinement with caching, confidence scoring, and entity-aware asset mapping with configurable aliases
Signal Combination: Multi-strategy signal fusion with consensus, weighted, and hybrid methods
Portfolio Rebalancing: Dynamic position management and correlation analysis
Mean Reversion: RSI, Bollinger Bands, MACD indicators
Enhanced ML: Machine learning signal filtering and optimization
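
As a rough illustration of the pair-trading idea listed above, the sketch below fits an OLS hedge ratio between two price series, computes the Z-score of the latest spread, and maps it to a signal. It is a minimal standard-library sketch, not the code in src/strategy/stat_arb.py: the function names, the 50-bar lookback, and the signal labels are assumptions chosen for illustration, and it does not test for cointegration, which would need a proper statistical test.

```python
from statistics import mean, stdev


def hedge_ratio_ols(y, x):
    """Least-squares fit of y = beta * x + alpha; returns (beta, alpha)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x is constant, so the hedge ratio is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return beta, y_bar - beta * x_bar


def spread_zscore(y, x, lookback=50):
    """Z-score of the latest spread value over the last `lookback` observations."""
    y_win, x_win = y[-lookback:], x[-lookback:]
    beta, alpha = hedge_ratio_ols(y_win, x_win)
    spread = [yi - (beta * xi + alpha) for yi, xi in zip(y_win, x_win)]
    sigma = stdev(spread)
    if sigma == 0:
        return 0.0
    return (spread[-1] - mean(spread)) / sigma


def pair_signal(z, threshold=1.0):
    """Mean-reversion rule: fade the spread when it is stretched (hypothetical labels)."""
    if z > threshold:
        return "short_spread"  # sell y, buy beta units of x
    if z < -threshold:
        return "long_spread"   # buy y, sell beta units of x
    return "flat"
```

A caller would pass two aligned lists of closing prices, for example pair_signal(spread_zscore(eth_closes, btc_closes)), where both list names are placeholders.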

Data Collection

Real-time Market Data: Price, volume, order book from multiple exchanges
Sentiment Sources: Reddit, news APIs, social media, CryptoPanic
Historical Data: Backtesting and strategy validation with customizable timeframes
Multi-Asset Support: ETH, SOL, BTC, and 20+ cryptocurrencies
WebSocket Streaming: High-frequency data collection for real-time analysis

Risk Management

Position Sizing: Dynamic allocation based on volatility and correlation
Stop-Loss/Take-Profit: Automated risk controls with trailing stops
Portfolio Limits: Maximum exposure and drawdown controls
Correlation Analysis: Diversification and risk mitigation
Volatility Monitoring: Real-time risk assessment and circuit breakers
Advanced Risk Manager: Comprehensive risk event tracking and database logging
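
The sketch below shows one simple way the position sizing and trailing-stop controls above could work. It uses fixed-fractional sizing (not the volatility- and correlation-based allocation the bot describes) and assumes max_position_size is a notional amount in quote currency; the function and class names are hypothetical and are not taken from src/execution/risk_manager.py.

```python
from dataclasses import dataclass


def position_notional(equity, risk_per_trade=0.10, stop_loss_pct=0.08,
                      max_position_size=10_000):
    """Notional sized so that hitting the stop loses about equity * risk_per_trade."""
    if stop_loss_pct <= 0:
        raise ValueError("stop_loss_pct must be positive")
    return min(equity * risk_per_trade / stop_loss_pct, max_position_size)


@dataclass
class TrailingStop:
    """Trailing stop for a long position (assumed semantics)."""
    entry_price: float
    trail_pct: float = 0.08
    high_water: float = 0.0

    def __post_init__(self):
        # The stop initially trails the entry price.
        self.high_water = self.entry_price

    def update(self, price):
        """Record the latest price; return True once price falls trail_pct below the high."""
        self.high_water = max(self.high_water, price)
        return price <= self.high_water * (1 - self.trail_pct)
```

With these defaults, a 5,000 account would be sized at about 6,250 notional, while larger accounts hit the 10,000 cap.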

Technical Features

Async Architecture: High-performance concurrent processing with asyncio
Modular Design: Pluggable strategies and components
Comprehensive Testing: 90%+ test coverage across all modules
Production Ready: Monitoring, logging, error handling, health checks
Scalable: Horizontal and vertical scaling support
Database Integration: SQLite, PostgreSQL, Redis caching
Sentiment Engine: Context-aware model blending, Redis/in-memory caching, and batched LLM refinement hooks
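
To make the model-blending and caching idea concrete, here is a minimal sketch of a confidence-weighted blend with an in-memory cache. The class name, the (score, confidence) return shape, and the cache key are assumptions for illustration; real scorers such as VADER or FinBERT would be wrapped as callables and are not included here.

```python
from typing import Callable, Dict, List, Tuple

# A scorer maps text to (score in [-1, 1], confidence in [0, 1]).
Scorer = Callable[[str], Tuple[float, float]]


class BlendedSentiment:
    """Confidence-weighted average of several scorers, cached per normalized text."""

    def __init__(self, scorers: List[Scorer]):
        self.scorers = scorers
        self._cache: Dict[str, Tuple[float, float]] = {}

    def score(self, text: str) -> Tuple[float, float]:
        key = text.strip().lower()
        if key in self._cache:
            return self._cache[key]  # skip the models on a cache hit
        results = [scorer(text) for scorer in self.scorers]
        total_conf = sum(conf for _, conf in results)
        if total_conf == 0:
            blended = (0.0, 0.0)  # no model has an opinion
        else:
            score = sum(s * c for s, c in results) / total_conf
            blended = (score, total_conf / len(results))
        self._cache[key] = blended
        return blended
```

Swapping the dictionary for a Redis client would follow the same get-then-set pattern, with an expiry so stale headlines age out.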

Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Data Sources  │    │   Strategy      │    │   Execution     │
│                 │    │   Engine        │    │   Engine        │
│ • Exchanges     │───▶│ • Stat Arbitrage│───▶│ • Order Manager │
│ • News APIs     │    │ • Sentiment     │    │ • Risk Manager  │
│ • Social Media  │    │ • Signal Gen    │    │ • Position Mgr  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Data Store    │    │   Monitoring    │    │   Dashboard     │
│                 │    │                 │    │                 │
│ • PostgreSQL    │    │ • Prometheus    │    │ • Web UI        │
│ • Redis Cache   │    │ • Grafana       │    │ • Real-time     │
│ • CSV Files     │    │ • Health Checks │    │ • Performance   │
└─────────────────┘    └─────────────────┘    └─────────────────┘

Core Modules

src/data_collector.py: Multi-source data ingestion with WebSocket streaming
src/strategy/stat_arb.py: Statistical arbitrage with cointegration analysis
src/strategy/sentiment/: Modular sentiment engine with model blending, caching backends, refinement orchestration, and entity extraction
src/strategy/signal_generator.py: Advanced signal combination and portfolio optimization
src/execution/order_manager.py: Order execution with retry logic and slippage handling
src/execution/risk_manager.py: Comprehensive risk controls and event tracking
src/execution/position_manager.py: Position tracking and PnL calculation
src/backtesting/: Complete backtesting engine with performance analysis
src/ml/: Machine learning signal filtering and optimization
src/utils/monitoring.py: Metrics collection and health checks
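
The order manager is described as retrying failed executions. A common way to do that with asyncio is exponential backoff with jitter, sketched below. The helper name, defaults, and the choice of retryable exceptions are assumptions, not the behavior of src/execution/order_manager.py.

```python
import asyncio
import random


async def with_retries(operation, attempts=3, base_delay=0.5,
                       retry_on=(ConnectionError, TimeoutError)):
    """Await operation(), retrying transient errors with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return await operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** attempt
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
```

When retrying order placement specifically, it is worth pairing this with a client-side order ID so that a request that succeeded but timed out is not submitted twice.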

Quick Start

Prerequisites

Python 3.9+
Docker (optional)
API keys for exchanges and services

1. Installation

# Clone the repository
git clone https://github.com/yourusername/crypto-trading-bot.git
cd crypto-trading-bot

# Setup development environment
./setup_dev.sh

# Or manually:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

2. Configuration

# Copy and edit configuration
cp config/config.yaml config/local.yaml

# Add your API keys to secrets file
nano config/secrets.env

Required API Keys:

# Exchange APIs
BINANCE_API_KEY=your_binance_api_key
BINANCE_SECRET_KEY=your_binance_secret_key
KRAKEN_API_KEY=your_kraken_api_key
KRAKEN_SECRET_KEY=your_kraken_secret_key

# Sentiment Analysis
OPENAI_API_KEY=your_openai_api_key
NEWSAPI_KEY=your_newsapi_key
REDDIT_CLIENT_ID=your_reddit_client_id
REDDIT_CLIENT_SECRET=your_reddit_client_secret
CRYPTOPANIC_API_KEY=your_cryptopanic_api_key

3. Collect Historical Data

# Collect historical data for backtesting
python get_historical_data.py --symbols ETH,BTC,SOL --exchanges binance,kraken --limit 5000

4. Run the Bot

# Simulation mode (default)
python run_bot.py --simulation

# Live trading mode
python run_bot.py --live

# Test mode
python run_bot.py --test

5. Start Dashboard

# Web dashboard
python dashboard.py

# Access at: http://localhost:5001

6. Run Backtesting

# Run backtest with default parameters
python run_backtest.py --days 180 --capital 100000

# Run with custom parameters
python run_backtest.py --days 90 --capital 50000 --z-threshold 1.5 --sentiment --plots

Configuration

Trading Parameters

# config/config.yaml
strategy:
  statistical_arbitrage:
    enabled: true
    z_score_threshold: 1.0
    cointegration_lookback: 50
    correlation_threshold: 0.3
    spread_model: 'ols'  # or 'kalman', 'rolling_ols'
    slippage: 0.0005
    
  sentiment_analysis:
    enabled: true
    model: "gpt-3.5-turbo"
    confidence_threshold: 0.5
    
  signal_generator:
    combination_method: "consensus"  # consensus, weighted, filter, hybrid
    stat_weight: 0.7
    sentiment_weight: 0.3
    min_confidence: 0.2
    sentiment_assets:
      - BTC
      - ETH
      - SOL
    sentiment_aliases:
      bitcoin: BTC
      ether: ETH
      solana: SOL
    sentiment_normalize_by_source: true

risk:
  max_position_size: 10000
  risk_per_trade: 0.10
  stop_loss_percentage: 0.08
  take_profit_percentage: 0.15
  max_total_exposure: 0.7
  max_daily_drawdown: 0.05
  max_total_drawdown: 0.15

sentiment_assets defines the tickers that receive dedicated sentiment aggregation, while sentiment_aliases maps free-form tokens, hashtags, or project names back to those assets. Set sentiment_normalize_by_source to true to balance scores from news, Reddit, Twitter (not used in testing due to extremely high API costs), and other feeds before the signal generator blends them with statistical signals.
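
The following sketch illustrates how alias mapping and per-source normalization could behave for the settings above. It is an interpretation, not the bot's implementation: the regular expression, the function names, and the "average within each source, then across sources" rule are all assumptions.

```python
import re
from collections import defaultdict
from statistics import mean

ASSETS = {"BTC", "ETH", "SOL"}                                    # sentiment_assets
ALIASES = {"bitcoin": "BTC", "ether": "ETH", "solana": "SOL"}   # sentiment_aliases

# Matches plain words as well as #hashtags and $TICKERS.
TOKEN_RE = re.compile(r"[#$]?([A-Za-z]+)")


def extract_assets(text):
    """Return the set of tracked tickers mentioned in text."""
    found = set()
    for token in TOKEN_RE.findall(text):
        if token.upper() in ASSETS:
            found.add(token.upper())
        elif token.lower() in ALIASES:
            found.add(ALIASES[token.lower()])
    return found


def normalize_by_source(items):
    """items: iterable of (source, score). Each source gets equal weight."""
    by_source = defaultdict(list)
    for source, score in items:
        by_source[source].append(score)
    if not by_source:
        return 0.0
    return mean(mean(scores) for scores in by_source.values())
```

Under this rule, three bullish Reddit posts and one bearish news article net out to neutral rather than letting the higher-volume source dominate.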

Risk Management

Position Sizing: Dynamic allocation based on volatility and correlation
Stop-Loss: 8% default, configurable per strategy
Take-Profit: 15% default, trailing stops available
Portfolio Limits: 70% max exposure, 30% per asset
Drawdown Controls: 5% daily, 15% total maximum
Circuit Breakers: Automatic trading pause on risk events
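
A drawdown-based circuit breaker using the 5% daily and 15% total limits above could look like the sketch below. The class is hypothetical, and it assumes daily drawdown is measured from the equity at the start of the day and total drawdown from the running equity peak.

```python
class DrawdownBreaker:
    """Trips, and stays tripped, when a daily or total drawdown limit is breached."""

    def __init__(self, max_daily_drawdown=0.05, max_total_drawdown=0.15):
        self.max_daily = max_daily_drawdown
        self.max_total = max_total_drawdown
        self.peak_equity = None
        self.day_start_equity = None
        self.tripped = False

    def start_day(self, equity):
        """Reset the daily reference point (call once per trading day)."""
        self.day_start_equity = equity

    def update(self, equity):
        """Feed the latest equity; returns True while trading should be paused."""
        if self.peak_equity is None:
            self.peak_equity = equity
        if self.day_start_equity is None:
            self.day_start_equity = equity
        self.peak_equity = max(self.peak_equity, equity)
        daily_dd = 1 - equity / self.day_start_equity
        total_dd = 1 - equity / self.peak_equity
        if daily_dd >= self.max_daily or total_dd >= self.max_total:
            self.tripped = True
        return self.tripped
```

Keeping the breaker latched until someone resets it is a deliberate choice in this sketch: a quick equity rebound does not silently resume trading.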

Usage

Basic Usage

import asyncio
from src.main import TradingBot

# Initialize bot
bot = TradingBot()

# Start in simulation mode; start() is a coroutine, so run it on an event loop
asyncio.run(bot.start())

Strategy Examples

Statistical Arbitrage

import asyncio
from src.strategy.stat_arb import StatisticalArbitrage

# Initialize strategy
stat_arb = StatisticalArbitrage({
    'z_score_threshold': 1.0,
    'correlation_threshold': 0.3,
    'spread_model': 'ols'
})

# Generate signals; generate_signals() is a coroutine, so run it on an event loop
signals = asyncio.run(stat_arb.generate_signals())

Signal Generation

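from src.strategy.signal_generator import SignalGenerator

# stat_arb is a StatisticalArbitrage instance (see the previous example);
# sentiment, market_data, and sentiment_data are assumed to be created elsewhere.

# Initialize signal generator
signal_gen = SignalGenerator(
    stat_arb=stat_arb,
    sentiment_analyzer=sentiment,
    config={'combination_method': 'weighted'}
)

# Generate combined signals
signals = signal_gen.generate_signals(market_data, sentiment_data)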

The generator now ingests per-asset sentiment payloads (score, confidence, direction, and timestamp) derived from entity-aware keyword extraction. Configure tracked assets and alias maps to ensure hashtags, project names, or $TICKERS in sentiment feeds are routed to the right instruments before blending with statistical arbitrage signals.
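
The "weighted" and "consensus" combination methods could be sketched as below. This is an illustrative reading of the configuration keys (stat_weight, sentiment_weight, min_confidence), not the logic in src/strategy/signal_generator.py; each input is assumed to be a (direction, confidence) pair with direction in {-1, 0, 1}.

```python
def combine_weighted(stat, sentiment, stat_weight=0.7, sentiment_weight=0.3,
                     min_confidence=0.2):
    """Blend two (direction, confidence) signals; returns (direction, confidence)."""
    total = stat_weight + sentiment_weight
    blended = (stat_weight * stat[0] * stat[1]
               + sentiment_weight * sentiment[0] * sentiment[1]) / total
    confidence = abs(blended)
    if confidence < min_confidence:
        return 0, confidence  # too weak to act on
    return (1 if blended > 0 else -1), confidence


def combine_consensus(stat, sentiment, min_confidence=0.2):
    """Act only when both signals agree on direction and both clear min_confidence."""
    agree = stat[0] != 0 and stat[0] == sentiment[0]
    if agree and min(stat[1], sentiment[1]) >= min_confidence:
        return stat[0], min(stat[1], sentiment[1])
    return 0, 0.0
```

With a strong long statistical signal (1, 0.9) and a moderate bearish sentiment (-1, 0.5), the weighted method still goes long at reduced confidence, while the consensus method stays flat.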

Demo Applications

# Statistical arbitrage demo
python examples/stat_arb_demo.py

# Sentiment analysis demo
python examples/sentiment_demo.py

# Order management demo
python examples/order_manager_demo.py

# Risk management demo
python examples/risk_manager_demo.py

# Backtesting demo
python examples/backtest_demo.py

# Enhanced signal generator demo
python examples/signal_generator_enhanced_demo.py

API Documentation

REST Endpoints

GET /health - Health check
GET /metrics - Prometheus metrics
GET /status - Bot status and performance
GET /positions - Current positions
GET /trades - Recent trades
GET /signals - Generated signals
GET /risk/events - Risk events
GET /risk/metrics - Risk metrics

WebSocket Events

market_data - Real-time price updates
signal_generated - New trading signals
position_update - Position changes
trade_executed - Trade confirmations
risk_event - Risk management events

Configuration API

from src.utils.config_loader import config

# Get configuration values
trading_enabled = config.get('TRADING_ENABLED', 'false')
z_threshold = config.get('strategy.statistical_arbitrage.z_score_threshold', 2.0)

# Get exchange config
binance_config = config.get_exchange_config('binance')
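
Two details in the snippet above are worth illustrating: dotted keys such as 'strategy.statistical_arbitrage.z_score_threshold' resolve into nested YAML sections, and flags like TRADING_ENABLED come back as strings, so 'false' is truthy unless parsed. The helpers below are a hypothetical sketch of both behaviors and are not part of src.utils.config_loader.

```python
from collections.abc import Mapping
from typing import Any


def get_dotted(settings: Mapping, key: str, default: Any = None) -> Any:
    """Walk a nested mapping using a dot-separated key; return default if any part is missing."""
    node: Any = settings
    for part in key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return default
        node = node[part]
    return node


def as_bool(value: Any) -> bool:
    """Interpret common string spellings of true; everything else is False."""
    return str(value).strip().lower() in {"1", "true", "yes", "on"}
```

For example, as_bool('false') is False, whereas a bare truthiness check on the string 'false' would wrongly enable trading.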

Testing

Run All Tests

# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html

# Run specific test suites
python -m pytest tests/test_stat_arb.py -v
python -m pytest tests/test_sentiment.py -v
python -m pytest tests/test_risk_manager.py -v
python -m pytest tests/test_order_manager.py -v
python -m pytest tests/test_backtesting.py -v

Test Coverage

Unit Tests: 90%+ coverage across all modules
Integration Tests: End-to-end workflows
Performance Tests: Load and stress testing
Security Tests: API key validation, input sanitization
Backtesting Tests: Strategy validation and optimization

Deployment

Docker Deployment

# Build image
docker build -t crypto-trading-bot .

# Run with Docker Compose
docker-compose -f docker/docker-compose.yml up -d

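# Check status (pass the same compose file used to start the stack)
docker-compose -f docker/docker-compose.yml ps

# View logs
docker-compose -f docker/docker-compose.yml logs -f trading-bot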

Kubernetes Deployment

# Create namespace
kubectl apply -f k8s/namespace.yaml

# Deploy to cluster
kubectl apply -f k8s/

# Check deployment
kubectl get pods -n crypto-trading

# View logs
kubectl logs -f deployment/trading-bot -n crypto-trading

Cloud Deployment

AWS ECS

# Deploy to ECS (a complete command also needs --task-definition and, for replica services, --desired-count)
aws ecs create-service --cluster crypto-trading --service-name trading-bot

Google Cloud Run

# Deploy to Cloud Run
gcloud run deploy crypto-trading-bot --source .

Monitoring

Metrics Dashboard

Access Grafana at http://localhost:3000 (admin/admin)

Key Metrics:

Trading performance (PnL, win rate, Sharpe ratio)
System resources (CPU, memory, network)
API response times and error rates
Portfolio exposure and risk metrics
Signal generation and execution rates
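
For reference, the win rate and Sharpe ratio named above can be computed as in this sketch. The annualization factor of 365 periods assumes daily returns on a market that trades every day; the function names are illustrative and are not taken from the bot's monitoring code.

```python
from math import sqrt
from statistics import mean, stdev


def win_rate(trade_pnls):
    """Fraction of trades with strictly positive PnL."""
    if not trade_pnls:
        return 0.0
    return sum(1 for pnl in trade_pnls if pnl > 0) / len(trade_pnls)


def sharpe_ratio(returns, periods_per_year=365, risk_free_per_period=0.0):
    """Annualized Sharpe ratio from per-period returns (sample standard deviation)."""
    excess = [r - risk_free_per_period for r in returns]
    if len(excess) < 2:
        return 0.0
    sigma = stdev(excess)
    if sigma == 0:
        return 0.0  # undefined for constant returns; reported as 0 here
    return mean(excess) / sigma * sqrt(periods_per_year)
```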

Alerts

High Drawdown: >15% portfolio loss
API Errors: >5% error rate
System Resources: >80% CPU/memory usage
Trading Alerts: Large position changes
Risk Events: Circuit breaker triggers
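
The numeric thresholds above can be expressed as a small rule table, as in the sketch below. In a Prometheus and Grafana setup these limits would more likely live in alerting rules; the metric names and the function here are hypothetical and only illustrate the comparison.

```python
# Thresholds taken from the alert list above, expressed as fractions.
ALERT_RULES = {
    "drawdown": 0.15,
    "api_error_rate": 0.05,
    "cpu_usage": 0.80,
    "memory_usage": 0.80,
}


def triggered_alerts(metrics, rules=ALERT_RULES):
    """Return the sorted names of metrics that exceed their threshold."""
    return sorted(
        name for name, limit in rules.items()
        if metrics.get(name, 0.0) > limit
    )
```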

Logging

Structured Logging: JSON format with correlation IDs
Log Levels: DEBUG, INFO, WARNING, ERROR
Log Rotation: Daily rotation with compression
Centralized Logging: ELK stack integration
Risk Event Logging: Database storage for audit trails
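
Structured JSON logging with correlation IDs can be built on the standard logging module, as in this minimal sketch. The field names and logger name are assumptions; the bot's actual log schema is not documented here.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Supplied by callers through the `extra` argument; None if absent.
            "correlation_id": getattr(record, "correlation_id", None),
        }
        return json.dumps(payload)


def build_logger(name="trading_bot"):
    """Return a logger that writes JSON lines to stderr (handler added once)."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


logger = build_logger()
correlation_id = str(uuid.uuid4())  # reuse the same ID for every log line of one order
logger.info("order submitted", extra={"correlation_id": correlation_id})
```

For daily rotation, logging.handlers.TimedRotatingFileHandler with when="midnight" could replace the stream handler; compression would need an extra step.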

Development

Project Structure

crypto-trading-bot/
├── src/                    # Main source code
│   ├── strategy/          # Trading strategies
│   │   ├── stat_arb.py   # Statistical arbitrage
│   │   ├── sentiment/    # Sentiment engine (models, cache, extractor, analyzer)
│   │   └── signal_generator.py # Signal combination
│   ├── execution/         # Order execution
│   │   ├── order_manager.py # Order management
│   │   ├── risk_manager.py # Risk management
│   │   └── position_manager.py # Position tracking
│   ├── backtesting/       # Backtesting engine
│   ├── ml/               # Machine learning
│   ├── optimization/     # Strategy optimization
│   ├── utils/            # Utilities and helpers
│   └── main.py          # Main application
├── tests/                 # Test suite
├── examples/              # Demo applications
├── config/               # Configuration files
├── docker/               # Docker configurations
├── k8s/                  # Kubernetes manifests
├── data/                 # Data storage
├── logs/                 # Log files
└── scripts/              # Utility scripts

Development Setup

# Setup development environment
./setup_dev.sh

# Install pre-commit hooks
pre-commit install

# Run linting
flake8 src/ tests/
black src/ tests/

# Run type checking
mypy src/

Contributing

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

Performance

Infrastructure & Testing Environment

All live and stress-testing was performed on a dedicated low-latency compute node hosted in Amsterdam, directly connected to Tier-1 exchange gateways.
This setup replicates production-grade trading infrastructure, ensuring realistic benchmarking for market-data ingestion, signal latency, and execution performance.

Configuration

Component   Specification
CPU         Dual Intel Xeon Gold 6140, 36 cores @ 2.3 GHz
RAM         256 GB DDR4 ECC
Storage     2 × 1.92 TB NVMe SSD
Network     1 Gbps uplink / 50 TB traffic
Region      Amsterdam (EU Node), direct peering with Binance EU and Kraken servers

Performance Context

This configuration enables:

Real-time multi-exchange streaming with sub-100 ms latency
Concurrent execution of statistical-arbitrage and sentiment modules
Full-scale backtesting on millions of data points without memory constraints
Parallel ML inference (FinBERT + LLM refinement) with async orchestration and Redis caching

All benchmarks and results presented in this repository were recorded under this configuration.

Benchmarks

Data Processing: 1000+ market data points/second
Signal Generation: <100ms latency
Order Execution: <50ms average
Memory Usage: <2GB typical
CPU Usage: <30% average
Backtesting Speed: 1000x faster than real-time

Scalability

Horizontal Scaling: Multiple bot instances
Vertical Scaling: Resource limits and requests
Database Scaling: Read replicas, sharding
Cache Scaling: Redis cluster, CDN
Load Balancing: Nginx reverse proxy

Security

Security Features

API Key Encryption: Secure storage and rotation
Network Security: TLS/SSL encryption
Input Validation: Sanitized user inputs
Rate Limiting: API abuse prevention
Audit Logging: Complete activity tracking
Risk Event Tracking: Database logging for compliance

Best Practices

Never commit API keys to version control
Use environment variables for secrets
Regularly rotate API keys
Monitor for suspicious activity
Keep dependencies updated
Use non-root containers in production

License

This project is licensed under a custom proprietary license — see the LICENSE file for details.

Disclaimer

This software, along with all associated files, data, and documentation (collectively, the “Software”), is provided strictly for educational and research purposes only. It is not intended to provide financial, investment, trading, or legal advice. Trading or investing in cryptocurrencies or any other financial instruments involves significant risk and can result in the complete loss of capital. You are solely responsible for your own decisions, actions, and results, and you should consult with licensed financial professionals before engaging in any trading activities.

The Software is provided “as is,” “with all faults,” and “as available,” without any express or implied warranties of any kind, including but not limited to warranties of merchantability, fitness for a particular purpose, accuracy, reliability, data integrity, or non‑infringement. No oral or written information or advice given by the author, contributors, or any affiliates shall create a warranty.

To the maximum extent permitted by applicable law, the author, contributors, and affiliated parties disclaim all liability for any direct, indirect, incidental, consequential, punitive, exemplary, or special damages (including but not limited to loss of profits, lost data, business interruption, or loss of goodwill) arising out of or in any way connected with the use, misuse, or inability to use the Software, even if the author or any party has been advised of the possibility of such damages.

By downloading, installing, or using this Software, you acknowledge that you have read and understood this disclaimer, agree to be bound by its terms, and accept full legal responsibility for all outcomes resulting from its use. If you do not agree with these terms, you must not use, copy, modify, or distribute the Software.

Performance

Past performance does not guarantee future results. Any historical or simulated results are provided for informational purposes only and should not be construed as a promise or guarantee of future performance. Always conduct independent backtesting and evaluation of any strategy before using it in live trading.
