A specialized AI model that combines chain-of-thought reasoning with cross-chain data analysis to understand and predict crypto market dynamics. Built on DeepSeek-R1-Distill-Qwen-14B and enhanced through Group Relative Policy Optimization (GRPO), Cortex-1 aims to reason about market dynamics the way experienced traders do, but at massive scale and with perfect recall of historical patterns spanning from 2018 to the present day.
We believe in the power of open collaboration and are committed to making Cortex-1 fully accessible to the developer community:
- Open Source Dataset: Our hybrid dataset combining historical (2018-2023) and real-time data is publicly available, providing developers with high-quality, labeled examples of crypto market reasoning
- Open Model Weights: Once trained, the complete model weights will be open-sourced for the community
- Transparent Development: All training code, reward functions, and benchmarking tools are open source
- Developer-First: Built as a tool for developers to integrate advanced market reasoning into their applications, with the 14B model size specifically chosen to enable local execution on consumer hardware
Our goal is to create a foundation for the community to build upon, whether you're developing trading strategies, market analysis tools, or educational platforms.
- Chain-of-Thought Reasoning: Detailed step-by-step analysis of market conditions
- Cross-Chain Analysis: Deep understanding of relationships between different blockchain networks
- Quantitative Predictions: Data-driven forecasting with confidence intervals
- Risk Assessment: Comprehensive evaluation of technical, market, and systemic risks
- Opportunity Detection: Identification of market inefficiencies and arbitrage opportunities
- Historical Context: Leverages data from 2018-2023 plus real-time information for comprehensive analysis
**Data Collection Layer**
- Flipside Client: Fetches raw blockchain data (see the sketch after this list)
- Market Conditions: Analyzes and labels market states
- Historical Data: Incorporates data from 2018-2023
- Real-time Data: Integrates current market information
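For illustration, a minimal sketch of the data-fetch step using Flipside's Python SDK; the query and table names below are illustrative, not the pipeline's actual queries:

```python
import os
from flipside import Flipside  # Flipside Crypto's Python SDK

# Authenticate with the key configured in .env (FLIPSIDE_API_KEY)
flipside = Flipside(os.environ["FLIPSIDE_API_KEY"], "https://api-v2.flipsidecrypto.xyz")

# Illustrative query: daily transaction counts on Ethereum over the last 30 days
sql = """
SELECT date_trunc('day', block_timestamp) AS day, count(*) AS tx_count
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= current_date - 30
GROUP BY 1 ORDER BY 1
"""
result = flipside.query(sql)
for row in result.records:
    print(row["day"], row["tx_count"])
```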
**Synthetic Generation Layer**
- DeepSeek R1 Integration: Uses R1's reasoning capabilities to generate high-quality examples
- Quality-Focused: Applies reward functions to verify example quality
- Multi-chain Data: Integrates data from various blockchain sources
- Template-based Prompts: Uses structured prompts to elicit detailed reasoning (example below)
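For example, a template of the kind this layer might use; the structure and field names here are illustrative, not the repository's actual prompts:

```python
# Hypothetical prompt template for eliciting step-by-step market reasoning.
MARKET_REASONING_TEMPLATE = """You are an expert crypto market analyst.

Market snapshot for {chain} ({date}):
- Daily transactions: {tx_count}
- Active addresses: {active_addresses}
- Net flows: {net_flows}

Think step by step:
1. Summarize the current market condition, citing the metrics above.
2. Compare against historical patterns (2018-present).
3. Give a forecast with an explicit confidence interval.
4. List the key risks that could invalidate the forecast.
"""

prompt = MARKET_REASONING_TEMPLATE.format(
    chain="ethereum",
    date="2024-03-01",
    tx_count=1_150_000,
    active_addresses=430_000,
    net_flows="+12,400 ETH",
)
```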
**Reward System**
- Modular Design: Separate reward components for different aspects of quality
- Finance-Specific: Rewards for calculation accuracy, confidence intervals, and investment insights
- Format Quality: Rewards for citation format, structure, and completeness
- Composite Framework: Weighted combination of individual rewards (sketched below)
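A minimal sketch of the composite idea, with stub components standing in for the real modules under `src/rewards/`:

```python
# Each component scores one aspect of an example in [0, 1]; the composite
# reward is their weighted average. The component bodies here are stubs.

def calculation_accuracy(example: str) -> float:
    return 1.0 if "=" in example else 0.0  # stub: real check re-verifies arithmetic

def confidence_interval_quality(example: str) -> float:
    return 1.0 if "%" in example else 0.0  # stub: real check validates stated intervals

def composite_reward(example: str, components: dict, weights: dict) -> float:
    total_weight = sum(weights.values())
    return sum(weights[name] * fn(example) for name, fn in components.items()) / total_weight

score = composite_reward(
    "Volume grew 12% (CI: 8-16%), so projected Q2 volume = 1.12 * 40M.",
    components={"calc": calculation_accuracy, "ci": confidence_interval_quality},
    weights={"calc": 0.6, "ci": 0.4},
)
```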
**Model Training Layer**
- DeepSeek-R1-Distill-Qwen-14B: 14.8B parameters, strong reasoning capabilities
- GRPO (Group Relative Policy Optimization): Optimizes for reward maximization (training sketch below)
- MLX Optimization: Leverages Apple Silicon for efficient training
- Local Training Capability: Can be fine-tuned on consumer hardware
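Unsloth's GRPO implementation patches TRL's `GRPOTrainer`, so a minimal TRL-style sketch conveys the shape of the training loop. The toy dataset and stand-in reward below are illustrative; the repository's actual entry point is `scripts/train_grpo.py`:

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt dataset; the real pipeline trains on examples from data/training/.
dataset = Dataset.from_list([{"prompt": "Analyze ETH daily volume trends."}])

# Stand-in reward: TRL expects a function over sampled completions that
# returns one float per completion; Cortex-1 plugs in its composite reward.
def reward_fn(completions, **kwargs):
    return [min(len(c) / 500.0, 1.0) for c in completions]

trainer = GRPOTrainer(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
    reward_funcs=reward_fn,
    args=GRPOConfig(output_dir="models/grpo-sketch", num_generations=4),
    train_dataset=dataset,
)
trainer.train()
```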
Our data pipeline is designed with a clear separation between the main generation system and the testing components. For detailed information, see Data Pipeline Documentation.
**Main Pipeline:**
- Fetches real market data from Flipside
- Generates detailed reasoning using DeepSeek R1
- Applies validated reward functions for quality verification
- Creates standardized training examples with reasoning traces
**Testing System:**
- Uses mock examples to validate reward functions (see the sketch after this list)
- Provides controlled test cases of varying quality
- Ensures reward functions correctly differentiate quality levels
- Operates independently from the main pipeline
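A minimal illustration of that idea, with a hypothetical stand-in reward:

```python
def mock_reward(example: str) -> float:
    """Stand-in for the real composite reward."""
    score = 0.0
    if "CI:" in example:
        score += 0.5  # rewards an explicit confidence interval
    if "=" in example:
        score += 0.5  # rewards a shown calculation
    return score

high_quality = "Volume rose 12% (CI: 8-16%); projected volume = 1.12 * 40M = 44.8M."
low_quality = "Volume will probably go up."

# The test suite asserts that rewards separate quality tiers.
assert mock_reward(high_quality) > mock_reward(low_quality)
```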
**Market Analysis & Prediction**
- Historical pattern recognition
- Cross-chain correlation analysis
- Transaction volume forecasting
- User behavior analysis
**Protocol Analysis**
- Performance metrics evaluation
- Growth trajectory analysis
- Risk factor assessment
- Optimization recommendations
**Risk Management**
- Technical risk quantification
- Market exposure analysis
- Systemic risk assessment
- Mitigation strategy development
**Opportunity Discovery**
- Arbitrage opportunity detection (illustrated after this list)
- Yield optimization strategies
- Market inefficiency analysis
- Entry/exit point identification
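As a toy illustration of the arbitrage case (all numbers are made up):

```python
# Flag a pair when the cross-venue price gap exceeds round-trip costs.
def arbitrage_edge(price_a: float, price_b: float, fees: float) -> float:
    """Return the net edge (as a fraction) of buying on the cheaper venue."""
    gross = abs(price_a - price_b) / min(price_a, price_b)
    return gross - fees

edge = arbitrage_edge(price_a=3012.50, price_b=3031.80, fees=0.004)
if edge > 0:
    print(f"Opportunity: net edge of {edge:.2%} after fees")
```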
We chose Microsoft's Phi-4 (14B) as our base model for several key reasons:
- Accessibility: 14B parameters can run on consumer hardware (32GB+ RAM, 16GB+ VRAM)
- Reasoning Capabilities: Strong performance on mathematics and logical reasoning tasks
- Context Window: 16K tokens is sufficient for financial analysis scenarios
- Quantization-Friendly: Works efficiently with 4-bit quantization for memory optimization
- Developer-First: Enables more contributors to run and fine-tune locally
This choice reinforces our commitment to creating a truly accessible, open-source model that developers can run on their own hardware.
**Synthetic Data Generation**
- DeepSeek R1 reasoning integration
- Market condition balancing
- Cross-chain correlation scenarios
- Protocol performance analysis cases
- Risk assessment simulations
**Reward Function Components**
- Finance-specific metrics (calculation accuracy, confidence intervals)
- Format quality (structure, completeness)
- Citation quality (metric citations, historical references; sketched after this list)
- Composite reward framework with flexible weighting
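A sketch of what a citation-quality component might look like; the regex patterns are illustrative, not the repository's actual checks:

```python
import re

# Checks that the reasoning cites concrete metrics and references
# historical periods. Patterns are illustrative.
METRIC_CITATION = re.compile(r"\d[\d,.]*\s*(ETH|NEAR|%|transactions|addresses)")
HISTORICAL_REF = re.compile(r"\b(201[8-9]|202[0-9])\b")

def citation_reward(text: str) -> float:
    has_metric = 1.0 if METRIC_CITATION.search(text) else 0.0
    has_history = 1.0 if HISTORICAL_REF.search(text) else 0.0
    return 0.5 * has_metric + 0.5 * has_history

print(citation_reward("Volume fell 8% versus the 2021 peak of 1.4M transactions."))  # 1.0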
**Benchmarking Framework**
- Historical prediction accuracy (see the MAPE sketch after this list)
- Reasoning quality metrics
- Cross-chain correlation accuracy
- Protocol analysis precision
- Real-world performance testing
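For example, volume-forecast accuracy can be scored with mean absolute percentage error (MAPE); the numbers below are made up:

```python
# MAPE between the model's volume forecasts and realized values.
def mape(predicted: list[float], actual: list[float]) -> float:
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)

forecast = [1_200_000, 1_150_000, 1_320_000]  # model's daily tx-volume forecasts
realized = [1_180_000, 1_210_000, 1_290_000]  # observed volumes

print(f"MAPE: {mape(forecast, realized):.2%}")
```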
- Python 3.10+
- CUDA-compatible GPU (16GB+ VRAM recommended)
- 32GB+ RAM for data preprocessing and training
- 100GB disk space for datasets and model weights
- Clone the repository:
  ```bash
  git clone https://github.com/near/cortex-1.git
  cd cortex-1
  ```
- Create a virtual environment:
  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Set up environment variables:
  ```bash
  cp .env.example .env
  # Edit .env with your API keys:
  # - OPENROUTER_API_KEY (for DeepSeek R1 access)
  # - FLIPSIDE_API_KEY (for market data)
  # - WANDB_API_KEY (for experiment tracking)
  # - HUGGINGFACE_TOKEN (for model downloading)
  ```
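The pipeline scripts can then read these keys from the environment, for example with `python-dotenv` (assumed here to be among the dependencies):

```python
import os
from dotenv import load_dotenv  # provided by the python-dotenv package

load_dotenv()  # reads .env from the project root

# Scripts pick up keys like these from the environment:
openrouter_key = os.environ["OPENROUTER_API_KEY"]
flipside_key = os.environ["FLIPSIDE_API_KEY"]
```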
Test the Flipside connection by fetching 30 days of market data:
```bash
python scripts/test_flipside.py --days 30 --chains ethereum near
```
Generate a synthetic reasoning dataset:
```bash
python scripts/generate_synthetic.py \
    --dataset-size medium \
    --chains market \
    --verify-all
```
Validate the reward functions against the mock test cases:
```bash
python scripts/test_rewards.py --verbose
```
Train the model with GRPO:
```bash
python scripts/train_grpo.py \
    --config configs/grpo/financial_reasoning.json \
    --verbose
```
Cortex-1 is designed to run on accessible hardware:
| Component | Minimum | Recommended | Notes |
|-----------|---------|-------------|-------|
| RAM | 32GB | 64GB | Required for data preprocessing |
| GPU | 16GB VRAM | 24GB+ VRAM | A single NVIDIA RTX 3090/4090 or A6000 is sufficient |
| CPU | 8 cores | 16+ cores | For data preparation tasks |
| Storage | 100GB SSD | 250GB+ SSD | For datasets and model weights |
With 4-bit quantization, the Phi-4 model requires ~8GB of VRAM for inference and ~16GB for training with small batch sizes, making it feasible to run on a single consumer GPU.
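For reference, loading a 14B model in 4-bit with Hugging Face Transformers and bitsandbytes looks like this (shown with the public `microsoft/phi-4` weights; swap in your checkpoint as needed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization via bitsandbytes keeps the 14B model within ~8GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-4",
    quantization_config=bnb_config,
    device_map="auto",  # places layers on the available GPU automatically
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")
```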
Our comprehensive benchmarking suite evaluates:
**Prediction Accuracy**
- Transaction volume forecasting
- User growth projections
- Price movement predictions
- Cross-chain correlation accuracy
**Reasoning Quality**
- Chain-of-thought completeness
- Logical consistency
- Data citation accuracy
- Technical analysis depth
**Real-World Performance**
- Strategy backtesting
- Market simulation
- Live prediction tracking
- Cross-chain arbitrage detection
We welcome contributions! Here's how you can help:
**Code Contributions**
- Fork the repository
- Create a feature branch
- Submit a pull request
**Data Contributions**
- Historical market data
- Protocol performance metrics
- Cross-chain correlation data
- Benchmark test cases
**Documentation**
- Technical documentation
- Use case examples
- Benchmark results
- Tutorial creation
**Model Development**
- Fine-tuning improvements
- Synthetic data generation
- Reward function optimization
- Benchmarking scenarios
```
cortex-1/
├── configs/                       # Configuration files
│   ├── grpo/                      # GRPO configurations
│   ├── data_config.yaml           # Data configuration
│   └── model_config.yaml          # Model configuration
├── data/                          # Data directories
│   ├── raw/                       # Raw examples with reasoning
│   ├── training/                  # Processed training examples
│   └── splits/                    # Train/eval splits
├── docs/                          # Documentation
│   ├── DATA_PIPELINE.md           # Data pipeline documentation
│   └── TRAINING_PLAN.md           # Training strategy documentation
├── models/                        # Saved model weights
│   └── phi4_financial_reasoning/  # Trained Phi-4 model
├── scripts/                       # Execution scripts
│   ├── generate_synthetic.py      # Data generation script
│   ├── test_rewards.py            # Reward testing script
│   └── train_grpo.py              # GRPO training script
├── src/                           # Source code
│   ├── data/                      # Data processing modules
│   ├── model/                     # Model-related code
│   └── rewards/                   # Reward function modules
└── README.md                      # This file
```
This project is licensed under the MIT License. See the LICENSE file for details.
- NEAR Foundation for support and guidance
- Unsloth Team for GRPO implementation
- Flipside Crypto for market data access
- OpenRouter for DeepSeek R1 API access
- Microsoft for the Phi-4 model
For detailed documentation, visit our Wiki.
- NEAR Foundation
- Project Documentation
- Microsoft Phi-4
- Unsloth GRPO
- Training Plan
- Data Pipeline
- Contribution Guide
For questions or support, please open an issue or contact the NEAR Foundation team.
This repository now includes an implementation for fine-tuning the DeepSeek-R1-Distill-Llama-8B model on Apple Silicon (M1/M2/M3 Macs) for financial and crypto market analysis, using parameter-efficient fine-tuning techniques optimized for that hardware.
- Fine-tunes DeepSeek-R1-Distill-Llama-8B, a model with strong reasoning capabilities
- Optimized for Apple Silicon using Metal Performance Shaders (MPS)
- Uses Low-Rank Adaptation (LoRA) for memory-efficient training (sketched after this list)
- Includes comprehensive processing of financial analysis datasets
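A PyTorch/PEFT sketch of the LoRA-on-MPS setup; the hyperparameters are illustrative, and the repository's actual implementation lives in `deepseek-mlx/`:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Use Apple's Metal Performance Shaders backend when available.
device = "mps" if torch.backends.mps.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    torch_dtype=torch.float16,
).to(device)

# Low-rank adapters on the attention projections; only these are trained.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a small fraction of the 8B total
```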
To set up the environment and launch training:
```bash
cd deepseek-mlx
./setup_and_train.sh
```
For more details, see the DeepSeek-MLX-README.md.