A declarative, composable framework for building transparent LLM-powered systems through dataflow abstractions.
SAGE is a high-performance streaming framework for building AI-powered data processing pipelines. Transform complex LLM reasoning workflows into transparent, scalable, and maintainable systems through declarative dataflow abstractions.
Production-Ready: Built for enterprise-scale applications with distributed processing, fault tolerance, and comprehensive monitoring out of the box.
Developer Experience: Write complex AI pipelines in just a few lines of code with intuitive declarative APIs that eliminate boilerplate.
Performance: Optimized for high-throughput streaming workloads with intelligent memory management and parallel execution capabilities.
Transparency: Built-in observability and debugging tools provide complete visibility into execution paths and performance characteristics.
Transform rigid LLM applications into flexible, observable workflows. Traditional imperative approaches create brittle systems:
```python
# Traditional approach - rigid and hard to modify
def traditional_rag(query):
    docs = retriever.retrieve(query)
    if len(docs) < 3:
        docs = fallback_retriever.retrieve(query)
    prompt = build_prompt(query, docs)
    response = llm.generate(prompt)
    return response
```

SAGE transforms this into a declarative, composable workflow:
```python
from sage.core.api.local_environment import LocalEnvironment
from sage.libs.io_utils.source import FileSource
from sage.libs.rag.retriever import DenseRetriever
from sage.libs.rag.promptor import QAPromptor
from sage.libs.rag.generator import OpenAIGenerator
from sage.libs.io_utils.sink import TerminalSink

# Create execution environment
env = LocalEnvironment("rag_pipeline")

# Build declarative pipeline
(env
    .from_source(FileSource, {"file_path": "questions.txt"})
    .map(DenseRetriever, {"model": "sentence-transformers/all-MiniLM-L6-v2"})
    .map(QAPromptor, {"template": "Answer based on context: {context}\nQ: {query}\nA:"})
    .map(OpenAIGenerator, {"model": "gpt-3.5-turbo"})
    .sink(TerminalSink)
)

# Execute pipeline
env.submit()
```

Run a simple example to get started:
```bash
# Clone the repository
git clone https://github.com/intellistream/SAGE.git
cd SAGE

# Install with quickstart (recommended)
./quickstart.sh --dev --yes

# Run hello world example
python examples/tutorials/hello_world.py

# Check system status
sage doctor
```

Flexibility: Modify pipeline structure without touching execution logic. Swap components, add monitoring, or change deployment targets effortlessly.
Transparency: See exactly what's happening at each step with built-in observability and debugging tools.
Performance: Automatic optimization, parallelization, and resource management based on dataflow analysis.
Reliability: Built-in fault tolerance, checkpointing, and error recovery mechanisms.
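The flexibility point above can be made concrete with a toy, framework-free sketch (the `Pipeline`, `map`, and `run` names here are illustrative stand-ins, not SAGE's actual API): because a declarative pipeline records its dataflow as plain data before anything executes, swapping a component changes one line of the definition and never touches the execution logic.

```python
# Toy illustration of declarative dataflow (NOT SAGE's engine):
# stages are recorded first, executed later.

class Pipeline:
    def __init__(self, source):
        self.source = source
        self.stages = []

    def map(self, fn):
        self.stages.append(fn)   # record the stage; nothing runs yet
        return self              # enable chaining

    def run(self, sink):
        results = []
        for item in self.source:
            for fn in self.stages:  # apply recorded stages in order
                item = fn(item)
            results.append(sink(item))
        return results

pipe = Pipeline(["hello", "world"]).map(str.upper)
# Swap the sink without changing the pipeline definition:
printed = pipe.run(sink=len)   # -> [5, 5]
```

Separating construction from execution is also what lets a framework inspect the recorded graph for observability and optimization before running it.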
SAGE is built on a layered architecture that provides flexibility, scalability, and maintainability. The architecture consists of five main layers:
- User Layer: Applications built with SAGE (RAG, Agent, Memory, QA systems)
- API Layer: LocalEnvironment and RemoteEnvironment for different execution contexts
- Core Layer: Dispatcher, Job Manager, Service Manager, and Runtime execution engine
- Libraries Layer: RAG pipeline, Agent framework, Memory & Storage, Middleware components
- Infrastructure Layer: Compute backends (Ray, local), data storage, model services, monitoring
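The API-layer separation described above can be sketched without the framework (all class and method names below are hypothetical stand-ins, not SAGE's real classes): one pipeline definition is built once and handed to whichever execution context should run it.

```python
# Toy sketch of "same pipeline, different execution context".

class LocalEnv:
    """Runs every stage in-process."""
    def submit(self, stages, data):
        out = data
        for stage in stages:
            out = [stage(x) for x in out]
        return out

class RemoteEnvStub:
    """Stand-in for a distributed backend; tags results for illustration."""
    def submit(self, stages, data):
        out = data
        for stage in stages:
            out = [stage(x) for x in out]
        return [("remote", x) for x in out]

def build_stages():
    # One pipeline definition, reused across environments
    return [str.strip, str.lower]

data = ["  Hello ", " WORLD "]
local = LocalEnv().submit(build_stages(), data)    # ['hello', 'world']
remote = RemoteEnvStub().submit(build_stages(), data)
```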
SAGE follows a clean separation of concerns with pluggable components that work together seamlessly:
- Core: Stream processing engine with execution environments
- Libraries: Rich operators for AI, I/O, transformations, and utilities
- Kernel: Distributed computing primitives and communication
- Middleware: Service discovery, monitoring, and management
- Common: Shared utilities, configuration, and logging
Built for real-world deployments with enterprise requirements:
- Distributed Execution: Scale across multiple nodes with automatic load balancing
- Fault Tolerance: Comprehensive error handling and recovery mechanisms
- Observability: Detailed metrics, logging, and performance monitoring
- Security: Authentication, authorization, and data encryption support
- Integration: Native connectors for popular databases, message queues, and AI services
We offer an interactive installer and explicit command flags. Developer mode is recommended when contributing.
Clone & Interactive Mode
```bash
git clone https://github.com/intellistream/SAGE.git
cd SAGE
./quickstart.sh   # Opens interactive menu
```

Common Non-Interactive Modes
```bash
# Developer installation
./quickstart.sh --dev --yes

# Minimal core only
./quickstart.sh --minimal --yes

# Standard + vLLM support
./quickstart.sh --standard --vllm --yes

# Use system Python instead of conda
./quickstart.sh --minimal --pip --yes

# View all flags
./quickstart.sh --help
```

Quick PyPI Install
```bash
# Choose your installation mode:
pip install isage[minimal]   # Core functionality
pip install isage[standard]  # Full features
pip install isage[dev]       # Everything + development tools
```

Note: PyPI install may not include all system dependencies; use quickstart.sh for complete environment setup.
Key Installation Features
- 🎯 Interactive menu for first-time users
- 🤖 Optional vLLM integration with `--vllm`
- 🐍 Supports conda or system Python via `--pip`
- ⚡ Three modes: minimal / standard / dev
After installation, configure your API keys and environment settings:
Quick Setup
```bash
# Run the interactive environment setup
python -m sage.tools.cli.main config env setup
```

Manual Setup
```bash
# Copy the environment template
cp .env.template .env

# Edit .env and add your API keys
# Required for most examples:
OPENAI_API_KEY=your_openai_api_key_here
HF_TOKEN=your_huggingface_token_here
```

Environment Variables
- `OPENAI_API_KEY`: Required for GPT models and most LLM examples
- `HF_TOKEN`: Required for Hugging Face model downloads
- `SILICONCLOUD_API_KEY`: For alternative LLM services
- `JINA_API_KEY`: For embedding services
- `ALIBABA_API_KEY`: For DashScope models
- `SAGE_LOG_LEVEL`: Set logging level (DEBUG, INFO, WARNING, ERROR)
- `SAGE_TEST_MODE`: Enable test mode for examples
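A pipeline script can read these variables at startup and fail fast when a required key is missing. Here is a minimal standard-library sketch (the `require_env` helper is illustrative, not part of SAGE; the variable names follow the list above):

```python
import os

def require_env(name):
    """Fail fast with a clear error if a required key is missing."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Required keys would be fetched like:
#   openai_key = require_env("OPENAI_API_KEY")

# Optional settings fall back to sensible defaults
log_level = os.getenv("SAGE_LOG_LEVEL", "INFO")
test_mode = os.getenv("SAGE_TEST_MODE", "") == "1"
```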
Use the unified CLI to manage commercial licenses:
```bash
# View current license status
python -m sage.tools.cli.main license status

# Install or remove a commercial license
python -m sage.tools.cli.main license install <LICENSE-KEY>
python -m sage.tools.cli.main license remove

# Vendor utilities (SAGE team)
python -m sage.tools.cli.main license vendor generate "Customer" --days 365
python -m sage.tools.cli.main license vendor list
python -m sage.tools.cli.main license vendor revoke <LICENSE-KEY>
```

API Key Sources
- Get OpenAI API key: https://platform.openai.com/api-keys
- Get Hugging Face token: https://huggingface.co/settings/tokens
The .env file is automatically ignored by git to keep your keys secure.
RAG Applications: Build production-ready retrieval-augmented generation systems with multi-modal support and advanced reasoning capabilities.
Real-Time Analytics: Process streaming data with AI-powered insights, anomaly detection, and automated decision making.
Data Pipeline Orchestration: Coordinate complex ETL workflows that seamlessly integrate AI components with traditional data processing.
Multi-Modal Processing: Handle text, images, audio, and structured data in unified pipelines with consistent APIs. 🆕 Advanced multimodal fusion enables intelligent combination of different data modalities for enhanced AI understanding and generation.
Distributed AI Inference: Scale AI model serving across multiple nodes with automatic load balancing and fault tolerance.
For local code quality checks and tests, use `sage dev quality` or `sage dev test`; CI/CD is handled automatically by GitHub Workflows.
- Documentation Site: https://intellistream.github.io/SAGE-Pub/
- Examples: examples/ (tutorials, rag, service, memory, etc.)
- Configurations: examples/config/ sample pipeline configs
- Quick Reference: docs/QUICK_REFERENCE.md
- Contribution Guide: CONTRIBUTING.md
- Changelog (planned): add a `CHANGELOG.md` (see suggestions below)
We welcome contributions! Please review the updated guidelines before opening a Pull Request.
Essential Links
- 🚀 Quick Reference: docs/QUICK_REFERENCE.md
- 📚 Contribution Guide: CONTRIBUTING.md
- 🐛 Issues & Features: GitHub Issues
- 💬 Discussions: GitHub Discussions
Quick Contributor Flow
```bash
git fetch origin
git checkout main-dev
git pull --ff-only origin main-dev
git checkout -b fix/<short-topic>
./quickstart.sh --dev --yes   # ensure dev deps installed
bash tools/tests/run_examples_tests.sh
pytest -k issues_manager -vv
git add <changed-files>
git commit -m "fix(sage-kernel): correct dispatcher edge case"
git push -u origin fix/<short-topic>
# Open PR: include background / solution / tests / impact
```

See CONTRIBUTING.md for full commit conventions, branch naming, and test matrices.
SAGE provides convenient Make-like commands for common development tasks:
```bash
# View all available commands
make help
# or
./dev.sh help

# Code quality
make lint        # Run code checks
make format      # Format code
make quality     # Full quality check

# Testing
make test        # Run all tests
make test-quick  # Quick tests only
make test-all    # Full test suite with coverage

# Build & Deploy
make build       # Build packages
make clean       # Clean build artifacts
make publish     # Publish to TestPyPI
make version     # Show current version

# Documentation
make docs        # Build documentation
make docs-serve  # Serve docs locally
```

See docs/dev-notes/DEV_COMMANDS.md for the complete command reference and workflows.
Post-Install Diagnostics
```bash
sage doctor   # Runs environment & module checks
python -c "import sage; print(sage.__version__)"
```

Connect with other SAGE developers, get help, and stay updated on the latest developments:
💬 Join SAGE Community - Complete guide to all our communication channels
Quick links:
- WeChat Group: Scan QR codes for instant chat (Chinese/English)
- QQ Group: IntelliStream research group discussion
- Slack: Join our workspace
- GitHub Discussions: Technical Q&A and feature requests
We welcome questions, bug reports, feature requests, and contributions from developers worldwide!
SAGE is licensed under the MIT License.