Table of Contents
- Gist
- Demos
- System Architecture: The Host-OpenClaw-Kernel Pattern
- Full Request Lifecycle
- Setup & Installation
- Cloud Infrastructure & Deployment
Search finds keywords. Sci-Trace finds foundations.
Sci-Trace is an autonomous research assistant that lives in the cloud and is accessible at any moment through Discord or Slack. Beyond general scientific dialogue, given a particular scientific concept, it can trace that concept's intellectual ancestry, recursively navigating the citation graph to surface the foundational papers that a modern work is built on. What would otherwise take hours of manual literature review takes minutes.
The system pairs OpenClaw, an autonomous AI agent, with a persistent Host server, both running on an AWS EC2 instance. OpenClaw handles general research queries directly and, when it detects the intent to trace a concept's lineage, utilizes a specialized tool to trigger the research process. This tool sends a request to the Host server to spawn a LangGraph agent (the Python Kernel) that fetches papers via the Semantic Scholar API, uses LLM reasoning to evaluate methodological significance at each step, and recursively walks the citation graph until it identifies a foundational root. Results and real-time progress are automatically streamed back to the originating Discord or Slack channel.
Key features:
- Natural language interaction via Discord and Slack, with autonomous intent detection
- Recursive citation graph traversal using the Semantic Scholar API
- LLM-powered evaluation of methodological significance at each step (chain-of-thought, parallel batching)
- Outputs a citation DAG image and a narrative lineage summary per trace
- Slash command and natural language entry points; both converge on the same specialized LangGraph agent tool
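The parallel batching mentioned above can be sketched with a semaphore-bounded gather. This is a minimal sketch, not the project's actual code: `evaluate_significance` is a hypothetical stand-in for the real LLM call, and the limit of 5 mirrors the demo cap on parallel API queries.

```python
import asyncio

MAX_PARALLEL = 5  # mirrors the demo cap of 5 parallel API queries

async def evaluate_significance(paper: dict) -> dict:
    """Hypothetical stand-in for the real LLM evaluation call."""
    await asyncio.sleep(0)  # placeholder for network latency
    return {"title": paper["title"], "significant": paper.get("citations", 0) > 100}

async def evaluate_candidates(papers: list[dict]) -> list[dict]:
    """Evaluate all candidate papers, at most MAX_PARALLEL at a time."""
    sem = asyncio.Semaphore(MAX_PARALLEL)

    async def bounded(paper: dict) -> dict:
        async with sem:
            return await evaluate_significance(paper)

    # gather preserves input order, so results line up with candidates
    return await asyncio.gather(*(bounded(p) for p in papers))

if __name__ == "__main__":
    papers = [{"title": f"Paper {i}", "citations": i * 60} for i in range(8)]
    results = asyncio.run(evaluate_candidates(papers))
    print([r["title"] for r in results if r["significant"]])
```

The semaphore keeps throughput high without exceeding the configured concurrency cap, which matters when the evaluation step hits rate-limited LLM and paper APIs.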
Simplified request flow:
User (Discord / Slack)
│
▼
OpenClaw Agent ──── intent analysis ────► direct response
│
│ lineage trace detected
▼
Host Server (Node.js, persistent)
│
│ spawns
▼
Python Kernel (transient)
│
├── Semantic Scholar API (paper fetch)
├── LLM eval (methodological significance, per candidate)
├── LangGraph state machine (recursive graph traversal)
├── Narrative synthesis
└── DAG image rendering
│
▼
Host Server ──► Discord / Slack
For a full technical deep-dive, see the Sci-Trace DeepWiki.
Agentic Discovery
Traces in these demos were capped at 5 levels of depth and 5 parallel API queries at a time. Both limits are configurable.
Autonomous intent analysis and scholarly reasoning via natural language mentions.
demo-agent.mp4
Instantaneous research generation via structured /trace commands.
demo-trace.mp4
Sci-Trace automates the research tracing lifecycle: recursive graph traversal, LLM-powered methodological validation, and high-fidelity visual synthesis.
To ensure stability and responsiveness, Sci-Trace utilizes a decoupled, multi-layered architecture:
- The Body (Host): A persistent Node.js daemon that manages UI abstraction for Discord and Slack, session state, and the orchestration of background research tasks.
- The Persona (OpenClaw): A conversational agent acting as a Senior Research Fellow (formal, scholarly, and witty) that plans and reasons over user requests and triggers research tasks.
- The Brain (Kernel): A transient Python process powered by LangGraph and Pydantic AI. It handles the heavy-duty logic of fetching data from the Semantic Scholar API and reasoning over citation significance.
Three-layer architecture: The Host (Node.js body) routes slash commands directly, OpenClaw (Agent) autonomously interprets natural language and decides whether to trigger traces or respond directly, and the Kernel (Python brain) executes research tasks while querying external LLM and paper APIs.
The following sequence illustrates the autonomous handoff between the persistent chat interfaces and the ephemeral research kernel.
Requests flow through two paths: slash commands route directly to the Host bridge, while natural language messages flow through OpenClaw for intent analysis. Both paths converge at the research kernel, which reports progress via tagged stdout and returns artifacts.
The research kernel operates as a cyclic state machine, allowing it to recursively traverse the citation graph until it identifies a foundational root.
LangGraph state machine: Recursively searches for papers, filters references, evaluates candidates via Pydantic AI for methodological significance, and continues until a foundational root is identified. Finally synthesizes narrative results and generates visual citation graph.
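Independent of LangGraph's own API, the cycle described above (fetch references, filter, evaluate, then recurse or stop) can be sketched as a plain recursive walk. `references_of` and `is_significant` are hypothetical stand-ins for the API fetch and LLM evaluation steps, and the depth cap of 5 mirrors the demo configuration:

```python
MAX_DEPTH = 5  # mirrors the configurable demo cap

def trace_lineage(paper: str, references_of, is_significant, depth: int = 0) -> list[str]:
    """Walk the citation graph depth-first, keeping only significant ancestors.

    A paper with no methodologically significant references is treated
    as a foundational root, which terminates that branch of the trace.
    """
    if depth >= MAX_DEPTH:
        return [paper]
    significant = [r for r in references_of(paper) if is_significant(r)]
    if not significant:
        return [paper]  # foundational root reached
    lineage = [paper]
    for ref in significant:
        lineage.extend(trace_lineage(ref, references_of, is_significant, depth + 1))
    return lineage

# Toy citation graph to exercise the loop:
graph = {"BERT": ["Transformer", "blog post"], "Transformer": ["seq2seq"], "seq2seq": []}
result = trace_lineage("BERT", lambda p: graph.get(p, []), lambda p: p in graph)
# result -> ['BERT', 'Transformer', 'seq2seq']
```

The real Kernel expresses this cycle as LangGraph nodes with a conditional edge back to the search step, which is what lets it also synthesize the narrative and render the DAG image from the accumulated state at the end.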
- Node.js 20+ / Python 3.11+
- uv (Python package manager)
- AWS Account (for infrastructure)
Create a .env file in the root directory:
# --- Host (Discord & Slack) ---
DISCORD_TOKEN=...
DISCORD_CLIENT_ID=...
SLACK_BOT_TOKEN=...
SLACK_SIGNING_SECRET=...
# --- Kernel (LLM & Data) ---
OPENROUTER_API_KEY=...
SEMANTIC_SCHOLAR_API_KEY=...
Install dependencies with make install. Once the bot is running (npm start), use the slash command:
/trace topic: "Attention Is All You Need"
Or mention the bot:
@Research Assistant where did BERT come from?
Sci-Trace is designed for high availability and autonomous operation in the cloud. It includes a complete Infrastructure as Code (IaC) suite for automated provisioning on AWS.
Terraform configurations are located in infra/. They provision:
- Provider: AWS
- Instance: t3.medium running Ubuntu 22.04 LTS
- Bootstrap: user_data.sh installs Node.js 20, Python 3.11, uv, and PM2 on first boot
./deploy.sh <EC2_PUBLIC_IP> <PEM_KEY_PATH>
The script uses rsync to synchronize the codebase (excluding local environments) and performs remote setup for both the Kernel and the Host.
The Host daemon is managed by PM2, configured via ecosystem.config.js. Logs are written to host/logs/app.log. The process restarts automatically on crash or server reboot.


