Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1,740 changes: 1,740 additions & 0 deletions docs/superpowers/plans/2026-04-09-understand-knowledge.md

Large diffs are not rendered by default.

776 changes: 776 additions & 0 deletions docs/superpowers/plans/2026-04-10-understandignore-impl.md

Large diffs are not rendered by default.

335 changes: 335 additions & 0 deletions docs/superpowers/specs/2026-04-09-understand-knowledge-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,335 @@
# /understand-knowledge — Personal Knowledge Base Plugin Design

## Overview

A new `/understand-knowledge` skill within the existing Understand Anything plugin that takes any folder of markdown notes and produces an interactive knowledge graph visualized in the existing dashboard.

Inspired by Andrej Karpathy's LLM Wiki pattern — where an LLM compiles and maintains a structured wiki from raw sources — this plugin goes further by adding typed relationship discovery and interactive graph visualization that tools like Obsidian and Logseq cannot provide.

### Goals

- Accept any markdown-based knowledge base (Obsidian vault, Logseq graph, Dendron workspace, Foam, Karpathy-style LLM wiki, Zettelkasten, or plain markdown)
- Auto-detect the format and adapt parsing accordingly
- Use LLM analysis to discover implicit relationships beyond explicit links
- Produce a knowledge graph with typed nodes and edges
- Visualize in the existing dashboard with knowledge-specific layout, sidebar, and reading mode

### Non-Goals

- Real-time sync with the knowledge base tool (Obsidian, Logseq, etc.)
- Replacing the user's existing PKM tool — this is a visualization/analysis layer on top
- Supporting non-markdown formats (PDFs, bookmarks) in v1

---

## Schema Extensions

### New Node Types (5)

Added to the existing `NodeType` union (currently 16 types):

```typescript
export type NodeType =
// existing (16)
| "file" | "function" | "class" | "module" | "concept"
| "config" | "document" | "service" | "table" | "endpoint"
| "pipeline" | "schema" | "resource"
| "domain" | "flow" | "step"
// knowledge (5 new → 21 total)
| "article" | "entity" | "topic" | "claim" | "source";
```

| Type | What it represents | Example |
|------|-------------------|---------|
| `article` | A wiki/note page — the primary content unit | "LLM Knowledge Bases.md" |
| `entity` | A named thing: person, tool, paper, org, project | "Andrej Karpathy", "Obsidian" |
| `topic` | A thematic cluster grouping related articles | "Personal Knowledge Management" |
| `claim` | A specific assertion, insight, or takeaway | "RAG loses context at chunk boundaries" |
| `source` | Raw/reference material that articles are compiled from | A paper URL, a raw PDF reference |

### New Edge Types (6)

Added to the existing `EdgeType` union (currently 29 types):

```typescript
export type EdgeType =
// existing (29)
| ...
// knowledge (6 new → 35 total)
| "cites" | "contradicts" | "builds_on"
| "exemplifies" | "categorized_under" | "authored_by";
```

| Type | Direction | Meaning |
|------|-----------|---------|
| `cites` | article → source | References or draws from |
| `contradicts` | claim → claim | Conflicts or disagrees with |
| `builds_on` | article → article | Extends, refines, or deepens |
| `exemplifies` | entity → concept/topic | Is a concrete example of |
| `categorized_under` | article/entity → topic | Belongs to this theme |
| `authored_by` | article → entity | Written or created by |

### New Metadata Interface

```typescript
export interface KnowledgeMeta {
format?: "obsidian" | "logseq" | "dendron" | "foam" | "karpathy" | "zettelkasten" | "plain";
wikilinks?: string[];
backlinks?: string[];
frontmatter?: Record<string, unknown>;
sourceUrl?: string;
confidence?: number; // 0-1, for LLM-inferred relationships
}
```

Added as an optional field on `GraphNode`:

```typescript
export interface GraphNode {
// ...existing fields
knowledgeMeta?: KnowledgeMeta;
}
```

### Graph-Level Kind Flag

```typescript
export interface KnowledgeGraph {
version: string;
kind: "codebase" | "knowledge"; // NEW
project: ProjectMeta;
nodes: GraphNode[];
edges: GraphEdge[];
layers: Layer[];
tour: TourStep[];
}
```

The `kind` field tells the dashboard which layout, sidebar, and visual styling to use. For backward compatibility, graphs without a `kind` field default to `"codebase"`.

---

## Format Detection & Format Guides

### Auto-Detection Logic

Scans the target directory for signature files/patterns. Priority order (first match wins):

| Priority | Signal | Detected Format |
|----------|--------|----------------|
| 1 | `.obsidian/` directory | Obsidian |
| 2 | `logseq/` + `pages/` directories | Logseq |
| 3 | `.dendron.yml` or `*.schema.yml` | Dendron |
| 4 | `.foam/` or `.vscode/foam.json` | Foam |
| 5 | `raw/` + `wiki/` + `index.md` | Karpathy |
| 6 | `[[wikilinks]]` + unique ID prefixes in filenames | Zettelkasten |
| 7 | Fallback | Plain markdown |

### Format Guides

Located at `skills/understand-knowledge/formats/`. Each guide tells the LLM agents how to parse that format:

```
skills/understand-knowledge/
SKILL.md
formats/
obsidian.md — [[wikilinks]], [[note|alias]], [[note#heading]],
#tags, YAML frontmatter, .obsidian/ config,
dataview annotations, canvas files
logseq.md — block-based outliner, ((block-refs)),
journals/YYYY_MM_DD.md, pages/,
property:: value syntax, TODO/DONE states
dendron.md — dot-delimited hierarchy (a.b.c.md),
.schema.yml for structure validation,
cross-vault links, refactoring rules
foam.md — [[wikilinks]] + link reference definitions
at file bottom, .foam/config, placeholder links
karpathy.md — raw/ → wiki/ pipeline, index.md master map,
log.md append-only record, _meta/ state,
LLM-maintained cross-references
zettelkasten.md — atomic notes, unique ID prefixes (timestamps),
typed semantic links, one idea per note
plain.md — standard [markdown](links), folder hierarchy,
heading structure, no special conventions
```

Each format guide covers:
- How to parse links (wikilinks vs standard vs block refs)
- Where metadata lives (frontmatter vs inline properties vs block properties)
- What the folder structure means (journals/ = daily notes, pages/ = permanent notes)
- What conventions to respect vs what to infer

### Format Guide Authoring Process

Format guides must be research-backed. During implementation, the agent building each format guide must:
1. Read the official documentation for that format (Obsidian Help, Logseq docs, Dendron wiki, Foam docs, etc.)
2. Study real-world examples of that format's structure
3. Write the guide based on verified behavior, not assumptions

---

## Agent Pipeline

```
knowledge-scanner → format-detector → article-analyzer → relationship-builder → graph-reviewer
```

### Agent Definitions

| Agent | Input | Output | Model |
|-------|-------|--------|-------|
| `knowledge-scanner` | Target directory path | File manifest: all `.md` files with paths, sizes, first 20 lines preview | `inherit` |
| `format-detector` | File manifest + directory structure | Detected format + format-specific parsing hints | `inherit` |
| `article-analyzer` | Individual `.md` file + format guide | Per-file nodes (article, entities, claims) + explicit edges (wikilinks, tags) | `inherit` |
| `relationship-builder` | All per-file results | Cross-file implicit edges (builds_on, contradicts, categorized_under) + topic clustering + layers | `inherit` |
| `graph-reviewer` | Assembled graph | Validated graph — deduped entities, consistent edge weights, orphan detection | `inherit` |

### Key Differences from Codebase Pipeline

- **No tree-sitter** — markdown parsing is simpler, mostly regex + LLM interpretation
- **format-detector** replaces framework detection — picks the right format guide
- **article-analyzer** replaces file-analyzer — extracts knowledge concepts instead of code structure
- **relationship-builder** is the heavy LLM step — discovers implicit connections across files that explicit links miss
- **graph-reviewer** stays similar — validates the assembled graph for consistency

### Intermediate Files

Same pattern as codebase analysis:

```
.understand-anything/intermediate/
knowledge-manifest.json — scanner output
format-detection.json — detected format + hints
article-*.json — per-file analysis
relationships.json — cross-file edges
knowledge-graph.json — final assembled graph
```

Intermediate files are cleaned up after graph assembly (same as codebase flow).

### Incremental Mode (`--ingest`)

When the user runs `/understand-knowledge --ingest path/to/new-source.md`:

1. **knowledge-scanner** — runs on just the new file(s)
2. **format-detector** — skipped (format already known from initial scan)
3. **article-analyzer** — processes only new/changed files
4. **relationship-builder** — runs on new nodes against the existing graph, finds connections to what's already there
5. **graph-reviewer** — validates the merged result

Existing nodes are preserved; only new nodes/edges are added or updated.

---

## Dashboard Changes

All changes are scoped to graphs with `"kind": "knowledge"`.

### Vertical Flow Layout

- Default to top-down vertical layout (like existing domain/business flow view)
- Topics at top → articles in middle → entities/claims/sources at bottom
- Reads like a knowledge hierarchy: broad themes flow down into specifics
- User can still switch to horizontal or force-directed layout via controls

### Knowledge Sidebar

Replaces NodeInfo when a knowledge graph is loaded:

| Selection | Sidebar Shows |
|-----------|---------------|
| Nothing selected | ProjectOverview: format detected, total articles/entities/topics/claims/sources |
| Article node | Title, summary, tags, frontmatter metadata, backlinks list (clickable), outgoing links, related topics |
| Entity node | Name, type (person/tool/paper/org), articles that mention it, relationships to other entities |
| Topic node | Description, child articles, child entities, cross-topic connections |
| Claim node | Assertion text, supporting articles, contradicting claims (if any), confidence score |
| Source node | Original URL/path, articles that cite it, ingestion date |

### Reading Mode

- Clicking an article node triggers a reading panel that slides up from the bottom (same pattern as current code viewer overlay)
- Shows the full compiled markdown rendered as HTML
- Includes a mini backlinks sidebar within the panel
- Clicking a `[[wikilink]]` or entity reference in the reading panel navigates the graph to that node

### Node Visual Styling

| Node Type | Shape | Color Accent |
|-----------|-------|-------------|
| `article` | Rounded rectangle | Warm amber |
| `entity` | Circle | Soft blue |
| `topic` | Large rounded rectangle | Muted gold |
| `claim` | Diamond | Green/red depending on contradictions |
| `source` | Small square | Gray |

### Edge Visual Styling

| Edge Type | Style |
|-----------|-------|
| `cites` | Dashed line |
| `contradicts` | Red line |
| `builds_on` | Solid with arrow |
| `categorized_under` | Thin gray |
| `authored_by` | Dotted blue |
| `exemplifies` | Dotted green |

---

## Skill Interface

### Usage

```bash
# Full scan — first time or rescan
/understand-knowledge

# Point at a specific directory
/understand-knowledge path/to/my-notes

# Incremental ingest — add new sources to existing graph
/understand-knowledge --ingest path/to/new-note.md
/understand-knowledge --ingest path/to/new-folder/
```

### Behavior

1. Auto-detects format (Obsidian, Logseq, Karpathy, etc.)
2. Announces: "Detected Obsidian vault with 342 notes. Scanning..."
3. Runs the agent pipeline (scanner → detector → analyzer → relationship-builder → reviewer)
4. Writes `knowledge-graph.json` to `.understand-anything/` with `"kind": "knowledge"`
5. Auto-triggers `/understand-dashboard` after completion

### File Structure

```
skills/understand-knowledge/
SKILL.md — skill entry point, orchestration logic
formats/
obsidian.md
logseq.md
dendron.md
foam.md
karpathy.md
zettelkasten.md
plain.md
```

### Coexistence with `/understand`

- `/understand` produces `"kind": "codebase"` graphs
- `/understand-knowledge` produces `"kind": "knowledge"` graphs
- Both write to `.understand-anything/knowledge-graph.json`
- Running one replaces the other
- To scope knowledge analysis to a subdirectory (e.g., `docs/` within a code repo), use `/understand-knowledge path/to/docs`

---

## What This Enables That Nothing Else Does

| Existing Tools | Limitation | Our Advantage |
|---------------|-----------|---------------|
| Obsidian graph view | Untyped edges — all links look the same | Typed edges: cites, contradicts, builds_on |
| Logseq graph | Only shows explicit links | LLM discovers implicit relationships |
| All PKM tools | Single-format only | Cross-format support with auto-detection |
| Karpathy LLM Wiki | Flat text wiki, no visualization | Interactive graph dashboard with guided tours |
| None | No knowledge graph tours | Tour mode walks through a knowledge base step by step |
Loading
Loading