diff --git a/docs/plans/2026-03-14-phase1-implementation.md b/docs/superpowers/plans/2026-03-14-phase1-implementation.md similarity index 100% rename from docs/plans/2026-03-14-phase1-implementation.md rename to docs/superpowers/plans/2026-03-14-phase1-implementation.md diff --git a/docs/plans/2026-03-14-phase2-implementation.md b/docs/superpowers/plans/2026-03-14-phase2-implementation.md similarity index 100% rename from docs/plans/2026-03-14-phase2-implementation.md rename to docs/superpowers/plans/2026-03-14-phase2-implementation.md diff --git a/docs/plans/2026-03-14-phase3-implementation.md b/docs/superpowers/plans/2026-03-14-phase3-implementation.md similarity index 100% rename from docs/plans/2026-03-14-phase3-implementation.md rename to docs/superpowers/plans/2026-03-14-phase3-implementation.md diff --git a/docs/plans/2026-03-14-phase4-implementation.md b/docs/superpowers/plans/2026-03-14-phase4-implementation.md similarity index 100% rename from docs/plans/2026-03-14-phase4-implementation.md rename to docs/superpowers/plans/2026-03-14-phase4-implementation.md diff --git a/docs/plans/2026-03-15-homepage-implementation.md b/docs/superpowers/plans/2026-03-15-homepage-implementation.md similarity index 100% rename from docs/plans/2026-03-15-homepage-implementation.md rename to docs/superpowers/plans/2026-03-15-homepage-implementation.md diff --git a/docs/plans/2026-03-18-multi-platform-simple-implementation.md b/docs/superpowers/plans/2026-03-18-multi-platform-simple-implementation.md similarity index 100% rename from docs/plans/2026-03-18-multi-platform-simple-implementation.md rename to docs/superpowers/plans/2026-03-18-multi-platform-simple-implementation.md diff --git a/docs/plans/2026-03-21-language-agnostic-plan.md b/docs/superpowers/plans/2026-03-21-language-agnostic-plan.md similarity index 100% rename from docs/plans/2026-03-21-language-agnostic-plan.md rename to docs/superpowers/plans/2026-03-21-language-agnostic-plan.md diff --git 
a/docs/plans/2026-03-25-dashboard-robustness-impl.md b/docs/superpowers/plans/2026-03-25-dashboard-robustness-impl.md similarity index 100% rename from docs/plans/2026-03-25-dashboard-robustness-impl.md rename to docs/superpowers/plans/2026-03-25-dashboard-robustness-impl.md diff --git a/docs/plans/2026-03-25-dashboard-robustness-plan.md b/docs/superpowers/plans/2026-03-25-dashboard-robustness-plan.md similarity index 100% rename from docs/plans/2026-03-25-dashboard-robustness-plan.md rename to docs/superpowers/plans/2026-03-25-dashboard-robustness-plan.md diff --git a/docs/plans/2026-03-26-theme-system-implementation.md b/docs/superpowers/plans/2026-03-26-theme-system-implementation.md similarity index 100% rename from docs/plans/2026-03-26-theme-system-implementation.md rename to docs/superpowers/plans/2026-03-26-theme-system-implementation.md diff --git a/docs/plans/2026-03-27-token-reduction-impl.md b/docs/superpowers/plans/2026-03-27-token-reduction-impl.md similarity index 100% rename from docs/plans/2026-03-27-token-reduction-impl.md rename to docs/superpowers/plans/2026-03-27-token-reduction-impl.md diff --git a/docs/plans/2026-03-28-understand-anything-extension-impl.md b/docs/superpowers/plans/2026-03-28-understand-anything-extension-impl.md similarity index 100% rename from docs/plans/2026-03-28-understand-anything-extension-impl.md rename to docs/superpowers/plans/2026-03-28-understand-anything-extension-impl.md diff --git a/docs/plans/2026-03-29-homepage-update-impl.md b/docs/superpowers/plans/2026-03-29-homepage-update-impl.md similarity index 100% rename from docs/plans/2026-03-29-homepage-update-impl.md rename to docs/superpowers/plans/2026-03-29-homepage-update-impl.md diff --git a/docs/plans/2026-04-01-business-domain-knowledge-impl.md b/docs/superpowers/plans/2026-04-01-business-domain-knowledge-impl.md similarity index 100% rename from docs/plans/2026-04-01-business-domain-knowledge-impl.md rename to 
docs/superpowers/plans/2026-04-01-business-domain-knowledge-impl.md diff --git a/docs/superpowers/plans/2026-04-09-understand-knowledge.md b/docs/superpowers/plans/2026-04-09-understand-knowledge.md new file mode 100644 index 0000000..4c0635f --- /dev/null +++ b/docs/superpowers/plans/2026-04-09-understand-knowledge.md @@ -0,0 +1,1740 @@ +# /understand-knowledge Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Add a `/understand-knowledge` skill that takes any folder of markdown notes (Obsidian, Logseq, Dendron, Foam, Karpathy-style, Zettelkasten, or plain) and produces an interactive knowledge graph with typed nodes, edges, and dashboard visualization. + +**Architecture:** Extends the existing schema with 5 knowledge node types and 6 knowledge edge types. A new 5-agent pipeline (knowledge-scanner → format-detector → article-analyzer → relationship-builder → graph-reviewer) processes markdown files. The dashboard renders knowledge graphs with vertical layout, a knowledge-specific sidebar, and a reading mode panel — all driven by a new `kind` field on the root graph object. 
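The `kind` field is what lets one dashboard serve both graph types. A minimal sketch of that dispatch follows, assuming two hypothetical helpers (`resolveGraphKind`, `initialViewMode`) and a trimmed-down graph shape; neither helper is part of this plan, and the real check may simply be inlined in `App.tsx`.

```typescript
// Hypothetical helpers for illustration only; names are assumptions, not plan deliverables.
type GraphKind = "codebase" | "knowledge";

interface GraphLike {
  kind?: GraphKind; // optional so graphs written before this change still load
}

// Graphs without a `kind` are treated as codebase graphs (backward compatible).
function resolveGraphKind(graph: GraphLike): GraphKind {
  return graph.kind ?? "codebase";
}

// Knowledge graphs open in the knowledge view; everything else keeps the structural view.
function initialViewMode(graph: GraphLike): "structural" | "knowledge" {
  return resolveGraphKind(graph) === "knowledge" ? "knowledge" : "structural";
}
```

Keeping the default in one place means older graph files need no migration.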
+ +**Tech Stack:** TypeScript, Zod (schema validation), React + ReactFlow (dashboard), dagre (layout), TailwindCSS v4, Vitest (testing) + +**Spec:** `docs/superpowers/specs/2026-04-09-understand-knowledge-design.md` + +--- + +## File Structure + +### Core package changes +- Modify: `understand-anything-plugin/packages/core/src/types.ts` — add 5 node types, 6 edge types, `KnowledgeMeta` interface, `kind` field +- Modify: `understand-anything-plugin/packages/core/src/schema.ts` — add new types to Zod schemas, add aliases +- Modify: `understand-anything-plugin/packages/core/src/types.test.ts` — add tests for new types +- Test: `understand-anything-plugin/packages/core/src/__tests__/knowledge-schema.test.ts` — validation tests for knowledge-specific schema + +### Dashboard changes +- Modify: `understand-anything-plugin/packages/dashboard/src/store.ts` — add knowledge node types, edge categories, `ViewMode`, node categories +- Modify: `understand-anything-plugin/packages/dashboard/src/components/CustomNode.tsx` — add colors for 5 new node types +- Modify: `understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx` — add badge colors and edge labels for new types, add knowledge sidebar sections +- Modify: `understand-anything-plugin/packages/dashboard/src/components/ProjectOverview.tsx` — add knowledge-specific stats +- Modify: `understand-anything-plugin/packages/dashboard/src/index.css` — add CSS variables for 5 new node colors +- Modify: `understand-anything-plugin/packages/dashboard/src/App.tsx` — detect `kind` field, set view mode +- Create: `understand-anything-plugin/packages/dashboard/src/components/KnowledgeInfo.tsx` — knowledge-specific sidebar +- Create: `understand-anything-plugin/packages/dashboard/src/components/ReadingPanel.tsx` — full article reading overlay + +### Skill & agent definitions +- Create: `understand-anything-plugin/skills/understand-knowledge/SKILL.md` — skill entry point +- Create: 
`understand-anything-plugin/skills/understand-knowledge/formats/obsidian.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/logseq.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/dendron.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/foam.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/karpathy.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/zettelkasten.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/plain.md` +- Create: `understand-anything-plugin/agents/knowledge-scanner.md` +- Create: `understand-anything-plugin/agents/format-detector.md` +- Create: `understand-anything-plugin/agents/article-analyzer.md` +- Create: `understand-anything-plugin/agents/relationship-builder.md` + +Existing `graph-reviewer.md` agent is reused for the final validation step. + +--- + +## Task 1: Extend Core Types + +**Files:** +- Modify: `understand-anything-plugin/packages/core/src/types.ts` + +- [ ] **Step 1: Add knowledge node types to NodeType union** + +In `understand-anything-plugin/packages/core/src/types.ts`, add the 5 knowledge types after the domain types: + +```typescript +// Node types (21 total: 5 code + 8 non-code + 3 domain + 5 knowledge) +export type NodeType = + | "file" | "function" | "class" | "module" | "concept" + | "config" | "document" | "service" | "table" | "endpoint" + | "pipeline" | "schema" | "resource" + | "domain" | "flow" | "step" + | "article" | "entity" | "topic" | "claim" | "source"; +``` + +- [ ] **Step 2: Add knowledge edge types to EdgeType union** + +```typescript +// Edge types (35 total in 8 categories) +export type EdgeType = + | "imports" | "exports" | "contains" | "inherits" | "implements" + | "calls" | "subscribes" | "publishes" | "middleware" + | "reads_from" | "writes_to" | "transforms" | "validates" + | "depends_on" | "tested_by" | "configures" + | "related" 
| "similar_to" + | "deploys" | "serves" | "provisions" | "triggers" + | "migrates" | "documents" | "routes" | "defines_schema" + | "contains_flow" | "flow_step" | "cross_domain" + | "cites" | "contradicts" | "builds_on" | "exemplifies" | "categorized_under" | "authored_by"; +``` + +- [ ] **Step 3: Add KnowledgeMeta interface** + +Add after the `DomainMeta` interface: + +```typescript +// Optional knowledge metadata for article/entity/topic/claim/source nodes +export interface KnowledgeMeta { + format?: "obsidian" | "logseq" | "dendron" | "foam" | "karpathy" | "zettelkasten" | "plain"; + wikilinks?: string[]; + backlinks?: string[]; + frontmatter?: Record<string, unknown>; + sourceUrl?: string; + confidence?: number; // 0-1, for LLM-inferred relationships +} +``` + +- [ ] **Step 4: Add knowledgeMeta to GraphNode** + +```typescript +export interface GraphNode { + id: string; + type: NodeType; + name: string; + filePath?: string; + lineRange?: [number, number]; + summary: string; + tags: string[]; + complexity: "simple" | "moderate" | "complex"; + languageNotes?: string; + domainMeta?: DomainMeta; + knowledgeMeta?: KnowledgeMeta; +} +``` + +- [ ] **Step 5: Add kind field to KnowledgeGraph** + +```typescript +export interface KnowledgeGraph { + version: string; + kind?: "codebase" | "knowledge"; // undefined defaults to "codebase" for backward compat + project: ProjectMeta; + nodes: GraphNode[]; + edges: GraphEdge[]; + layers: Layer[]; + tour: TourStep[]; +} +``` + +- [ ] **Step 6: Build core and verify no type errors** + +Run: `pnpm --filter @understand-anything/core build` +Expected: Clean build, no errors + +- [ ] **Step 7: Commit** + +```bash +git add understand-anything-plugin/packages/core/src/types.ts +git commit -m "feat(core): add knowledge node types, edge types, KnowledgeMeta, and graph kind field" +``` + +--- + +## Task 2: Extend Schema Validation + +**Files:** +- Modify: `understand-anything-plugin/packages/core/src/schema.ts` +- Create: 
`understand-anything-plugin/packages/core/src/__tests__/knowledge-schema.test.ts` + +- [ ] **Step 1: Add knowledge edge types to EdgeTypeSchema** + +In `understand-anything-plugin/packages/core/src/schema.ts`, update the `EdgeTypeSchema` z.enum to include the 6 new types: + +```typescript +export const EdgeTypeSchema = z.enum([ + "imports", "exports", "contains", "inherits", "implements", + "calls", "subscribes", "publishes", "middleware", + "reads_from", "writes_to", "transforms", "validates", + "depends_on", "tested_by", "configures", + "related", "similar_to", + "deploys", "serves", "provisions", "triggers", + "migrates", "documents", "routes", "defines_schema", + "contains_flow", "flow_step", "cross_domain", + // Knowledge + "cites", "contradicts", "builds_on", "exemplifies", "categorized_under", "authored_by", +]); +``` + +- [ ] **Step 2: Add knowledge node type aliases** + +Add to `NODE_TYPE_ALIASES`: + +```typescript + // Knowledge aliases + note: "article", + page: "article", + wiki_page: "article", + person: "entity", + tool: "entity", + paper: "entity", + organization: "entity", + org: "entity", + category: "topic", + theme: "topic", + tag_topic: "topic", + assertion: "claim", + insight: "claim", + takeaway: "claim", + reference: "source", + raw: "source", + citation: "source", +``` + +- [ ] **Step 3: Add knowledge edge type aliases** + +Add to `EDGE_TYPE_ALIASES`: + +```typescript + // Knowledge aliases + references: "cites", + cited_by: "cites", + sourced_from: "cites", + conflicts_with: "contradicts", + disagrees_with: "contradicts", + // Do NOT add `extends: "builds_on"` here: "extends" is already mapped to "inherits" for code graphs, and a second key would silently override it. In knowledge contexts the relationship-builder agent prompt emits builds_on directly. + refines: "builds_on", + deepens: "builds_on", + example_of: "exemplifies", + instance_of: "exemplifies", + belongs_to: "categorized_under", + tagged_with: "categorized_under", + part_of: "categorized_under", + written_by: 
"authored_by", + created_by: "authored_by", +``` + +- [ ] **Step 4: Write the failing test for knowledge graph validation** + +Create `understand-anything-plugin/packages/core/src/__tests__/knowledge-schema.test.ts`: + +```typescript +import { describe, it, expect } from "vitest"; +import { validateGraph } from "../schema"; +import type { KnowledgeGraph } from "../types"; + +describe("knowledge graph schema validation", () => { + const minimalKnowledgeGraph: KnowledgeGraph = { + version: "1.0", + kind: "knowledge", + project: { + name: "Test KB", + languages: [], + frameworks: [], + description: "A test knowledge base", + analyzedAt: new Date().toISOString(), + gitCommitHash: "abc123", + }, + nodes: [ + { + id: "article:test-note", + type: "article", + name: "Test Note", + summary: "A test article node", + tags: ["test"], + complexity: "simple", + }, + { + id: "entity:karpathy", + type: "entity", + name: "Andrej Karpathy", + summary: "AI researcher", + tags: ["person", "ai"], + complexity: "simple", + }, + { + id: "topic:pkm", + type: "topic", + name: "Personal Knowledge Management", + summary: "Tools and methods for managing personal knowledge", + tags: ["knowledge", "productivity"], + complexity: "moderate", + }, + ], + edges: [ + { + source: "article:test-note", + target: "entity:karpathy", + type: "authored_by", + direction: "forward", + weight: 0.8, + }, + { + source: "article:test-note", + target: "topic:pkm", + type: "categorized_under", + direction: "forward", + weight: 0.7, + }, + ], + layers: [ + { + id: "layer:pkm", + name: "PKM", + description: "Personal Knowledge Management topic cluster", + nodeIds: ["article:test-note", "topic:pkm"], + }, + ], + tour: [], + }; + + it("validates a minimal knowledge graph", () => { + const result = validateGraph(minimalKnowledgeGraph); + const fatals = result.issues.filter((i) => i.level === "fatal"); + expect(fatals).toHaveLength(0); + }); + + it("accepts all knowledge node types", () => { + const graph = { + 
...minimalKnowledgeGraph, + nodes: [ + ...minimalKnowledgeGraph.nodes, + { id: "claim:rag-bad", type: "claim", name: "RAG loses context", summary: "An assertion", tags: ["claim"], complexity: "simple" }, + { id: "source:paper1", type: "source", name: "Attention paper", summary: "A source", tags: ["paper"], complexity: "simple" }, + ], + }; + const result = validateGraph(graph); + const fatals = result.issues.filter((i) => i.level === "fatal"); + expect(fatals).toHaveLength(0); + }); + + it("accepts all knowledge edge types", () => { + const graph = { + ...minimalKnowledgeGraph, + nodes: [ + ...minimalKnowledgeGraph.nodes, + { id: "claim:c1", type: "claim", name: "Claim 1", summary: "c1", tags: [], complexity: "simple" }, + { id: "claim:c2", type: "claim", name: "Claim 2", summary: "c2", tags: [], complexity: "simple" }, + { id: "source:s1", type: "source", name: "Source 1", summary: "s1", tags: [], complexity: "simple" }, + { id: "article:a2", type: "article", name: "Article 2", summary: "a2", tags: [], complexity: "simple" }, + ], + edges: [ + ...minimalKnowledgeGraph.edges, + { source: "article:test-note", target: "source:s1", type: "cites", direction: "forward", weight: 0.7 }, + { source: "claim:c1", target: "claim:c2", type: "contradicts", direction: "forward", weight: 0.6 }, + { source: "article:a2", target: "article:test-note", type: "builds_on", direction: "forward", weight: 0.7 }, + { source: "entity:karpathy", target: "topic:pkm", type: "exemplifies", direction: "forward", weight: 0.5 }, + ], + }; + const result = validateGraph(graph); + const fatals = result.issues.filter((i) => i.level === "fatal"); + expect(fatals).toHaveLength(0); + }); + + it("resolves knowledge node type aliases", () => { + const graph = { + ...minimalKnowledgeGraph, + nodes: [ + { id: "note:n1", type: "note", name: "A Note", summary: "note alias", tags: [], complexity: "simple" }, + { id: "person:p1", type: "person", name: "A Person", summary: "person alias", tags: [], complexity: 
"simple" }, + ], + edges: [], + layers: [], + }; + const result = validateGraph(graph); + const noteNode = result.graph.nodes.find((n) => n.id === "note:n1"); + const personNode = result.graph.nodes.find((n) => n.id === "person:p1"); + expect(noteNode?.type).toBe("article"); + expect(personNode?.type).toBe("entity"); + }); + + it("resolves knowledge edge type aliases", () => { + const graph = { + ...minimalKnowledgeGraph, + edges: [ + { source: "article:test-note", target: "entity:karpathy", type: "written_by", direction: "forward", weight: 0.8 }, + ], + }; + const result = validateGraph(graph); + const edge = result.graph.edges.find((e) => e.source === "article:test-note" && e.target === "entity:karpathy"); + expect(edge?.type).toBe("authored_by"); + }); +}); +``` + +- [ ] **Step 5: Run tests to verify they fail** + +Run: `pnpm --filter @understand-anything/core test -- --run src/__tests__/knowledge-schema.test.ts` +Expected: Tests fail because EdgeTypeSchema doesn't include knowledge types yet (if schema.ts wasn't updated), or pass if Steps 1-3 were done correctly. 
+ +- [ ] **Step 6: Run all core tests to verify nothing is broken** + +Run: `pnpm --filter @understand-anything/core test -- --run` +Expected: All existing tests pass, new knowledge tests pass + +- [ ] **Step 7: Commit** + +```bash +git add understand-anything-plugin/packages/core/src/schema.ts understand-anything-plugin/packages/core/src/__tests__/knowledge-schema.test.ts +git commit -m "feat(core): add knowledge types to schema validation with aliases and tests" +``` + +--- + +## Task 3: Dashboard — CSS Variables & Node Colors + +**Files:** +- Modify: `understand-anything-plugin/packages/dashboard/src/index.css` + +- [ ] **Step 1: Add CSS variables for 5 knowledge node types** + +In `understand-anything-plugin/packages/dashboard/src/index.css`, add after the existing `--color-node-resource` line: + +```css + /* Knowledge node colors */ + --color-node-article: #d4a574; /* warm amber */ + --color-node-entity: #7ba4c9; /* soft blue */ + --color-node-topic: #c9b06c; /* muted gold */ + --color-node-claim: #6fb07a; /* soft green */ + --color-node-source: #8a8a8a; /* gray */ +``` + +- [ ] **Step 2: Add Tailwind text-color utilities for knowledge nodes** + +Verify TailwindCSS v4 picks up the CSS variables automatically. If the existing pattern uses `text-node-*` classes defined elsewhere, add matching entries. Check if there's a Tailwind config or if the CSS variables are consumed directly. + +Look at how existing `text-node-file` etc. are defined — if they're in the CSS file as utility classes, add: + +```css + .text-node-article { color: var(--color-node-article); } + .text-node-entity { color: var(--color-node-entity); } + .text-node-topic { color: var(--color-node-topic); } + .text-node-claim { color: var(--color-node-claim); } + .text-node-source { color: var(--color-node-source); } +``` + +And corresponding `border-node-*` and `bg-node-*` variants if the pattern requires them. 
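If the hand-written utility pattern does turn out to be required, the three variants per node type follow mechanically from the variable names. The sketch below is a throwaway generator, not part of the plan; the function name `utilityRules` and the prefix-to-property table are assumptions made for illustration.

```typescript
// Hypothetical one-off generator; only useful if the project hand-writes these utilities.
const KNOWLEDGE_NODE_TYPES = ["article", "entity", "topic", "claim", "source"] as const;

// Utility prefix -> CSS property it sets (assumed mapping, mirroring the text-node-* rules above).
const PROPERTY_BY_PREFIX = {
  text: "color",
  border: "border-color",
  bg: "background-color",
} as const;

// Build one CSS rule per (prefix, node type) pair, reading the color from the matching variable.
function utilityRules(): string[] {
  const rules: string[] = [];
  for (const [prefix, property] of Object.entries(PROPERTY_BY_PREFIX)) {
    for (const type of KNOWLEDGE_NODE_TYPES) {
      rules.push(`.${prefix}-node-${type} { ${property}: var(--color-node-${type}); }`);
    }
  }
  return rules;
}

console.log(utilityRules().join("\n"));
```

Note that in Tailwind CSS v4, colors declared as `--color-*` theme variables inside an `@theme` block generate the matching `text-*`, `border-*`, and `bg-*` utilities automatically, so check where the existing `--color-node-*` variables are declared before writing any of these rules by hand.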
+ +- [ ] **Step 3: Commit** + +```bash +git add understand-anything-plugin/packages/dashboard/src/index.css +git commit -m "feat(dashboard): add CSS variables and utility classes for knowledge node types" +``` + +--- + +## Task 4: Dashboard — Store & Type Maps + +**Files:** +- Modify: `understand-anything-plugin/packages/dashboard/src/store.ts` + +- [ ] **Step 1: Add knowledge types to NodeType union** + +Update the local `NodeType` in store.ts: + +```typescript +export type NodeType = "file" | "function" | "class" | "module" | "concept" | "config" | "document" | "service" | "table" | "endpoint" | "pipeline" | "schema" | "resource" | "domain" | "flow" | "step" | "article" | "entity" | "topic" | "claim" | "source"; +``` + +- [ ] **Step 2: Add knowledge edge category** + +Update `EdgeCategory` and `EDGE_CATEGORY_MAP`: + +```typescript +export type EdgeCategory = "structural" | "behavioral" | "data-flow" | "dependencies" | "semantic" | "infrastructure" | "domain" | "knowledge"; + +export const EDGE_CATEGORY_MAP: Record<EdgeCategory, string[]> = { + structural: ["imports", "exports", "contains", "inherits", "implements"], + behavioral: ["calls", "subscribes", "publishes", "middleware"], + "data-flow": ["reads_from", "writes_to", "transforms", "validates"], + dependencies: ["depends_on", "tested_by", "configures"], + semantic: ["related", "similar_to"], + infrastructure: ["deploys", "serves", "provisions", "triggers"], + domain: ["contains_flow", "flow_step", "cross_domain"], + knowledge: ["cites", "contradicts", "builds_on", "exemplifies", "categorized_under", "authored_by"], +}; +``` + +- [ ] **Step 3: Add knowledge to ALL_NODE_TYPES and ALL_EDGE_CATEGORIES** + +```typescript +export const ALL_NODE_TYPES: NodeType[] = ["file", "function", "class", "module", "concept", "config", "document", "service", "table", "endpoint", "pipeline", "schema", "resource", "domain", "flow", "step", "article", "entity", "topic", "claim", "source"]; + +export const ALL_EDGE_CATEGORIES: EdgeCategory[] = 
["structural", "behavioral", "data-flow", "dependencies", "semantic", "infrastructure", "domain", "knowledge"]; +``` + +- [ ] **Step 4: Add "knowledge" to ViewMode and NodeCategory** + +```typescript +export type ViewMode = "structural" | "domain" | "knowledge"; + +export type NodeCategory = "code" | "config" | "docs" | "infra" | "data" | "domain" | "knowledge"; +``` + +Update the `NODE_CATEGORY_MAP` (find where it maps node types to categories) to include: + +```typescript + article: "knowledge", + entity: "knowledge", + topic: "knowledge", + claim: "knowledge", + source: "knowledge", +``` + +- [ ] **Step 5: Add knowledge node type filter default** + +In the store's initial state `nodeTypeFilters`, add: + +```typescript +nodeTypeFilters: { code: true, config: true, docs: true, infra: true, data: true, domain: true, knowledge: true }, +``` + +- [ ] **Step 6: Build dashboard and verify no errors** + +Run: `pnpm --filter @understand-anything/dashboard build` +Expected: Clean build + +- [ ] **Step 7: Commit** + +```bash +git add understand-anything-plugin/packages/dashboard/src/store.ts +git commit -m "feat(dashboard): add knowledge types to store, edge categories, and view mode" +``` + +--- + +## Task 5: Dashboard — CustomNode & NodeInfo Type Maps + +**Files:** +- Modify: `understand-anything-plugin/packages/dashboard/src/components/CustomNode.tsx` +- Modify: `understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx` + +- [ ] **Step 1: Add knowledge node colors to CustomNode.tsx** + +In `typeColors` map, add after the `step` entry: + +```typescript + // Knowledge + article: "var(--color-node-article)", + entity: "var(--color-node-entity)", + topic: "var(--color-node-topic)", + claim: "var(--color-node-claim)", + source: "var(--color-node-source)", +``` + +In `typeTextColors` map, add: + +```typescript + // Knowledge + article: "text-node-article", + entity: "text-node-entity", + topic: "text-node-topic", + claim: "text-node-claim", + source: 
"text-node-source", +``` + +- [ ] **Step 2: Add knowledge node badge colors to NodeInfo.tsx** + +In `typeBadgeColors` map, add: + +```typescript + // Knowledge + article: "text-node-article border border-node-article/30 bg-node-article/10", + entity: "text-node-entity border border-node-entity/30 bg-node-entity/10", + topic: "text-node-topic border border-node-topic/30 bg-node-topic/10", + claim: "text-node-claim border border-node-claim/30 bg-node-claim/10", + source: "text-node-source border border-node-source/30 bg-node-source/10", +``` + +- [ ] **Step 3: Add knowledge edge labels to NodeInfo.tsx** + +In `EDGE_LABELS` map, add: + +```typescript + // Knowledge + cites: { forward: "cites", backward: "cited by" }, + contradicts: { forward: "contradicts", backward: "contradicted by" }, + builds_on: { forward: "builds on", backward: "built upon by" }, + exemplifies: { forward: "exemplifies", backward: "exemplified by" }, + categorized_under: { forward: "categorized under", backward: "categorizes" }, + authored_by: { forward: "authored by", backward: "authored" }, +``` + +- [ ] **Step 4: Build dashboard and verify** + +Run: `pnpm --filter @understand-anything/dashboard build` +Expected: Clean build, no type errors + +- [ ] **Step 5: Commit** + +```bash +git add understand-anything-plugin/packages/dashboard/src/components/CustomNode.tsx understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx +git commit -m "feat(dashboard): add knowledge node colors, badge colors, and edge labels" +``` + +--- + +## Task 6: Dashboard — Knowledge Sidebar Component + +**Files:** +- Create: `understand-anything-plugin/packages/dashboard/src/components/KnowledgeInfo.tsx` +- Modify: `understand-anything-plugin/packages/dashboard/src/App.tsx` + +- [ ] **Step 1: Create KnowledgeInfo.tsx** + +Create `understand-anything-plugin/packages/dashboard/src/components/KnowledgeInfo.tsx`: + +```tsx +import { useDashboardStore } from "../store"; +import type { GraphNode, GraphEdge, 
KnowledgeGraph } from "@understand-anything/core/types";
+
+const KNOWLEDGE_NODE_TYPES = new Set(["article", "entity", "topic", "claim", "source"]);
+
+function getBacklinks(nodeId: string, edges: GraphEdge[]): string[] {
+  return edges
+    .filter((e) => e.target === nodeId)
+    .map((e) => e.source);
+}
+
+function getOutgoingLinks(nodeId: string, edges: GraphEdge[]): string[] {
+  return edges
+    .filter((e) => e.source === nodeId)
+    .map((e) => e.target);
+}
+
+// NOTE: the JSX in this file is a minimal reconstruction. Element choices are assumptions
+// and the original styling classes are omitted; add Tailwind classes to match the dashboard.
+function NodeLink({ nodeId, nodes, onNavigate }: { nodeId: string; nodes: GraphNode[]; onNavigate: (id: string) => void }) {
+  const node = nodes.find((n) => n.id === nodeId);
+  if (!node) return <span>{nodeId}</span>;
+  return (
+    <button type="button" onClick={() => onNavigate(nodeId)}>{node.name}</button>
+  );
+}
+
+export default function KnowledgeInfo() {
+  const graph = useDashboardStore((s) => s.graph);
+  const selectedNode = useDashboardStore((s) => s.selectedNode);
+  const setSelectedNode = useDashboardStore((s) => s.setSelectedNode);
+
+  if (!graph || !selectedNode) return null;
+
+  const node = graph.nodes.find((n) => n.id === selectedNode);
+  if (!node) return null;
+
+  const backlinks = getBacklinks(node.id, graph.edges);
+  const outgoing = getOutgoingLinks(node.id, graph.edges);
+  const meta = node.knowledgeMeta;
+
+  return (
+    <div>
+      {/* Header */}
+      <div>
+        <span>{node.type}</span>
+        <h2>{node.name}</h2>
+      </div>
+
+      {/* Summary */}
+      <p>{node.summary}</p>
+
+      {/* Tags */}
+      {node.tags.length > 0 && (
+        <div>
+          {node.tags.map((tag) => (
+            <span key={tag}>{tag}</span>
+          ))}
+        </div>
+      )}
+
+      {/* Knowledge-specific metadata */}
+      {meta?.sourceUrl && (
+        <div>
+          <h3>Source</h3>
+          <a href={meta.sourceUrl} target="_blank" rel="noopener noreferrer">{meta.sourceUrl}</a>
+        </div>
+      )}
+
+      {meta?.confidence !== undefined && (
+        <div>
+          <h3>Confidence</h3>
+          <div>
+            {/* Bar width derived from confidence (assumed rendering) */}
+            <div>
+              <div style={{ width: `${meta.confidence * 100}%` }} />
+            </div>
+            <span>{Math.round(meta.confidence * 100)}%</span>
+          </div>
+        </div>
+      )}
+
+      {/* Frontmatter */}
+      {meta?.frontmatter && Object.keys(meta.frontmatter).length > 0 && (
+        <div>
+          <h3>Frontmatter</h3>
+          <div>
+            {Object.entries(meta.frontmatter).map(([key, value]) => (
+              <div key={key}>
+                <span>{key}:</span>{" "}
+                <span>{String(value)}</span>
+              </div>
+            ))}
+          </div>
+        </div>
+      )}
+
+      {/* Backlinks */}
+      {backlinks.length > 0 && (
+        <div>
+          <h3>Backlinks ({backlinks.length})</h3>
+          <div>
+            {backlinks.map((id) => (
+              <NodeLink key={id} nodeId={id} nodes={graph.nodes} onNavigate={setSelectedNode} />
+            ))}
+          </div>
+        </div>
+      )}
+
+      {/* Outgoing */}
+      {outgoing.length > 0 && (
+        <div>
+          <h3>Outgoing Links ({outgoing.length})</h3>
+          <div>
+            {outgoing.map((id) => (
+              <NodeLink key={id} nodeId={id} nodes={graph.nodes} onNavigate={setSelectedNode} />
+            ))}
+          </div>
+        </div>
+      )}
+    </div>
+ ); +} +``` + +- [ ] **Step 2: Integrate KnowledgeInfo into App.tsx sidebar rendering** + +In `understand-anything-plugin/packages/dashboard/src/App.tsx`, find where the sidebar renders `NodeInfo` and add a condition: if `graph.kind === "knowledge"` and a node is selected, render `KnowledgeInfo` instead of `NodeInfo`. + +Import at top: +```typescript +import KnowledgeInfo from "./components/KnowledgeInfo"; +``` + +In the sidebar section, wrap the existing NodeInfo render: +```tsx +{graph?.kind === "knowledge" ? <KnowledgeInfo /> : <NodeInfo />} +``` + +- [ ] **Step 3: Build dashboard and verify** + +Run: `pnpm --filter @understand-anything/dashboard build` +Expected: Clean build + +- [ ] **Step 4: Commit** + +```bash +git add understand-anything-plugin/packages/dashboard/src/components/KnowledgeInfo.tsx understand-anything-plugin/packages/dashboard/src/App.tsx +git commit -m "feat(dashboard): add KnowledgeInfo sidebar component for knowledge graphs" +``` + +--- + +## Task 7: Dashboard — Reading Panel + +**Files:** +- Create: `understand-anything-plugin/packages/dashboard/src/components/ReadingPanel.tsx` +- Modify: `understand-anything-plugin/packages/dashboard/src/App.tsx` + +- [ ] **Step 1: Create ReadingPanel.tsx** + +Create `understand-anything-plugin/packages/dashboard/src/components/ReadingPanel.tsx`: + +```tsx +import { useState } from "react"; +import { useDashboardStore } from "../store"; + +export default function ReadingPanel() { + const graph = useDashboardStore((s) => s.graph); + const selectedNode = useDashboardStore((s) => s.selectedNode); + const setSelectedNode = useDashboardStore((s) => s.setSelectedNode); + const [isExpanded, setIsExpanded] = useState(false); + + if (!graph || graph.kind !== "knowledge" || !selectedNode) return null; + + const node = graph.nodes.find((n) => n.id === selectedNode); + if (!node || node.type !== "article") return null; + + // Get backlinks for this article + const backlinks = graph.edges + .filter((e) => e.target === node.id) + .map((e) => { + 
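      const sourceNode = graph.nodes.find((n) => n.id === e.source);
+      return sourceNode ? { id: sourceNode.id, name: sourceNode.name, type: sourceNode.type } : null;
+    })
+    .filter(Boolean) as { id: string; name: string; type: string }[];
+
+  // NOTE: the JSX below is a minimal reconstruction. Element choices, button labels, and
+  // button handlers are assumptions; the original styling classes are omitted.
+  return (
+    <div>
+      {/* Header bar */}
+      <div>
+        <div>
+          <span>Reading</span>
+          <span>{node.name}</span>
+        </div>
+        <div>
+          <button type="button" onClick={() => setIsExpanded(!isExpanded)}>{isExpanded ? "Collapse" : "Expand"}</button>
+          <button type="button" onClick={() => setSelectedNode(null)}>Close</button>
+        </div>
+      </div>
+
+      <div>
+        {/* Main content */}
+        <div>
+          <div>
+            <h1>{node.name}</h1>
+
+            {/* Tags */}
+            {node.tags.length > 0 && (
+              <div>
+                {node.tags.map((tag) => (
+                  <span key={tag}>{tag}</span>
+                ))}
+              </div>
+            )}
+
+            {/* Article content (summary for now; full markdown rendering is a future enhancement) */}
+            <div>
+              <p>{node.summary}</p>
+            </div>
+
+            {/* Frontmatter metadata */}
+            {node.knowledgeMeta?.frontmatter && Object.keys(node.knowledgeMeta.frontmatter).length > 0 && (
+              <div>
+                <h3>Metadata</h3>
+                {Object.entries(node.knowledgeMeta.frontmatter).map(([key, value]) => (
+                  <div key={key}>
+                    <span>{key}:</span>{" "}
+                    <span>{String(value)}</span>
+                  </div>
+                ))}
+              </div>
+            )}
+          </div>
+        </div>
+
+        {/* Backlinks sidebar */}
+        {backlinks.length > 0 && (
+          <div>
+            <h3>Backlinks ({backlinks.length})</h3>
+            <div>
+              {backlinks.map((link) => (
+                <button key={link.id} type="button" onClick={() => setSelectedNode(link.id)}>{link.name}</button>
+              ))}
+            </div>
+          </div>
+        )}
+      </div>
+    </div>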
+ ); +} +``` + +- [ ] **Step 2: Add ReadingPanel to App.tsx** + +Import and render `ReadingPanel` in the main dashboard layout, positioned at the bottom: + +```typescript +import ReadingPanel from "./components/ReadingPanel"; +``` + +Add `<ReadingPanel />` inside the dashboard container, after the graph view area. + +- [ ] **Step 3: Build and verify** + +Run: `pnpm --filter @understand-anything/dashboard build` +Expected: Clean build + +- [ ] **Step 4: Commit** + +```bash +git add understand-anything-plugin/packages/dashboard/src/components/ReadingPanel.tsx understand-anything-plugin/packages/dashboard/src/App.tsx +git commit -m "feat(dashboard): add ReadingPanel for article reading mode in knowledge graphs" +``` + +--- + +## Task 8: Dashboard — Vertical Layout for Knowledge Graphs + +**Files:** +- Modify: `understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx` +- Modify: `understand-anything-plugin/packages/dashboard/src/utils/layout.ts` (if direction isn't already configurable) + +- [ ] **Step 1: Check how layout direction is passed to dagre** + +Read `GraphView.tsx` to find where `applyDagreLayout` is called. The layout.ts `applyDagreLayout` already accepts a `direction: "TB" | "LR"` parameter (default `"TB"`). + +Find where GraphView calls this function and note which direction it passes today. + +- [ ] **Step 2: Pass graph kind to layout decision** + +In `GraphView.tsx`, where the layout is applied, check the graph's `kind` field. If `kind === "knowledge"`, use `"TB"` (top-to-bottom). If `kind === "codebase"` or undefined, keep the direction GraphView passes today. The snippet below assumes that direction is `"LR"`; if Step 1 shows a different value, substitute it so codebase graphs do not change layout. + +The graph object is available via the store. Add: + +```typescript +const graphKind = useDashboardStore((s) => s.graph?.kind); +const layoutDirection = graphKind === "knowledge" ? "TB" : "LR"; +``` + +Pass `layoutDirection` to the layout call. 
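One way to make the "keep the existing default" rule explicit is to pass the current direction in rather than hard-coding it. This is only a sketch: `layoutDirectionFor` is a hypothetical helper name, and `existingDirection` stands for whatever `GraphView.tsx` passes to `applyDagreLayout` today.

```typescript
type LayoutDirection = "TB" | "LR";
type GraphKind = "codebase" | "knowledge";

// Hypothetical helper: knowledge graphs are always laid out top-to-bottom;
// every other graph keeps the direction GraphView already uses (an assumption
// to verify at the current call site).
function layoutDirectionFor(
  kind: GraphKind | undefined,
  existingDirection: LayoutDirection,
): LayoutDirection {
  return kind === "knowledge" ? "TB" : existingDirection;
}
```

With this shape, codebase graphs and legacy graphs without a `kind` render exactly as before, whichever direction that turns out to be.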
+ +- [ ] **Step 3: Build and verify** + +Run: `pnpm --filter @understand-anything/dashboard build` +Expected: Clean build + +- [ ] **Step 4: Commit** + +```bash +git add understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx +git commit -m "feat(dashboard): use vertical top-down layout for knowledge graphs" +``` + +--- + +## Task 9: Dashboard — Knowledge Edge Styling + +**Files:** +- Modify: `understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx` + +- [ ] **Step 1: Add knowledge edge style map** + +In `GraphView.tsx`, add a style map for knowledge edge types. Follow the existing pattern from `DomainGraphView.tsx` which uses ReactFlow's `style` prop: + +```typescript +const KNOWLEDGE_EDGE_STYLES: Record = { + cites: { strokeDasharray: "6 3", strokeWidth: 1.5 }, + contradicts: { stroke: "#c97070", strokeWidth: 2 }, + builds_on: { stroke: "var(--color-accent)", strokeWidth: 2 }, + categorized_under: { stroke: "rgba(150,150,150,0.5)", strokeWidth: 1 }, + authored_by: { strokeDasharray: "3 3", stroke: "var(--color-node-entity)", strokeWidth: 1.5 }, + exemplifies: { strokeDasharray: "3 3", stroke: "var(--color-node-claim)", strokeWidth: 1.5 }, +}; +``` + +- [ ] **Step 2: Apply styles when building ReactFlow edges** + +Where edges are converted to ReactFlow format, check if the graph is `kind === "knowledge"` and the edge type has a knowledge style. Merge the style: + +```typescript +const knowledgeStyle = graph?.kind === "knowledge" ? 
KNOWLEDGE_EDGE_STYLES[edge.type] : undefined; +// Merge with existing edge style +const style = { ...baseEdgeStyle, ...knowledgeStyle }; +``` + +- [ ] **Step 3: Build and verify** + +Run: `pnpm --filter @understand-anything/dashboard build` +Expected: Clean build + +- [ ] **Step 4: Commit** + +```bash +git add understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx +git commit -m "feat(dashboard): add distinct edge styles for knowledge relationship types" +``` + +--- + +## Task 10: Dashboard — Knowledge-Aware ProjectOverview + +**Files:** +- Modify: `understand-anything-plugin/packages/dashboard/src/components/ProjectOverview.tsx` + +- [ ] **Step 1: Add knowledge-specific stats** + +In `ProjectOverview.tsx`, detect `graph.kind === "knowledge"` and show knowledge-specific stats: + +- Total articles, entities, topics, claims, sources (instead of "code, config, docs, infra, data") +- Detected format (from the first node's `knowledgeMeta.format`) +- Remove "Languages" and "Frameworks" sections for knowledge graphs (they'll be empty) + +Add after the existing stats grid: + +```tsx +{graph.kind === "knowledge" && ( +
+
Knowledge Stats
+
+ n.type === "article").length} /> + n.type === "entity").length} /> + n.type === "topic").length} /> + n.type === "claim").length} /> + n.type === "source").length} /> +
+
+)} +``` + +Reuse or create a `StatBox` component matching the existing style. + +- [ ] **Step 2: Conditionally hide code-specific sections** + +Wrap the "Languages", "Frameworks", and code-specific file type breakdown sections in a condition: + +```tsx +{graph.kind !== "knowledge" && ( + <> + {/* existing languages/frameworks/file-types sections */} + +)} +``` + +- [ ] **Step 3: Build and verify** + +Run: `pnpm --filter @understand-anything/dashboard build` +Expected: Clean build + +- [ ] **Step 4: Commit** + +```bash +git add understand-anything-plugin/packages/dashboard/src/components/ProjectOverview.tsx +git commit -m "feat(dashboard): add knowledge-specific stats to ProjectOverview" +``` + +--- + +## Task 11: Create Agent Definitions + +**Files:** +- Create: `understand-anything-plugin/agents/knowledge-scanner.md` +- Create: `understand-anything-plugin/agents/format-detector.md` +- Create: `understand-anything-plugin/agents/article-analyzer.md` +- Create: `understand-anything-plugin/agents/relationship-builder.md` + +- [ ] **Step 1: Create knowledge-scanner agent** + +Create `understand-anything-plugin/agents/knowledge-scanner.md`: + +```markdown +--- +name: knowledge-scanner +description: Scans a directory for markdown files and produces a file manifest for knowledge base analysis +model: inherit +--- + +# Knowledge Scanner Agent + +You scan a target directory to discover all markdown files for knowledge base analysis. + +## Input + +You receive a JSON block with: +- `targetDir` — absolute path to the knowledge base directory + +## Task + +1. Use Glob/Bash to find all `.md` files in the target directory (recursive) +2. Exclude common non-content directories: `.obsidian/`, `logseq/`, `.foam/`, `_meta/`, `node_modules/`, `.git/` +3. For each file, capture: + - `path` — relative path from targetDir + - `sizeLines` — number of lines + - `preview` — first 20 lines of content +4. 
Detect directory structure signatures: + - Check for `.obsidian/` directory + - Check for `logseq/` + `pages/` directories + - Check for `.dendron.yml` or `*.schema.yml` + - Check for `.foam/` or `.vscode/foam.json` + - Check for `raw/` + `wiki/` + `index.md` + - Scan a sample of files for `[[wikilinks]]` and unique ID prefixes +5. Write results to `$PROJECT_ROOT/.understand-anything/intermediate/knowledge-manifest.json` + +## Output Schema + +```json +{ + "targetDir": "/absolute/path", + "totalFiles": 342, + "directorySignatures": { + "hasObsidianDir": true, + "hasLogseqDir": false, + "hasDendronConfig": false, + "hasFoamConfig": false, + "hasKarpathyStructure": false, + "hasWikilinks": true, + "hasUniqueIdPrefixes": false + }, + "files": [ + { + "path": "notes/topic.md", + "sizeLines": 45, + "preview": "---\ntags: [ai, ml]\n---\n# Topic Name\n..." + } + ] +} +``` + +## Rules + +- Do NOT read file contents beyond the 20-line preview +- Sort files by path alphabetically +- Report total count prominently +- Write output to `.understand-anything/intermediate/knowledge-manifest.json` +``` + +- [ ] **Step 2: Create format-detector agent** + +Create `understand-anything-plugin/agents/format-detector.md`: + +```markdown +--- +name: format-detector +description: Detects the knowledge base format from directory signatures and file samples +model: inherit +--- + +# Format Detector Agent + +You analyze the knowledge-manifest.json to determine which knowledge base format is being used. + +## Input + +Read `.understand-anything/intermediate/knowledge-manifest.json` produced by the knowledge-scanner. 
+ +## Detection Priority + +Apply these rules in order (first match wins): + +| Priority | Signal | Format | +|----------|--------|--------| +| 1 | `hasObsidianDir === true` | `obsidian` | +| 2 | `hasLogseqDir === true` | `logseq` | +| 3 | `hasDendronConfig === true` | `dendron` | +| 4 | `hasFoamConfig === true` | `foam` | +| 5 | `hasKarpathyStructure === true` | `karpathy` | +| 6 | `hasWikilinks === true` AND `hasUniqueIdPrefixes === true` | `zettelkasten` | +| 7 | fallback | `plain` | + +## Output + +Write to `.understand-anything/intermediate/format-detection.json`: + +```json +{ + "format": "obsidian", + "confidence": 0.95, + "parsingHints": { + "linkStyle": "wikilink", + "metadataLocation": "yaml-frontmatter", + "folderSemantics": "none", + "specialFiles": [".obsidian/app.json"], + "tagSyntax": "hashtag-inline" + } +} +``` + +## Rules + +- Always produce exactly one format +- Set confidence based on how many signals matched +- Include parsing hints that will help the article-analyzer +``` + +- [ ] **Step 3: Create article-analyzer agent** + +Create `understand-anything-plugin/agents/article-analyzer.md`: + +```markdown +--- +name: article-analyzer +description: Analyzes individual markdown files to extract knowledge nodes and explicit edges +model: inherit +--- + +# Article Analyzer Agent + +You analyze batches of markdown files from a knowledge base to extract structured knowledge graph data. + +## Input + +You receive a JSON block with: +- `projectRoot` — absolute path to the knowledge base +- `batchFiles` — array of file objects from the manifest (path, sizeLines, preview) +- `format` — detected format from format-detection.json +- `parsingHints` — format-specific parsing guidance + +You also receive a **format guide** (injected by the skill) that describes how to parse this specific format. + +## Task + +For each file in the batch: + +### 1. Read the full file content + +### 2. 
Extract the article node + +- **id**: `article:` (e.g., `article:notes/topic`) +- **type**: `article` +- **name**: First heading, or frontmatter title, or filename +- **filePath**: relative path +- **summary**: 2-3 sentence summary of the article content +- **tags**: from frontmatter tags, inline #tags, or inferred from content (3-5 tags) +- **complexity**: `simple` (<50 lines), `moderate` (50-200 lines), `complex` (>200 lines) +- **knowledgeMeta**: `{ format, wikilinks, frontmatter }` + +### 3. Extract entity nodes + +Identify named entities mentioned in the article: +- People, organizations, tools, papers, projects, datasets +- **id**: `entity:` (e.g., `entity:andrej-karpathy`) +- **type**: `entity` +- **summary**: one-sentence description based on context in the article +- **tags**: entity category tags like `person`, `tool`, `paper`, `organization` + +### 4. Extract claim nodes (for articles with strong assertions) + +- Only extract claims that are significant takeaways or insights +- **id**: `claim::` (e.g., `claim:notes/topic:rag-loses-context`) +- **type**: `claim` +- **summary**: the assertion itself + +### 5. Extract source nodes (for cited references) + +- External URLs, paper references, book citations +- **id**: `source:` +- **type**: `source` +- **knowledgeMeta**: `{ sourceUrl }` + +### 6. Extract explicit edges + +- `[[wikilinks]]` → find target article, create `related` edge +- Frontmatter references → `categorized_under` or `related` edges +- Inline citations/URLs → `cites` edges to source nodes +- Author mentions → `authored_by` edges + +## Node ID Conventions + +``` +article: +entity: +topic: +claim:: +source: +``` + +Normalize: lowercase, replace spaces with hyphens, remove special characters. + +**Deduplicate entities**: If the same entity appears across multiple files in the batch, emit it only once. Use the most informative summary. 
+ +## Edge Weight Conventions + +``` +contains: 1.0 +authored_by: 0.9 +cites: 0.8 +categorized_under: 0.7 +builds_on: 0.7 +related: 0.5 +exemplifies: 0.5 +contradicts: 0.6 +``` + +## Output + +Write per-batch results to `.understand-anything/intermediate/article-batch-.json`: + +```json +{ + "nodes": [...], + "edges": [...] +} +``` + +## Rules + +- One article node per file (always) +- Entity nodes only for clearly named entities (not generic concepts) +- Claim nodes only for significant assertions (not every sentence) +- Source nodes only for explicit external references +- Deduplicate entities within the batch +- Respect the format guide for parsing links and metadata +``` + +- [ ] **Step 4: Create relationship-builder agent** + +Create `understand-anything-plugin/agents/relationship-builder.md`: + +```markdown +--- +name: relationship-builder +description: Discovers implicit cross-file relationships and builds topic clusters from analyzed knowledge nodes +model: inherit +--- + +# Relationship Builder Agent + +You analyze all extracted nodes and edges to discover implicit relationships that explicit links missed. + +## Input + +Read all `article-batch-*.json` files from `.understand-anything/intermediate/`. Merge all nodes and edges. + +## Task + +### 1. Deduplicate entities globally + +Multiple batches may have emitted the same entity. Merge them: +- Keep the most detailed summary +- Union all tags +- Collapse duplicate IDs + +### 2. 
Discover implicit relationships + +For each pair of articles/entities, determine if there's an implicit relationship: + +- **builds_on**: Article A extends or deepens ideas from Article B (similar topics, references same entities, but goes further) +- **contradicts**: Article A makes claims that conflict with Article B +- **categorized_under**: Group articles into topic clusters +- **exemplifies**: An entity is a concrete example of a concept/topic +- **related**: Articles share significant thematic overlap but aren't explicitly linked + +Set `confidence` in knowledgeMeta for LLM-inferred edges (0.0-1.0). + +### 3. Build topic nodes + +Identify thematic clusters across all articles: +- **id**: `topic:` +- **type**: `topic` +- **summary**: description of what this topic covers +- Create `categorized_under` edges from articles/entities to their topics + +### 4. Build layers + +Group nodes into layers by topic: +- Each topic becomes a layer +- Articles, entities, claims, and sources are assigned to their primary topic's layer +- Nodes not clearly belonging to any topic go into an "Uncategorized" layer + +### 5. Build tour + +Create a guided tour through the knowledge base: +- Start with the broadest topic overview +- Walk through key articles in a logical learning order +- Each step covers 1-3 related nodes +- 5-10 tour steps total + +## Output + +Write to `.understand-anything/intermediate/relationships.json`: + +```json +{ + "nodes": [...], + "edges": [...], + "layers": [...], + "tour": [...] 
+} +``` + +## Rules + +- Only add edges with confidence > 0.4 +- Don't duplicate edges that already exist from article-analyzer +- Topics should be meaningful clusters (3+ articles), not one-off categories +- Tour should be navigable by someone new to the knowledge base +- Keep layers balanced — no layer with 50%+ of all nodes +``` + +- [ ] **Step 5: Commit** + +```bash +git add understand-anything-plugin/agents/knowledge-scanner.md understand-anything-plugin/agents/format-detector.md understand-anything-plugin/agents/article-analyzer.md understand-anything-plugin/agents/relationship-builder.md +git commit -m "feat(agents): add knowledge-scanner, format-detector, article-analyzer, and relationship-builder agents" +``` + +--- + +## Task 12: Create Format Guides + +**Files:** +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/obsidian.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/logseq.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/dendron.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/foam.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/karpathy.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/zettelkasten.md` +- Create: `understand-anything-plugin/skills/understand-knowledge/formats/plain.md` + +**IMPORTANT**: Each format guide must be **research-backed**. The implementing agent MUST: +1. Use WebSearch and WebFetch to read the **official documentation** for each format +2. Study the actual parsing rules, not assumptions +3. 
Include specific syntax examples from real documentation + +- [ ] **Step 1: Create obsidian.md format guide** + +Research Obsidian's official docs (https://help.obsidian.md/) and create `understand-anything-plugin/skills/understand-knowledge/formats/obsidian.md`: + +The guide must cover: +- Detection: `.obsidian/` directory exists +- Link syntax: `[[wikilink]]`, `[[note|alias]]`, `[[note#heading]]`, `![[embed]]` +- Metadata: YAML frontmatter between `---` delimiters +- Tags: `#tag` inline, `tags:` in frontmatter (both array and space-separated) +- Properties: Obsidian Properties (frontmatter fields rendered in UI) +- Folder semantics: Obsidian doesn't assign folder meaning by default +- Special files: `.obsidian/app.json`, `.obsidian/workspace.json` (ignore these) +- Canvas: `.canvas` files (JSON format, describe spatial layouts — extract card references) +- Dataview: inline fields `key:: value`, `[key:: value]` + +- [ ] **Step 2: Create logseq.md format guide** + +Research Logseq docs (https://docs.logseq.com/) and create `understand-anything-plugin/skills/understand-knowledge/formats/logseq.md`: + +Cover: +- Detection: `logseq/` + `pages/` directories +- Structure: `journals/YYYY_MM_DD.md` (daily notes), `pages/*.md` (named pages) +- Link syntax: `[[wikilinks]]`, `((block-references))` by UUID +- Block-based: Content is organized as bullet-point outlines +- Properties: `key:: value` syntax on blocks +- Tags: `#tag` inline, page tags via properties +- Special: `logseq/config.edn` for configuration + +- [ ] **Step 3: Create dendron.md format guide** + +Research Dendron wiki (https://wiki.dendron.so/) and create `understand-anything-plugin/skills/understand-knowledge/formats/dendron.md`: + +Cover: +- Detection: `.dendron.yml` or `*.schema.yml` files +- Hierarchy: dot-delimited filenames (`a.b.c.md`) +- Link syntax: `[[wikilinks]]` with hierarchy awareness +- Schemas: `.schema.yml` files define expected hierarchy structure +- Frontmatter: YAML with required `id` and 
`title` fields +- Stubs: auto-created intermediate hierarchy files + +- [ ] **Step 4: Create foam.md format guide** + +Research Foam docs (https://foambubble.github.io/foam/) and create `understand-anything-plugin/skills/understand-knowledge/formats/foam.md`: + +Cover: +- Detection: `.foam/` directory or `.vscode/foam.json` +- Link syntax: `[[wikilinks]]` plus link reference definitions at file bottom +- Placeholder links: links to non-existent files +- Frontmatter: standard YAML +- Auto-linking: Foam auto-updates links on file rename/move + +- [ ] **Step 5: Create karpathy.md format guide** + +Research Karpathy's gist (https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) and create `understand-anything-plugin/skills/understand-knowledge/formats/karpathy.md`: + +Cover: +- Detection: `raw/` + `wiki/` directories + `index.md` +- Structure: `raw/` (immutable sources), `wiki/` (compiled articles), `_meta/` (state) +- Special files: `index.md` (master page list), `log.md` (append-only operations log) +- Link style: standard markdown links (not wikilinks) +- Log parsing: `## [YYYY-MM-DD] operation | Title` entries +- Wiki articles: LLM-compiled, may have cross-references and backlinks + +- [ ] **Step 6: Create zettelkasten.md format guide** + +Research zettelkasten.de and create `understand-anything-plugin/skills/understand-knowledge/formats/zettelkasten.md`: + +Cover: +- Detection: `[[wikilinks]]` + unique ID prefixes in filenames (timestamps like `202604091234`) +- Atomic notes: one idea per note +- Unique IDs: timestamp or alphanumeric prefix in filename +- Links: `[[wikilinks]]` with optional typed links +- Frontmatter: YAML with tags, creation date +- No folder hierarchy: flat structure, connections via links only + +- [ ] **Step 7: Create plain.md format guide** + +Create `understand-anything-plugin/skills/understand-knowledge/formats/plain.md`: + +Cover: +- Detection: fallback when no other format detected +- Links: standard markdown 
`[text](relative/path.md)` links +- Structure: folder hierarchy provides categorization +- Headings: `#` hierarchy provides structure within files +- No special metadata expectations +- Tags: none expected (LLM infers topics) + +- [ ] **Step 8: Commit** + +```bash +git add understand-anything-plugin/skills/understand-knowledge/formats/ +git commit -m "feat(skill): add 7 research-backed format guides for knowledge base parsing" +``` + +--- + +## Task 13: Create SKILL.md + +**Files:** +- Create: `understand-anything-plugin/skills/understand-knowledge/SKILL.md` + +- [ ] **Step 1: Create the skill definition** + +Create `understand-anything-plugin/skills/understand-knowledge/SKILL.md`: + +```markdown +--- +name: understand-knowledge +description: Analyze a markdown knowledge base (Obsidian, Logseq, Dendron, Foam, Karpathy-style, Zettelkasten, or plain) to produce an interactive knowledge graph with typed relationships +argument-hint: [path/to/notes] [--ingest ] +--- + +# /understand-knowledge + +Analyze a personal knowledge base of markdown files and produce an interactive knowledge graph. + +## Arguments + +- `path/to/notes` — (optional) directory containing markdown files. Defaults to current working directory. +- `--ingest ` — (optional) incrementally add new file(s) to an existing knowledge graph. + +## Phase 0: Pre-flight + +1. Determine the target directory: + - If a path argument is provided, use it + - Otherwise use the current working directory +2. Create `.understand-anything/` and `.understand-anything/intermediate/` directories if they don't exist +3. If `--ingest` flag is present: + - Verify `.understand-anything/knowledge-graph.json` exists (error if not — must run full scan first) + - Read the existing graph + - Continue with the Incremental Mode (--ingest) steps below, which skip format detection and process only the new/changed files +4. 
Get the current git commit hash (if in a git repo, otherwise use "no-git") + +## Phase 1: SCAN + +Dispatch the **knowledge-scanner** agent: + +```json +{ + "targetDir": "" +} +``` + +Wait for the agent to write `.understand-anything/intermediate/knowledge-manifest.json`. + +Report: "Scanned {totalFiles} markdown files." + +## Phase 2: FORMAT DETECTION + +Dispatch the **format-detector** agent. + +Wait for `.understand-anything/intermediate/format-detection.json`. + +Report: "Detected format: {format} (confidence: {confidence})" + +## Phase 3: ANALYZE + +Read the format detection result. Load the corresponding format guide: + +- `obsidian` → inject `skills/understand-knowledge/formats/obsidian.md` +- `logseq` → inject `skills/understand-knowledge/formats/logseq.md` +- `dendron` → inject `skills/understand-knowledge/formats/dendron.md` +- `foam` → inject `skills/understand-knowledge/formats/foam.md` +- `karpathy` → inject `skills/understand-knowledge/formats/karpathy.md` +- `zettelkasten` → inject `skills/understand-knowledge/formats/zettelkasten.md` +- `plain` → inject `skills/understand-knowledge/formats/plain.md` + +Batch the files from the manifest into groups of 15-25 files each. + +For each batch, dispatch an **article-analyzer** agent with: + +```json +{ + "projectRoot": "", + "batchFiles": [...], + "format": "", + "parsingHints": {...} +} +``` + +Inject the format guide content into each agent's context. + +Run up to 5 batches concurrently. + +Wait for all `article-batch-*.json` files. + +Report: "Analyzed {totalFiles} files across {batchCount} batches." + +## Phase 4: RELATIONSHIPS + +Dispatch the **relationship-builder** agent. + +Wait for `.understand-anything/intermediate/relationships.json`. + +Report: "Discovered {topicCount} topics, {implicitEdgeCount} implicit relationships." + +## Phase 5: ASSEMBLE + +Merge all intermediate results into a single knowledge graph: + +1. Read all `article-batch-*.json` files — collect all nodes and edges +2. 
Read `relationships.json` — merge in topic nodes, implicit edges, layers, and tour +3. Deduplicate nodes by ID (keep the most complete version) +4. Deduplicate edges by source+target+type +5. Assemble into `KnowledgeGraph` format: + +```json +{ + "version": "1.0", + "kind": "knowledge", + "project": { + "name": "", + "languages": [], + "frameworks": [], + "description": "Knowledge base analyzed from format", + "analyzedAt": "", + "gitCommitHash": "" + }, + "nodes": [...], + "edges": [...], + "layers": [...], + "tour": [...] +} +``` + +## Phase 6: REVIEW + +Dispatch the existing **graph-reviewer** agent to validate: +- All edge source/target IDs reference existing nodes +- No orphan nodes (nodes with zero edges) +- No duplicate node IDs +- All layers reference existing nodes +- Tour steps reference existing nodes + +Apply fixes from the reviewer. + +## Phase 7: SAVE + +1. Write `.understand-anything/knowledge-graph.json` +2. Write `.understand-anything/meta.json`: + ```json + { + "lastAnalyzedAt": "", + "gitCommitHash": "", + "version": "1.0", + "analyzedFiles": , + "knowledgeFormat": "" + } + ``` +3. Clean up `.understand-anything/intermediate/` directory +4. Report: "Knowledge graph saved with {nodeCount} nodes and {edgeCount} edges." + +## Phase 8: DASHBOARD + +Auto-trigger `/understand-dashboard` to launch the visualization. + +## Incremental Mode (--ingest) + +When `--ingest ` is specified: + +1. Read the existing `knowledge-graph.json` +2. Scan only the specified file(s) or folder +3. Skip format detection (reuse format from existing graph's metadata) +4. Run article-analyzer on only the new/changed files +5. Run relationship-builder on new nodes against the full existing graph +6. Merge new nodes/edges into the existing graph +7. Re-run graph-reviewer +8. 
Save updated graph +``` + +- [ ] **Step 2: Commit** + +```bash +git add understand-anything-plugin/skills/understand-knowledge/SKILL.md +git commit -m "feat(skill): add /understand-knowledge skill definition with 8-phase pipeline" +``` + +--- + +## Task 14: Build, Test & Verify End-to-End + +**Files:** +- All modified files + +- [ ] **Step 1: Build core package** + +Run: `pnpm --filter @understand-anything/core build` +Expected: Clean build, no errors + +- [ ] **Step 2: Run core tests** + +Run: `pnpm --filter @understand-anything/core test -- --run` +Expected: All tests pass, including new knowledge-schema tests + +- [ ] **Step 3: Build dashboard** + +Run: `pnpm --filter @understand-anything/dashboard build` +Expected: Clean build, no errors + +- [ ] **Step 4: Run lint** + +Run: `pnpm lint` +Expected: No lint errors + +- [ ] **Step 5: Verify skill is discoverable** + +Check that the skill file exists and has valid frontmatter: + +Run: `head -5 understand-anything-plugin/skills/understand-knowledge/SKILL.md` +Expected: Valid `---` delimited YAML with name, description, argument-hint + +- [ ] **Step 6: Verify all agents are present** + +Run: `ls understand-anything-plugin/agents/ | grep -E 'knowledge|format|article|relationship'` +Expected: `knowledge-scanner.md`, `format-detector.md`, `article-analyzer.md`, `relationship-builder.md` + +- [ ] **Step 7: Verify all format guides are present** + +Run: `ls understand-anything-plugin/skills/understand-knowledge/formats/` +Expected: `obsidian.md`, `logseq.md`, `dendron.md`, `foam.md`, `karpathy.md`, `zettelkasten.md`, `plain.md` + +- [ ] **Step 8: Final commit** + +```bash +git add -A +git commit -m "feat: complete /understand-knowledge implementation — knowledge base analysis skill" +``` diff --git a/docs/plans/2026-03-14-understand-anything-design.md b/docs/superpowers/specs/2026-03-14-understand-anything-design.md similarity index 100% rename from docs/plans/2026-03-14-understand-anything-design.md rename to 
docs/superpowers/specs/2026-03-14-understand-anything-design.md diff --git a/docs/plans/2026-03-15-homepage-design.md b/docs/superpowers/specs/2026-03-15-homepage-design.md similarity index 100% rename from docs/plans/2026-03-15-homepage-design.md rename to docs/superpowers/specs/2026-03-15-homepage-design.md diff --git a/docs/plans/2026-03-18-multi-platform-simple-design.md b/docs/superpowers/specs/2026-03-18-multi-platform-simple-design.md similarity index 100% rename from docs/plans/2026-03-18-multi-platform-simple-design.md rename to docs/superpowers/specs/2026-03-18-multi-platform-simple-design.md diff --git a/docs/plans/2026-03-21-language-agnostic-design.md b/docs/superpowers/specs/2026-03-21-language-agnostic-design.md similarity index 100% rename from docs/plans/2026-03-21-language-agnostic-design.md rename to docs/superpowers/specs/2026-03-21-language-agnostic-design.md diff --git a/docs/plans/2026-03-26-theme-system-design.md b/docs/superpowers/specs/2026-03-26-theme-system-design.md similarity index 100% rename from docs/plans/2026-03-26-theme-system-design.md rename to docs/superpowers/specs/2026-03-26-theme-system-design.md diff --git a/docs/plans/2026-03-27-token-reduction-design.md b/docs/superpowers/specs/2026-03-27-token-reduction-design.md similarity index 100% rename from docs/plans/2026-03-27-token-reduction-design.md rename to docs/superpowers/specs/2026-03-27-token-reduction-design.md diff --git a/docs/plans/2026-03-28-understand-anything-extension-design.md b/docs/superpowers/specs/2026-03-28-understand-anything-extension-design.md similarity index 100% rename from docs/plans/2026-03-28-understand-anything-extension-design.md rename to docs/superpowers/specs/2026-03-28-understand-anything-extension-design.md diff --git a/docs/plans/2026-03-29-homepage-update-design.md b/docs/superpowers/specs/2026-03-29-homepage-update-design.md similarity index 100% rename from docs/plans/2026-03-29-homepage-update-design.md rename to 
docs/superpowers/specs/2026-03-29-homepage-update-design.md diff --git a/docs/plans/2026-04-01-business-domain-knowledge-design.md b/docs/superpowers/specs/2026-04-01-business-domain-knowledge-design.md similarity index 100% rename from docs/plans/2026-04-01-business-domain-knowledge-design.md rename to docs/superpowers/specs/2026-04-01-business-domain-knowledge-design.md diff --git a/docs/superpowers/specs/2026-04-09-understand-knowledge-design.md b/docs/superpowers/specs/2026-04-09-understand-knowledge-design.md new file mode 100644 index 0000000..2c63252 --- /dev/null +++ b/docs/superpowers/specs/2026-04-09-understand-knowledge-design.md @@ -0,0 +1,335 @@ +# /understand-knowledge — Personal Knowledge Base Plugin Design + +## Overview + +A new `/understand-knowledge` skill within the existing Understand Anything plugin that takes any folder of markdown notes and produces an interactive knowledge graph visualized in the existing dashboard. + +Inspired by Andrej Karpathy's LLM Wiki pattern — where an LLM compiles and maintains a structured wiki from raw sources — this plugin goes further by adding typed relationship discovery and interactive graph visualization that tools like Obsidian and Logseq cannot provide. + +### Goals + +- Accept any markdown-based knowledge base (Obsidian vault, Logseq graph, Dendron workspace, Foam, Karpathy-style LLM wiki, Zettelkasten, or plain markdown) +- Auto-detect the format and adapt parsing accordingly +- Use LLM analysis to discover implicit relationships beyond explicit links +- Produce a knowledge graph with typed nodes and edges +- Visualize in the existing dashboard with knowledge-specific layout, sidebar, and reading mode + +### Non-Goals + +- Real-time sync with the knowledge base tool (Obsidian, Logseq, etc.) 
+- Replacing the user's existing PKM tool — this is a visualization/analysis layer on top +- Supporting non-markdown formats (PDFs, bookmarks) in v1 + +--- + +## Schema Extensions + +### New Node Types (5) + +Added to the existing `NodeType` union (currently 16 types): + +```typescript +export type NodeType = + // existing (16) + | "file" | "function" | "class" | "module" | "concept" + | "config" | "document" | "service" | "table" | "endpoint" + | "pipeline" | "schema" | "resource" + | "domain" | "flow" | "step" + // knowledge (5 new → 21 total) + | "article" | "entity" | "topic" | "claim" | "source"; +``` + +| Type | What it represents | Example | +|------|-------------------|---------| +| `article` | A wiki/note page — the primary content unit | "LLM Knowledge Bases.md" | +| `entity` | A named thing: person, tool, paper, org, project | "Andrej Karpathy", "Obsidian" | +| `topic` | A thematic cluster grouping related articles | "Personal Knowledge Management" | +| `claim` | A specific assertion, insight, or takeaway | "RAG loses context at chunk boundaries" | +| `source` | Raw/reference material that articles are compiled from | A paper URL, a raw PDF reference | + +### New Edge Types (6) + +Added to the existing `EdgeType` union (currently 29 types): + +```typescript +export type EdgeType = + // existing (29) + | ... 
+ // knowledge (6 new → 35 total) + | "cites" | "contradicts" | "builds_on" + | "exemplifies" | "categorized_under" | "authored_by"; +``` + +| Type | Direction | Meaning | +|------|-----------|---------| +| `cites` | article → source | References or draws from | +| `contradicts` | claim → claim | Conflicts or disagrees with | +| `builds_on` | article → article | Extends, refines, or deepens | +| `exemplifies` | entity → concept/topic | Is a concrete example of | +| `categorized_under` | article/entity → topic | Belongs to this theme | +| `authored_by` | article → entity | Written or created by | + +### New Metadata Interface + +```typescript +export interface KnowledgeMeta { + format?: "obsidian" | "logseq" | "dendron" | "foam" | "karpathy" | "zettelkasten" | "plain"; + wikilinks?: string[]; + backlinks?: string[]; + frontmatter?: Record; + sourceUrl?: string; + confidence?: number; // 0-1, for LLM-inferred relationships +} +``` + +Added as an optional field on `GraphNode`: + +```typescript +export interface GraphNode { + // ...existing fields + knowledgeMeta?: KnowledgeMeta; +} +``` + +### Graph-Level Kind Flag + +```typescript +export interface KnowledgeGraph { + version: string; + kind: "codebase" | "knowledge"; // NEW + project: ProjectMeta; + nodes: GraphNode[]; + edges: GraphEdge[]; + layers: Layer[]; + tour: TourStep[]; +} +``` + +The `kind` field tells the dashboard which layout, sidebar, and visual styling to use. For backward compatibility, graphs without a `kind` field default to `"codebase"`. + +--- + +## Format Detection & Format Guides + +### Auto-Detection Logic + +Scans the target directory for signature files/patterns. 
Priority order (first match wins): + +| Priority | Signal | Detected Format | +|----------|--------|----------------| +| 1 | `.obsidian/` directory | Obsidian | +| 2 | `logseq/` + `pages/` directories | Logseq | +| 3 | `.dendron.yml` or `*.schema.yml` | Dendron | +| 4 | `.foam/` or `.vscode/foam.json` | Foam | +| 5 | `raw/` + `wiki/` + `index.md` | Karpathy | +| 6 | `[[wikilinks]]` + unique ID prefixes in filenames | Zettelkasten | +| 7 | Fallback | Plain markdown | + +### Format Guides + +Located at `skills/understand-knowledge/formats/`. Each guide tells the LLM agents how to parse that format: + +``` +skills/understand-knowledge/ + SKILL.md + formats/ + obsidian.md — [[wikilinks]], [[note|alias]], [[note#heading]], + #tags, YAML frontmatter, .obsidian/ config, + dataview annotations, canvas files + logseq.md — block-based outliner, ((block-refs)), + journals/YYYY_MM_DD.md, pages/, + property:: value syntax, TODO/DONE states + dendron.md — dot-delimited hierarchy (a.b.c.md), + .schema.yml for structure validation, + cross-vault links, refactoring rules + foam.md — [[wikilinks]] + link reference definitions + at file bottom, .foam/config, placeholder links + karpathy.md — raw/ → wiki/ pipeline, index.md master map, + log.md append-only record, _meta/ state, + LLM-maintained cross-references + zettelkasten.md — atomic notes, unique ID prefixes (timestamps), + typed semantic links, one idea per note + plain.md — standard [markdown](links), folder hierarchy, + heading structure, no special conventions +``` + +Each format guide covers: +- How to parse links (wikilinks vs standard vs block refs) +- Where metadata lives (frontmatter vs inline properties vs block properties) +- What the folder structure means (journals/ = daily notes, pages/ = permanent notes) +- What conventions to respect vs what to infer + +### Format Guide Authoring Process + +Format guides must be research-backed. During implementation, the agent building each format guide must: +1. 
Read the official documentation for that format (Obsidian Help, Logseq docs, Dendron wiki, Foam docs, etc.) +2. Study real-world examples of that format's structure +3. Write the guide based on verified behavior, not assumptions + +--- + +## Agent Pipeline + +``` +knowledge-scanner → format-detector → article-analyzer → relationship-builder → graph-reviewer +``` + +### Agent Definitions + +| Agent | Input | Output | Model | +|-------|-------|--------|-------| +| `knowledge-scanner` | Target directory path | File manifest: all `.md` files with paths, sizes, first 20 lines preview | `inherit` | +| `format-detector` | File manifest + directory structure | Detected format + format-specific parsing hints | `inherit` | +| `article-analyzer` | Individual `.md` file + format guide | Per-file nodes (article, entities, claims) + explicit edges (wikilinks, tags) | `inherit` | +| `relationship-builder` | All per-file results | Cross-file implicit edges (builds_on, contradicts, categorized_under) + topic clustering + layers | `inherit` | +| `graph-reviewer` | Assembled graph | Validated graph — deduped entities, consistent edge weights, orphan detection | `inherit` | + +### Key Differences from Codebase Pipeline + +- **No tree-sitter** — markdown parsing is simpler, mostly regex + LLM interpretation +- **format-detector** replaces framework detection — picks the right format guide +- **article-analyzer** replaces file-analyzer — extracts knowledge concepts instead of code structure +- **relationship-builder** is the heavy LLM step — discovers implicit connections across files that explicit links miss +- **graph-reviewer** stays similar — validates the assembled graph for consistency + +### Intermediate Files + +Same pattern as codebase analysis: + +``` +.understand-anything/intermediate/ + knowledge-manifest.json — scanner output + format-detection.json — detected format + hints + article-*.json — per-file analysis + relationships.json — cross-file edges + knowledge-graph.json 
— final assembled graph +``` + +Intermediate files are cleaned up after graph assembly (same as codebase flow). + +### Incremental Mode (`--ingest`) + +When the user runs `/understand-knowledge --ingest path/to/new-source.md`: + +1. **knowledge-scanner** — runs on just the new file(s) +2. **format-detector** — skipped (format already known from initial scan) +3. **article-analyzer** — processes only new/changed files +4. **relationship-builder** — runs on new nodes against the existing graph, finds connections to what's already there +5. **graph-reviewer** — validates the merged result + +Existing nodes are preserved; only new nodes/edges are added or updated. + +--- + +## Dashboard Changes + +All changes are scoped to graphs with `"kind": "knowledge"`. + +### Vertical Flow Layout + +- Default to top-down vertical layout (like existing domain/business flow view) +- Topics at top → articles in middle → entities/claims/sources at bottom +- Reads like a knowledge hierarchy: broad themes flow down into specifics +- User can still switch to horizontal or force-directed layout via controls + +### Knowledge Sidebar + +Replaces NodeInfo when a knowledge graph is loaded: + +| Selection | Sidebar Shows | +|-----------|---------------| +| Nothing selected | ProjectOverview: format detected, total articles/entities/topics/claims/sources | +| Article node | Title, summary, tags, frontmatter metadata, backlinks list (clickable), outgoing links, related topics | +| Entity node | Name, type (person/tool/paper/org), articles that mention it, relationships to other entities | +| Topic node | Description, child articles, child entities, cross-topic connections | +| Claim node | Assertion text, supporting articles, contradicting claims (if any), confidence score | +| Source node | Original URL/path, articles that cite it, ingestion date | + +### Reading Mode + +- Clicking an article node triggers a reading panel that slides up from the bottom (same pattern as current code viewer 
overlay) +- Shows the full compiled markdown rendered as HTML +- Includes a mini backlinks sidebar within the panel +- Clicking a `[[wikilink]]` or entity reference in the reading panel navigates the graph to that node + +### Node Visual Styling + +| Node Type | Shape | Color Accent | +|-----------|-------|-------------| +| `article` | Rounded rectangle | Warm amber | +| `entity` | Circle | Soft blue | +| `topic` | Large rounded rectangle | Muted gold | +| `claim` | Diamond | Green/red depending on contradictions | +| `source` | Small square | Gray | + +### Edge Visual Styling + +| Edge Type | Style | +|-----------|-------| +| `cites` | Dashed line | +| `contradicts` | Red line | +| `builds_on` | Solid with arrow | +| `categorized_under` | Thin gray | +| `authored_by` | Dotted blue | +| `exemplifies` | Dotted green | + +--- + +## Skill Interface + +### Usage + +```bash +# Full scan — first time or rescan +/understand-knowledge + +# Point at a specific directory +/understand-knowledge path/to/my-notes + +# Incremental ingest — add new sources to existing graph +/understand-knowledge --ingest path/to/new-note.md +/understand-knowledge --ingest path/to/new-folder/ +``` + +### Behavior + +1. Auto-detects format (Obsidian, Logseq, Karpathy, etc.) +2. Announces: "Detected Obsidian vault with 342 notes. Scanning..." +3. Runs the agent pipeline (scanner → detector → analyzer → relationship-builder → reviewer) +4. Writes `knowledge-graph.json` to `.understand-anything/` with `"kind": "knowledge"` +5. 
Auto-triggers `/understand-dashboard` after completion + +### File Structure + +``` +skills/understand-knowledge/ + SKILL.md — skill entry point, orchestration logic + formats/ + obsidian.md + logseq.md + dendron.md + foam.md + karpathy.md + zettelkasten.md + plain.md +``` + +### Coexistence with `/understand` + +- `/understand` produces `"kind": "codebase"` graphs +- `/understand-knowledge` produces `"kind": "knowledge"` graphs +- Both write to `.understand-anything/knowledge-graph.json` +- Running one replaces the other +- To scope knowledge analysis to a subdirectory (e.g., `docs/` within a code repo), use `/understand-knowledge path/to/docs` + +--- + +## What This Enables That Nothing Else Does + +| Existing Tools | Limitation | Our Advantage | +|---------------|-----------|---------------| +| Obsidian graph view | Untyped edges — all links look the same | Typed edges: cites, contradicts, builds_on | +| Logseq graph | Only shows explicit links | LLM discovers implicit relationships | +| All PKM tools | Single-format only | Cross-format support with auto-detection | +| Karpathy LLM Wiki | Flat text wiki, no visualization | Interactive graph dashboard with guided tours | +| None | No knowledge graph tours | Tour mode walks through a knowledge base step by step | diff --git a/understand-anything-plugin/agents/article-analyzer.md b/understand-anything-plugin/agents/article-analyzer.md new file mode 100644 index 0000000..3090a91 --- /dev/null +++ b/understand-anything-plugin/agents/article-analyzer.md @@ -0,0 +1,175 @@ +--- +name: article-analyzer +description: Analyzes individual markdown files to extract knowledge nodes and explicit edges +model: inherit +--- + +# Article Analyzer + +You are a knowledge extraction specialist. Your job is to read markdown files from a personal knowledge base and extract structured knowledge graph data: nodes representing articles, entities, claims, and sources, plus edges representing explicit relationships between them. 
Precision matters -- every node and edge must be grounded in actual file content. + +## Task + +Process a batch of markdown files and extract structured knowledge graph nodes and edges. You will read each file's full content, identify key entities, claims, and sources, and map explicit relationships. + +--- + +## Input + +You receive a JSON object in the prompt with: +- `projectRoot` (string): Absolute path to the knowledge base root directory +- `batchFiles` (array of strings): Relative paths to the markdown files in this batch +- `format` (string): Detected PKM format (e.g., `obsidian`, `logseq`, `plain`) +- `parsingHints` (object): Format-specific parsing guidance with fields: `linkStyle`, `metadataLocation`, `folderSemantics`, `specialFiles`, `tagSyntax` +- `batchIndex` (integer): The batch number (used for output filename) + +## Node Extraction + +For each markdown file in `batchFiles`, read its full content and extract the following node types: + +### Article Nodes + +One per file. These represent the markdown file itself. + +| Field | Value | +|---|---| +| `id` | `article:<relative-path-without-extension>` (e.g., `article:notes/machine-learning`) | +| `type` | `article` | +| `name` | Extracted from: (1) first `# heading` in the file, (2) `title` field in YAML frontmatter, (3) filename without extension | +| `summary` | 2-3 sentence summary of the file's main content and purpose | +| `filePath` | Relative path to the file (e.g., `notes/machine-learning.md`) | +| `tags` | Array of tags extracted from frontmatter `tags` field, inline `#tag` syntax, or empty array | +| `complexity` | `simple` if < 100 lines, `moderate` if 100-300, `complex` if > 300 | +| `knowledgeMeta` | `{ nodeType: "article" }` | + +### Entity Nodes + +Named things referenced in the file: people, tools, software, papers, organizations, concepts. Only extract entities that are significant to the article's content (mentioned substantively, not just in passing). 
+ +| Field | Value | +|---|---| +| `id` | `entity:<normalized-name>` (e.g., `entity:transformer-architecture`) | +| `type` | `entity` | +| `name` | The entity's display name as it appears in context | +| `summary` | 1 sentence describing what this entity is, based on context in the file | +| `tags` | Array with the entity category: `["person"]`, `["tool"]`, `["paper"]`, `["organization"]`, `["concept"]`, etc. | +| `complexity` | `simple` | +| `knowledgeMeta` | `{ nodeType: "entity", entityCategory: "<category>" }` | + +### Claim Nodes + +Significant assertions, arguments, or conclusions made in the article. Only extract claims that represent a notable stance, finding, or argument -- not every sentence. + +| Field | Value | +|---|---| +| `id` | `claim:<article-path>:<claim-slug>` (e.g., `claim:notes/ml:attention-is-key`) | +| `type` | `claim` | +| `name` | Short label for the claim (5-10 words) | +| `summary` | The full claim as stated or paraphrased from the article | +| `tags` | `["claim"]` | +| `complexity` | `simple` | +| `knowledgeMeta` | `{ nodeType: "claim" }` | + +### Source Nodes + +External references: URLs, papers, books, or other cited works. + +| Field | Value | +|---|---| +| `id` | `source:<normalized-title-or-id>` (e.g., `source:arxiv-1706-03762` or `source:designing-data-intensive-applications`) | +| `type` | `source` | +| `name` | The source's title or URL | +| `summary` | 1 sentence describing the source based on how it's referenced | +| `tags` | `["source"]` | +| `complexity` | `simple` | +| `knowledgeMeta` | `{ nodeType: "source", sourceUrl: "<url>" }` | + +### Node ID Conventions + +All node IDs must follow these rules: +- Lowercase only +- Use hyphens `-` for spaces +- Remove special characters (parentheses, quotes, colons, etc.) 
+- Use forward slashes `/` for path separators in article and claim IDs +- Examples: `entity:andrej-karpathy`, `article:notes/deep-learning`, `claim:notes/ml:transformers-outperform-rnns` + +## Edge Extraction + +Extract **explicit** relationships found directly in the file content: + +| Relationship source | Edge type | Weight | +|---|---|---| +| `[[wikilink]]` to another article | `related` | 0.5 | +| Frontmatter `category` or `parent` references | `categorized_under` | 0.7 | +| Citation or reference to a source | `cites` | 0.8 | +| Explicit author attribution | `authored_by` | 0.9 | +| Article contains a claim | `contains` | 1.0 | +| Article mentions an entity | `contains` | 1.0 | + +Each edge has: +- `source` (string): Source node ID +- `target` (string): Target node ID +- `type` (string): One of `related`, `categorized_under`, `cites`, `authored_by`, `contains` +- `description` (string): Human-readable label (e.g., `"links to"`, `"cites"`, `"authored by"`) +- `weight` (number): As specified in the table above +- `knowledgeMeta` (object): `{ edgeKind: "explicit" }` + +## Deduplication + +Deduplicate entity and source nodes **within this batch**: +- If two files reference the same entity (same normalized name), merge them into a single node. Combine summaries and union tags. +- If two files cite the same source (same URL or normalized title), merge into a single node. 
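
Within-batch deduplication depends on every file producing the same ID for the same entity, so the Node ID Conventions above carry most of the weight. A minimal sketch of that normalization follows. The helper names (`slugify`, `articleId`, `entityId`, `claimId`) are hypothetical and not part of the plugin, and stripping every character outside `a-z`, `0-9`, `-`, and `/` is an assumed reading of "remove special characters".

```typescript
// Hypothetical helpers: `slugify`, `articleId`, `entityId`, and `claimId`
// are illustrative names, not functions from the plugin.

// Lowercase, turn whitespace into hyphens, and drop special characters.
// Forward slashes survive only when `keepSlashes` is true (article and claim paths).
function slugify(raw: string, keepSlashes: boolean = false): string {
  const disallowed = keepSlashes ? /[^a-z0-9\/-]/g : /[^a-z0-9-]/g;
  return raw
    .trim()
    .toLowerCase()
    .replace(/\\/g, "/") // normalize Windows separators
    .replace(/\s+/g, "-") // spaces -> hyphens
    .replace(disallowed, "") // strip parentheses, quotes, colons, dots, ...
    .replace(/-+/g, "-") // collapse repeated hyphens
    .replace(/^-+|-+$/g, ""); // trim leading/trailing hyphens
}

// article:<relative path without the .md extension>
function articleId(filePath: string): string {
  return "article:" + slugify(filePath.replace(/\.md$/i, ""), true);
}

// entity:<normalized display name>
function entityId(name: string): string {
  return "entity:" + slugify(name);
}

// claim:<article path>:<short claim label>
function claimId(filePath: string, label: string): string {
  return "claim:" + slugify(filePath.replace(/\.md$/i, ""), true) + ":" + slugify(label);
}
```

With IDs built this way, two files that mention "Andrej Karpathy" and "andrej karpathy" both yield `entity:andrej-karpathy`, and that shared ID is what lets the merge step treat them as one node.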
+ +## Output + +Write the batch results to: `<projectRoot>/.understand-anything/intermediate/article-batch-<batchIndex>.json` + +The JSON must have this exact structure: + +```json +{ + "nodes": [ + { + "id": "article:notes/machine-learning", + "type": "article", + "name": "Machine Learning Overview", + "summary": "An introduction to core ML concepts...", + "filePath": "notes/machine-learning.md", + "tags": ["ml", "overview"], + "complexity": "moderate", + "knowledgeMeta": { "nodeType": "article" } + }, + { + "id": "entity:transformer-architecture", + "type": "entity", + "name": "Transformer Architecture", + "summary": "A neural network architecture based on self-attention...", + "tags": ["concept"], + "complexity": "simple", + "knowledgeMeta": { "nodeType": "entity", "entityCategory": "concept" } + } + ], + "edges": [ + { + "source": "article:notes/machine-learning", + "target": "entity:transformer-architecture", + "type": "contains", + "description": "discusses", + "weight": 1.0, + "knowledgeMeta": { "edgeKind": "explicit" } + } + ] +} +``` + +## Critical Constraints + +- ALWAYS read the actual file content before extracting nodes and edges. Never fabricate content. +- NEVER create nodes for entities that are only mentioned in passing (e.g., common words, generic references). +- ALWAYS use the node ID conventions specified above. +- ALWAYS deduplicate entities and sources within the batch. +- Limit claim extraction to genuinely significant assertions (typically 0-3 per article). +- Limit entity extraction to substantively discussed entities (typically 2-8 per article). +- Respond with ONLY a brief text summary: number of files processed, total nodes extracted (by type), total edges extracted. + +Do NOT include the full JSON in your text response. 
diff --git a/understand-anything-plugin/agents/format-detector.md b/understand-anything-plugin/agents/format-detector.md new file mode 100644 index 0000000..0b8c576 --- /dev/null +++ b/understand-anything-plugin/agents/format-detector.md @@ -0,0 +1,92 @@ +--- +name: format-detector +description: Detects the knowledge base format from directory signatures and file samples +model: inherit +--- + +# Format Detector + +You are a knowledge base format identification specialist. Your job is to read a scanned manifest of markdown files, analyze the directory signatures and file samples, and determine which Personal Knowledge Management (PKM) format is being used. Your detection must be precise and based on concrete evidence. + +## Task + +Read the knowledge manifest produced by the knowledge-scanner agent and determine the knowledge base format. Output a format detection result with parsing hints for downstream agents. + +--- + +## Input + +Read the manifest from: `<projectRoot>/.understand-anything/intermediate/knowledge-manifest.json` + +The `projectRoot` is provided in your dispatch prompt. + +## Detection Logic + +Apply the following detection rules in priority order. **First match wins:** + +| Priority | Condition | Format | Confidence | +|---|---|---|---| +| 1 | `hasObsidianDir` is true | `obsidian` | 0.95 | +| 2 | `hasLogseqDir` is true | `logseq` | 0.95 | +| 3 | `hasDendronConfig` is true | `dendron` | 0.95 | +| 4 | `hasFoamConfig` is true | `foam` | 0.90 | +| 5 | `hasKarpathyStructure` is true | `karpathy` | 0.85 | +| 6 | `hasWikilinks` is true AND `hasUniqueIdPrefixes` is true | `zettelkasten` | 0.70 | +| 7 | None of the above match | `plain` | 0.50 | + +**Confidence adjustments:** +- If the primary signal is present AND additional supporting signals exist (e.g., Obsidian dir + wikilinks), increase confidence by 0.05 (max 1.0). +- If the primary signal is present but the file count is very low (< 5 files), decrease confidence by 0.10. 
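
The priority table and the two confidence adjustments above can be read as a small pure function. The sketch below is one possible interpretation, not the plugin's implementation: `detectFormat` and `RULES` are hypothetical names, and it assumes that `hasWikilinks` is the only "supporting signal" (except for `zettelkasten`, where wikilinks are already part of the primary condition) and that the adjustments do not apply to the `plain` fallback, which has no primary signal.

```typescript
type Format =
  | "obsidian" | "logseq" | "dendron" | "foam"
  | "karpathy" | "zettelkasten" | "plain";

interface DirectorySignatures {
  hasObsidianDir: boolean;
  hasLogseqDir: boolean;
  hasDendronConfig: boolean;
  hasFoamConfig: boolean;
  hasKarpathyStructure: boolean;
  hasWikilinks: boolean;
  hasUniqueIdPrefixes: boolean;
}

interface Rule {
  format: Format;
  base: number; // base confidence from the table
  test: (s: DirectorySignatures) => boolean;
}

// Rules in priority order; the first rule whose test passes wins.
const RULES: Rule[] = [
  { format: "obsidian", base: 0.95, test: (s) => s.hasObsidianDir },
  { format: "logseq", base: 0.95, test: (s) => s.hasLogseqDir },
  { format: "dendron", base: 0.95, test: (s) => s.hasDendronConfig },
  { format: "foam", base: 0.9, test: (s) => s.hasFoamConfig },
  { format: "karpathy", base: 0.85, test: (s) => s.hasKarpathyStructure },
  { format: "zettelkasten", base: 0.7, test: (s) => s.hasWikilinks && s.hasUniqueIdPrefixes },
];

function detectFormat(
  s: DirectorySignatures,
  fileCount: number
): { format: Format; confidence: number } {
  let matched: Rule | null = null;
  for (const rule of RULES) {
    if (rule.test(s)) {
      matched = rule;
      break;
    }
  }
  // Fallback: no primary signal, so no adjustments are applied (assumption).
  if (matched === null) return { format: "plain", confidence: 0.5 };

  let confidence = matched.base;
  // Assumption: wikilinks are the supporting signal for every format
  // except zettelkasten, whose primary condition already requires them.
  if (matched.format !== "zettelkasten" && s.hasWikilinks) confidence += 0.05;
  if (fileCount < 5) confidence -= 0.1;

  // Clamp to [0, 1] and round to 2 decimal places.
  confidence = Math.min(1, Math.max(0, confidence));
  return { format: matched.format, confidence: Math.round(confidence * 100) / 100 };
}
```

Keeping the rules in an ordered array makes "first match wins" explicit: a vault that contains both `.obsidian/` and `logseq/` plus `pages/` resolves to `obsidian` simply because that rule is listed first.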
+ +## Parsing Hints + +Based on the detected format, set these parsing hints: + +| Format | linkStyle | metadataLocation | folderSemantics | specialFiles | tagSyntax | +|---|---|---|---|---|---| +| `obsidian` | `wikilink` | `yaml-frontmatter` | `vault-folders-as-categories` | `["_templates/", "_attachments/"]` | `#tag or yaml tags` | +| `logseq` | `wikilink` | `page-properties` | `pages-and-journals` | `["pages/", "journals/", "logseq/"]` | `#tag` | +| `dendron` | `wikilink` | `yaml-frontmatter` | `dot-separated-hierarchy` | `["*.schema.yml"]` | `yaml tags` | +| `foam` | `wikilink` | `yaml-frontmatter` | `flat-or-folders` | `[".foam/"]` | `#tag or yaml tags` | +| `karpathy` | `markdown` | `none-or-minimal` | `raw-wiki-split` | `["raw/", "wiki/", "index.md"]` | `none` | +| `zettelkasten` | `wikilink` | `yaml-frontmatter` | `flat-with-id-prefixes` | `[]` | `#tag or yaml tags` | +| `plain` | `markdown` | `yaml-frontmatter-if-present` | `folder-based` | `[]` | `#tag if present` | + +## Output + +Write the detection result to: `<projectRoot>/.understand-anything/intermediate/format-detection.json` + +The JSON must have this exact structure: + +```json +{ + "format": "obsidian", + "confidence": 0.95, + "parsingHints": { + "linkStyle": "wikilink", + "metadataLocation": "yaml-frontmatter", + "folderSemantics": "vault-folders-as-categories", + "specialFiles": ["_templates/", "_attachments/"], + "tagSyntax": "#tag or yaml tags" + } +} +``` + +**Field requirements:** +- `format` (string): One of `obsidian`, `logseq`, `dendron`, `foam`, `karpathy`, `zettelkasten`, `plain` +- `confidence` (number): Between 0 and 1, rounded to 2 decimal places +- `parsingHints` (object): All 5 fields must be present +- `parsingHints.linkStyle` (string): How links between notes are written +- `parsingHints.metadataLocation` (string): Where note metadata is stored +- `parsingHints.folderSemantics` (string): What folder structure means in this format +- `parsingHints.specialFiles` (string[]): Directories or 
file patterns with special meaning +- `parsingHints.tagSyntax` (string): How tags are written + +## Critical Constraints + +- ALWAYS apply detection rules in priority order. First match wins. +- NEVER guess the format without evidence from the directory signatures. +- ALWAYS include all 5 parsing hint fields. +- Respond with ONLY a brief text summary: detected format, confidence level, and the key signal that determined the format. + +Do NOT include the full JSON in your text response. diff --git a/understand-anything-plugin/agents/knowledge-scanner.md b/understand-anything-plugin/agents/knowledge-scanner.md new file mode 100644 index 0000000..13eff9d --- /dev/null +++ b/understand-anything-plugin/agents/knowledge-scanner.md @@ -0,0 +1,114 @@ +--- +name: knowledge-scanner +description: Scans a directory for markdown files and produces a file manifest for knowledge base analysis +model: inherit +--- + +# Knowledge Scanner + +You are a precise file discovery specialist for personal knowledge bases. Your job is to scan a directory, find all markdown files, detect directory structure signatures that indicate which PKM (Personal Knowledge Management) tool was used, and produce a structured manifest. Accuracy is paramount -- every file path you report must actually exist on disk. + +## Task + +Scan the target directory provided in the prompt and produce a JSON manifest of all markdown files found. You will also detect directory structure signatures that help downstream agents identify the knowledge base format. + +--- + +## Input + +You receive a JSON object in the prompt with: +- `targetDir` (string): Absolute path to the directory to scan. + +## Step 1 -- File Discovery + +Use Glob or Bash to find all `.md` files recursively under `targetDir`. + +**Exclude** files in any of these directories: +- `.obsidian/` +- `logseq/` +- `.foam/` +- `_meta/` +- `node_modules/` +- `.git/` +- `.understand-anything/` + +These directories are excluded from the **file list** only. 
Their mere presence on disk is still relevant for directory signature detection (Step 2). + +Sort all discovered file paths alphabetically by their path relative to `targetDir`. + +## Step 2 -- Directory Signature Detection + +Check for the presence of these directory-level signals. Each is a boolean: + +| Signal | How to detect | +|---|---| +| `hasObsidianDir` | A `.obsidian/` directory exists directly under `targetDir` | +| `hasLogseqDir` | A `logseq/` directory AND a `pages/` directory exist under `targetDir` | +| `hasDendronConfig` | A `.dendron.yml` file exists under `targetDir` | +| `hasFoamConfig` | A `.foam/` directory exists under `targetDir` | +| `hasKarpathyStructure` | Both a `raw/` directory and a `wiki/` directory exist, and an `index.md` file exists under `targetDir` | +| `hasWikilinks` | At least one `[[wikilink]]` pattern is found in the preview text of any sampled file (check the first 20 lines of up to 20 files) | +| `hasUniqueIdPrefixes` | At least 3 filenames start with a numeric unique ID prefix (e.g., `202301011200 My Note.md`, `20230101-topic.md`, or similar Zettelkasten-style IDs of 8+ digits) | + +## Step 3 -- File Metadata Collection + +For each discovered markdown file, collect: +- `path` (string): Path relative to `targetDir` +- `sizeLines` (integer): Total line count of the file +- `preview` (string): The first 20 lines of the file, joined by newlines + +**Do NOT read full file contents beyond the first 20 lines.** For efficiency, batch file reads where possible. 
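
Two of the Step 2 signals, `hasWikilinks` and `hasUniqueIdPrefixes`, are derived from the collected file names and previews rather than from directory checks. The following is a rough sketch of how they might be computed from the Step 3 metadata. The function names mirror the signal names but are otherwise hypothetical, and sampling the first 20 files in sorted order is an assumption, since the text only says "up to 20 files".

```typescript
interface ManifestFile {
  path: string; // relative to targetDir, forward slashes
  sizeLines: number;
  preview: string; // first 20 lines of the file
}

// [[target]], [[target|alias]], or [[target#heading]] on a single line.
const WIKILINK = /\[\[[^\[\]\n]+\]\]/;
// Zettelkasten-style ID: the file name starts with 8 or more digits.
const ID_PREFIX = /^\d{8,}/;

// True if any sampled preview contains at least one wikilink.
function hasWikilinks(files: ManifestFile[], sampleSize: number = 20): boolean {
  return files.slice(0, sampleSize).some((f) => WIKILINK.test(f.preview));
}

// True if at least `minMatches` file names begin with a numeric ID prefix.
function hasUniqueIdPrefixes(files: ManifestFile[], minMatches: number = 3): boolean {
  const matches = files.filter((f) => {
    const fileName = f.path.split("/").pop() ?? "";
    return ID_PREFIX.test(fileName);
  });
  return matches.length >= minMatches;
}
```

Both checks read only `path` and `preview`, so they stay within the rule that nothing beyond the first 20 lines of a file is read.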
+ +## Output + +Create the output directory if needed: +```bash +mkdir -p <targetDir>/.understand-anything/intermediate +``` + +Write the manifest to: `<targetDir>/.understand-anything/intermediate/knowledge-manifest.json` + +The JSON must have this exact structure: + +```json +{ + "targetDir": "/absolute/path/to/dir", + "totalFiles": 142, + "directorySignatures": { + "hasObsidianDir": true, + "hasLogseqDir": false, + "hasDendronConfig": false, + "hasFoamConfig": false, + "hasKarpathyStructure": false, + "hasWikilinks": true, + "hasUniqueIdPrefixes": false + }, + "files": [ + { + "path": "folder/note.md", + "sizeLines": 85, + "preview": "# Note Title\n\nFirst paragraph of the note..." + } + ] +} +``` + +**Field requirements:** +- `targetDir` (string): The absolute path that was scanned (from input) +- `totalFiles` (integer): Must equal `files.length` +- `directorySignatures` (object): All 7 boolean fields must be present +- `files` (array): Every discovered `.md` file, sorted alphabetically by `path` +- `files[].path` (string): Relative to `targetDir`, using forward slashes +- `files[].sizeLines` (integer): Actual line count from disk +- `files[].preview` (string): First 20 lines of the file + +## Critical Constraints + +- NEVER invent or guess file paths. Every `path` in `files` must come from actual file discovery on disk. +- NEVER read full file contents beyond the first 20 lines. +- ALWAYS validate that `totalFiles` matches the actual length of the `files` array. +- ALWAYS sort `files` by `path` alphabetically for deterministic output. +- Do NOT include non-markdown files in the manifest. +- Respond with ONLY a brief text summary: target directory, total markdown files found, and which directory signatures were detected as true. + +Do NOT include the full JSON in your text response. 
diff --git a/understand-anything-plugin/agents/relationship-builder.md b/understand-anything-plugin/agents/relationship-builder.md new file mode 100644 index 0000000..3eea95b --- /dev/null +++ b/understand-anything-plugin/agents/relationship-builder.md @@ -0,0 +1,193 @@ +--- +name: relationship-builder +description: Discovers implicit cross-file relationships and builds topic clusters from analyzed knowledge nodes +model: inherit +--- + +# Relationship Builder + +You are a knowledge synthesis specialist. Your job is to analyze all extracted knowledge nodes across a knowledge base, discover implicit relationships between them, build thematic topic clusters, and create a guided tour. You work at the macro level -- connecting ideas across files that the article-level analysis could not see. + +## Task + +Read all article batch results, deduplicate entities globally, discover implicit cross-file relationships, build topic clusters with layers, and create a guided tour of the knowledge base. + +--- + +## Input + +Read all batch result files from: `<projectRoot>/.understand-anything/intermediate/article-batch-*.json` + +The `projectRoot` is provided in your dispatch prompt. + +Each batch file contains `{ nodes: [...], edges: [...] }` as produced by the article-analyzer agent. + +## Step 1 -- Global Entity Deduplication + +Merge all nodes from all batch files into a single list. Then deduplicate: + +- **Entity nodes**: If two entity nodes have the same `id`, merge them: + - Combine summaries (keep the more informative one, or merge if both add value) + - Union their `tags` arrays (deduplicate) + - Keep all other fields from the first occurrence +- **Source nodes**: Same merging logic as entities +- **Article and claim nodes**: These should already be unique (one per file/claim). If duplicates exist, keep the first occurrence. + +Also merge all edges from all batch files into a single list. Remove exact duplicate edges (same source, target, and type). 
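
The Step 1 merge rules are mechanical enough to sketch in code. The version below is illustrative only: `mergeBatches`, `KNode`, `KEdge`, and `Batch` are hypothetical names, and "keep the more informative summary" is approximated by keeping the longer one, a simplification an LLM agent would likely improve on.

```typescript
interface KNode {
  id: string;
  type: string;
  name: string;
  summary: string;
  tags: string[];
  [key: string]: unknown; // filePath, complexity, knowledgeMeta, ...
}

interface KEdge {
  source: string;
  target: string;
  type: string;
  [key: string]: unknown; // description, weight, knowledgeMeta, ...
}

interface Batch {
  nodes: KNode[];
  edges: KEdge[];
}

function mergeBatches(batches: Batch[]): Batch {
  const byId: Record<string, KNode> = {};
  const nodes: KNode[] = [];
  const seenEdges: Record<string, boolean> = {};
  const edges: KEdge[] = [];

  for (const batch of batches) {
    for (const node of batch.nodes) {
      const existing = byId[node.id];
      if (!existing) {
        // First occurrence: copy so the input batches are not mutated.
        const copy: KNode = { ...node, tags: node.tags.slice() };
        byId[node.id] = copy;
        nodes.push(copy);
      } else if (node.type === "entity" || node.type === "source") {
        // Assumption: the longer summary stands in for "more informative".
        if (node.summary.length > existing.summary.length) {
          existing.summary = node.summary;
        }
        // Union the tags without introducing duplicates.
        for (const tag of node.tags) {
          if (existing.tags.indexOf(tag) === -1) existing.tags.push(tag);
        }
      }
      // Duplicate article or claim nodes: the first occurrence is kept as-is.
    }

    for (const edge of batch.edges) {
      // Exact duplicates share source, target, and type.
      const key = edge.source + "\u0000" + edge.target + "\u0000" + edge.type;
      if (!seenEdges[key]) {
        seenEdges[key] = true;
        edges.push(edge);
      }
    }
  }

  return { nodes, edges };
}
```

Nodes and edges keep the order in which they were first seen, so processing the batch files in a fixed order gives reproducible output.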
+ +## Step 2 -- Implicit Edge Discovery + +Analyze the merged node set to discover relationships that span across files. These are relationships that no single article-analyzer batch could detect. + +Discover these types of implicit edges: + +| Edge type | What to look for | Weight range | +|---|---|---| +| `builds_on` | Article B extends, refines, or deepens ideas from Article A | 0.5-0.8 | +| `contradicts` | Two claims or articles present conflicting positions | 0.5-0.7 | +| `categorized_under` | Multiple articles share a common theme or topic | 0.5-0.7 | +| `exemplifies` | An article provides a concrete example of a concept discussed elsewhere | 0.5-0.7 | +| `related` | Articles share significant thematic overlap, common entities, or complementary perspectives | 0.4-0.6 | + +For each implicit edge: +- `source` (string): Source node ID +- `target` (string): Target node ID +- `type` (string): One of the types above +- `description` (string): Human-readable description of the relationship +- `weight` (number): Within the range specified above +- `knowledgeMeta` (object): `{ edgeKind: "implicit", confidence: <0-1> }` + +**Confidence scoring:** +- 0.8-1.0: Strong evidence (shared entities, explicit thematic overlap, direct conceptual extension) +- 0.6-0.8: Moderate evidence (shared tags, similar topics, related domains) +- 0.4-0.6: Weak evidence (loose thematic connection, tangential overlap) + +**Only add edges with confidence > 0.4.** Do NOT duplicate edges that already exist from the article-analyzer batches (same source, target, and type). + +## Step 3 -- Topic Cluster Building + +Identify thematic clusters of 3 or more articles that share a common theme. 
For each cluster: + +| Field | Value | +|---|---| +| `id` | `topic:<topic-slug>` (e.g., `topic:machine-learning`) | +| `type` | `topic` | +| `name` | A descriptive name for the topic cluster | +| `summary` | 1-2 sentence description of what this topic cluster covers | +| `tags` | `["topic"]` | +| `complexity` | `simple` | +| `knowledgeMeta` | `{ nodeType: "topic" }` | + +For each article in a topic cluster, add a `categorized_under` edge (using `type: "categorized_under"`) from the article to the topic node (if one does not already exist), with weight 0.7 and `knowledgeMeta: { edgeKind: "implicit", confidence: 0.7 }`. + +An article may belong to multiple topic clusters if it genuinely spans multiple themes. + +## Step 4 -- Layer Building + +Create one layer per topic cluster: + +```json +{ + "id": "layer-<topic-slug>", + "name": "<Topic Name>", + "description": "<1-2 sentence description of this layer's theme>", + "nodeIds": ["article:...", "entity:...", "claim:..."] +} +``` + +Each layer contains the IDs of all nodes that belong to that topic: the topic node itself, all articles categorized under it, and any entities/claims/sources that are directly connected to those articles. + +Create an additional `"Uncategorized"` layer containing any article nodes that were not assigned to any topic cluster, plus their directly connected entities/claims/sources. + +## Step 5 -- Tour Building + +Create a guided tour of the knowledge base with 5-10 steps. The tour should walk through the knowledge base in a logical order, helping a newcomer understand the major themes and key ideas. 
+ +Each tour step: + +```json +{ + "order": 1, + "title": "Step title", + "description": "2-3 sentences explaining why this article matters and what to look for.", + "nodeIds": ["article:some-article"] +} +``` + +Tour guidelines: +- Start with the most foundational or introductory article +- Progress from general to specific +- Cover all major topic clusters +- End with the most advanced or synthesizing article +- Each step should reference an article node (not entities or claims) + +## Output + +Write the results to: `<projectRoot>/.understand-anything/intermediate/relationships.json` + +The JSON must have this exact structure: + +```json +{ + "nodes": [ + { + "id": "topic:machine-learning", + "type": "topic", + "name": "Machine Learning", + "summary": "Articles exploring ML concepts, architectures, and applications.", + "tags": ["topic"], + "complexity": "simple", + "knowledgeMeta": { "nodeType": "topic" } + } + ], + "edges": [ + { + "source": "article:notes/transformers", + "target": "article:notes/attention", + "type": "builds_on", + "description": "extends attention mechanism concepts", + "weight": 0.7, + "knowledgeMeta": { "edgeKind": "implicit", "confidence": 0.75 } + } + ], + "layers": [ + { + "id": "layer-machine-learning", + "name": "Machine Learning", + "description": "Articles exploring ML concepts, architectures, and applications.", + "nodeIds": ["topic:machine-learning", "article:notes/ml", "entity:transformer-architecture"] + }, + { + "id": "layer-uncategorized", + "name": "Uncategorized", + "description": "Articles not assigned to any specific topic cluster.", + "nodeIds": ["article:misc/random-thoughts"] + } + ], + "tour": [ + { + "order": 1, + "title": "Starting with the Basics", + "description": "This article provides a foundational overview of machine learning concepts that many other notes build upon.", + "nodeIds": ["article:notes/intro-to-ml"] + } + ] +} +``` + +**Field requirements:** +- `nodes` (array): Only NEW nodes created in this step (topic nodes). 
Do NOT include article/entity/claim/source nodes already in batch files. +- `edges` (array): Only NEW implicit edges discovered in this step. Do NOT duplicate edges from article-analyzer batches. +- `layers` (array): One per topic cluster plus one "Uncategorized" layer. Every article node must appear in at least one layer. +- `tour` (array): 5-10 steps, each referencing an article node. + +## Critical Constraints + +- NEVER duplicate edges that already exist in the article-analyzer batch files. +- NEVER add implicit edges with confidence <= 0.4. +- ALWAYS deduplicate entities globally before discovering relationships. +- ALWAYS ensure every article node appears in at least one layer. +- Topic clusters must have at least 3 articles. Do not create trivial clusters. +- Respond with ONLY a brief text summary: number of topic clusters found, number of implicit edges discovered, number of tour steps, and any globally deduplicated entities. + +Do NOT include the full JSON in your text response. 
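The field requirements and critical constraints above can be checked mechanically before `relationships.json` is merged. The following is a minimal sketch, not part of this change: `checkRelationships`, the `Rel*` interfaces, and the assumption that article IDs always start with `article:` are hypothetical and inferred only from the example JSON. It does not cover edge deduplication against batch files or the minimum cluster size, since those need data not shown here.

```typescript
// Hypothetical shapes mirroring the example relationships.json above.
interface RelNode { id: string; type: string }
interface RelEdge {
  source: string;
  target: string;
  type: string;
  knowledgeMeta?: { edgeKind?: string; confidence?: number };
}
interface RelLayer { id: string; name: string; nodeIds: string[] }
interface RelTourStep { order: number; title: string; nodeIds: string[] }
interface Relationships {
  nodes: RelNode[];
  edges: RelEdge[];
  layers: RelLayer[];
  tour: RelTourStep[];
}

// Returns a list of human-readable problems; an empty list means all checks passed.
// `articleIds` is the set of article node IDs taken from the batch files (assumed input).
function checkRelationships(rel: Relationships, articleIds: string[]): string[] {
  const problems: string[] = [];

  // `nodes` may only contain the new topic nodes created in this step.
  for (const node of rel.nodes) {
    if (node.type !== "topic") problems.push(`non-topic node in output: ${node.id}`);
  }

  // Implicit edges with confidence <= 0.4 are not allowed.
  for (const edge of rel.edges) {
    const confidence = edge.knowledgeMeta?.confidence;
    if (confidence !== undefined && confidence <= 0.4) {
      problems.push(`low-confidence edge: ${edge.source} -> ${edge.target}`);
    }
  }

  // Every article node must appear in at least one layer.
  const layered = new Set<string>();
  for (const layer of rel.layers) {
    for (const id of layer.nodeIds) layered.add(id);
  }
  for (const id of articleIds) {
    if (!layered.has(id)) problems.push(`article missing from layers: ${id}`);
  }

  // The tour needs 5-10 steps, each referencing article nodes only.
  if (rel.tour.length < 5 || rel.tour.length > 10) {
    problems.push(`tour has ${rel.tour.length} steps, expected 5-10`);
  }
  for (const step of rel.tour) {
    if (step.nodeIds.some((id) => !id.startsWith("article:"))) {
      problems.push(`tour step ${step.order} references a non-article node`);
    }
  }

  return problems;
}
```

A check like this could run between the relationship-discovery step and the final graph assembly, so that a constraint violation is reported instead of silently reaching the dashboard.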
diff --git a/understand-anything-plugin/packages/core/src/__tests__/knowledge-schema.test.ts b/understand-anything-plugin/packages/core/src/__tests__/knowledge-schema.test.ts new file mode 100644 index 0000000..2d609be --- /dev/null +++ b/understand-anything-plugin/packages/core/src/__tests__/knowledge-schema.test.ts @@ -0,0 +1,201 @@ +import { describe, it, expect } from "vitest"; +import { validateGraph } from "../schema"; +import type { KnowledgeGraph } from "../types"; + +describe("knowledge graph schema validation", () => { + const minimalKnowledgeGraph: KnowledgeGraph = { + version: "1.0", + kind: "knowledge", + project: { + name: "Test KB", + languages: [], + frameworks: [], + description: "A test knowledge base", + analyzedAt: new Date().toISOString(), + gitCommitHash: "abc123", + }, + nodes: [ + { + id: "article:test-note", + type: "article", + name: "Test Note", + summary: "A test article node", + tags: ["test"], + complexity: "simple", + }, + { + id: "entity:karpathy", + type: "entity", + name: "Andrej Karpathy", + summary: "AI researcher", + tags: ["person", "ai"], + complexity: "simple", + }, + { + id: "topic:pkm", + type: "topic", + name: "Personal Knowledge Management", + summary: "Tools and methods for managing personal knowledge", + tags: ["knowledge", "productivity"], + complexity: "moderate", + }, + ], + edges: [ + { + source: "article:test-note", + target: "entity:karpathy", + type: "authored_by", + direction: "forward", + weight: 0.8, + }, + { + source: "article:test-note", + target: "topic:pkm", + type: "categorized_under", + direction: "forward", + weight: 0.7, + }, + ], + layers: [ + { + id: "layer:pkm", + name: "PKM", + description: "Personal Knowledge Management topic cluster", + nodeIds: ["article:test-note", "topic:pkm"], + }, + ], + tour: [], + }; + + it("validates a minimal knowledge graph", () => { + const result = validateGraph(minimalKnowledgeGraph); + const fatals = result.issues.filter((i) => i.level === "fatal"); + 
expect(fatals).toHaveLength(0); + }); + + it("preserves kind field through validation", () => { + const result = validateGraph(minimalKnowledgeGraph); + expect(result.data!.kind).toBe("knowledge"); + }); + + it("accepts all knowledge node types", () => { + const graph = { + ...minimalKnowledgeGraph, + nodes: [ + ...minimalKnowledgeGraph.nodes, + { id: "claim:rag-bad", type: "claim" as const, name: "RAG loses context", summary: "An assertion", tags: ["claim"], complexity: "simple" as const }, + { id: "source:paper1", type: "source" as const, name: "Attention paper", summary: "A source", tags: ["paper"], complexity: "simple" as const }, + ], + }; + const result = validateGraph(graph); + const fatals = result.issues.filter((i) => i.level === "fatal"); + expect(fatals).toHaveLength(0); + }); + + it("accepts all knowledge edge types", () => { + const graph = { + ...minimalKnowledgeGraph, + nodes: [ + ...minimalKnowledgeGraph.nodes, + { id: "claim:c1", type: "claim" as const, name: "Claim 1", summary: "c1", tags: [], complexity: "simple" as const }, + { id: "claim:c2", type: "claim" as const, name: "Claim 2", summary: "c2", tags: [], complexity: "simple" as const }, + { id: "source:s1", type: "source" as const, name: "Source 1", summary: "s1", tags: [], complexity: "simple" as const }, + { id: "article:a2", type: "article" as const, name: "Article 2", summary: "a2", tags: [], complexity: "simple" as const }, + ], + edges: [ + ...minimalKnowledgeGraph.edges, + { source: "article:test-note", target: "source:s1", type: "cites" as const, direction: "forward" as const, weight: 0.7 }, + { source: "claim:c1", target: "claim:c2", type: "contradicts" as const, direction: "forward" as const, weight: 0.6 }, + { source: "article:a2", target: "article:test-note", type: "builds_on" as const, direction: "forward" as const, weight: 0.7 }, + { source: "entity:karpathy", target: "topic:pkm", type: "exemplifies" as const, direction: "forward" as const, weight: 0.5 }, + ], + }; + const 
result = validateGraph(graph); + const fatals = result.issues.filter((i) => i.level === "fatal"); + expect(fatals).toHaveLength(0); + }); + + it("resolves knowledge node type aliases", () => { + const graph = { + ...minimalKnowledgeGraph, + nodes: [ + { id: "note:n1", type: "note" as any, name: "A Note", summary: "note alias", tags: [], complexity: "simple" }, + { id: "person:p1", type: "person" as any, name: "A Person", summary: "person alias", tags: [], complexity: "simple" }, + ], + edges: [], + layers: [], + }; + const result = validateGraph(graph); + const noteNode = result.data!.nodes.find((n) => n.id === "note:n1"); + const personNode = result.data!.nodes.find((n) => n.id === "person:p1"); + expect(noteNode?.type).toBe("article"); + expect(personNode?.type).toBe("entity"); + }); + + it("resolves knowledge edge type aliases", () => { + const graph = { + ...minimalKnowledgeGraph, + edges: [ + { source: "article:test-note", target: "entity:karpathy", type: "written_by" as any, direction: "forward", weight: 0.8 }, + ], + }; + const result = validateGraph(graph); + const edge = result.data!.edges.find((e) => e.source === "article:test-note" && e.target === "entity:karpathy"); + expect(edge?.type).toBe("authored_by"); + }); + + it("validates knowledgeMeta fields", () => { + const graph = { + ...minimalKnowledgeGraph, + nodes: [ + { + id: "article:with-meta", + type: "article" as const, + name: "Article with meta", + summary: "Has knowledge metadata", + tags: ["test"], + complexity: "simple" as const, + knowledgeMeta: { + format: "obsidian" as const, + wikilinks: ["[[other-note]]", "[[another]]"], + backlinks: ["article:from-here"], + frontmatter: { title: "My Note", tags: ["ai"] }, + sourceUrl: "https://example.com", + confidence: 0.85, + }, + }, + ], + edges: [], + layers: [], + }; + const result = validateGraph(graph); + const fatals = result.issues.filter((i) => i.level === "fatal"); + expect(fatals).toHaveLength(0); + const node = result.data!.nodes.find((n) 
=> n.id === "article:with-meta"); + expect(node?.knowledgeMeta?.format).toBe("obsidian"); + expect(node?.knowledgeMeta?.confidence).toBe(0.85); + }); + + it("rejects out-of-range confidence values", () => { + const graph = { + ...minimalKnowledgeGraph, + nodes: [ + { + id: "article:bad-conf", + type: "article" as const, + name: "Bad confidence", + summary: "Out of range", + tags: [], + complexity: "simple" as const, + knowledgeMeta: { confidence: 1.5 }, + }, + ], + edges: [], + layers: [], + }; + const result = validateGraph(graph); + // Node should be dropped or have an issue due to invalid confidence + const issues = result.issues.filter((i) => i.message.includes("bad-conf") || i.message.includes("confidence")); + expect(issues.length).toBeGreaterThan(0); + }); +}); diff --git a/understand-anything-plugin/packages/core/src/schema.ts b/understand-anything-plugin/packages/core/src/schema.ts index 88790a8..2d9d1bc 100644 --- a/understand-anything-plugin/packages/core/src/schema.ts +++ b/understand-anything-plugin/packages/core/src/schema.ts @@ -1,6 +1,6 @@ import { z } from "zod"; -// Edge types (29 values across 7 categories) +// Edge types (35 values across 8 categories) export const EdgeTypeSchema = z.enum([ "imports", "exports", "contains", "inherits", "implements", // Structural "calls", "subscribes", "publishes", "middleware", // Behavioral @@ -10,6 +10,7 @@ export const EdgeTypeSchema = z.enum([ "deploys", "serves", "provisions", "triggers", // Infrastructure "migrates", "documents", "routes", "defines_schema", // Schema/Data "contains_flow", "flow_step", "cross_domain", // Domain + "cites", "contradicts", "builds_on", "exemplifies", "categorized_under", "authored_by", // Knowledge ]); // Aliases that LLMs commonly generate instead of canonical node types @@ -55,6 +56,30 @@ export const NODE_TYPE_ALIASES: Record = { business_process: "flow", task: "step", business_step: "step", + // Knowledge aliases + note: "article", + page: "article", + post: "article", + 
wiki_page: "article", + person: "entity", + place: "entity", + thing: "entity", + tool: "entity", + paper: "source", + organization: "entity", + org: "entity", + tag: "topic", + category: "topic", + theme: "topic", + tag_topic: "topic", + assertion: "claim", + insight: "claim", + takeaway: "claim", + hypothesis: "claim", + reference: "source", + raw: "source", + citation: "source", + bibliography: "source", }; // Aliases that LLMs commonly generate instead of canonical edge types @@ -88,6 +113,27 @@ export const EDGE_TYPE_ALIASES: Record = { has_flow: "contains_flow", next_step: "flow_step", interacts_with: "cross_domain", + // Knowledge aliases + references: "cites", + // Note: "cited_by" and "sourced_from" are intentionally NOT aliased to "cites" — + // they invert edge direction (A cited_by B means B cites A, not A cites B). + // The LLM should use "cites" with correct source/target instead. + opposes: "contradicts", + conflicts_with: "contradicts", + disagrees_with: "contradicts", + elaborates: "builds_on", + refines: "builds_on", + deepens: "builds_on", + illustrates: "exemplifies", + demonstrates: "exemplifies", + example_of: "exemplifies", + instance_of: "exemplifies", + tagged_with: "categorized_under", + classified_as: "categorized_under", + belongs_to: "categorized_under", + part_of: "categorized_under", + written_by: "authored_by", + created_by: "authored_by", // Note: "implemented_by" is intentionally NOT aliased to "implements" — // it inverts edge direction (see commit fd0df15). The LLM should use // "implements" with correct source/target instead. 
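The comments in the alias hunk above explain why direction-inverting names such as `cited_by` are deliberately left out of `EDGE_TYPE_ALIASES`. A minimal sketch of what that means for a lookup, assuming a hypothetical helper `resolveEdgeType` and only a small excerpt of the real tables (how `validateGraph` actually applies the aliases is not shown in this hunk):

```typescript
// Excerpt of the canonical knowledge edge types added in this diff.
const CANONICAL_EDGE_TYPES = new Set<string>([
  "cites", "contradicts", "builds_on", "exemplifies", "categorized_under", "authored_by",
]);

// Excerpt of the knowledge aliases added in this diff. A Map is used here so that
// inherited object keys such as "constructor" can never be mistaken for aliases.
const EDGE_ALIASES = new Map<string, string>([
  ["references", "cites"],
  ["opposes", "contradicts"],
  ["written_by", "authored_by"],
]);

// Hypothetical helper: returns the canonical edge type, or undefined when the
// raw value is neither canonical nor a known alias.
function resolveEdgeType(raw: string): string | undefined {
  if (CANONICAL_EDGE_TYPES.has(raw)) return raw;
  return EDGE_ALIASES.get(raw);
}
```

Under these assumptions `written_by` resolves to `authored_by`, while `cited_by` resolves to nothing. The unresolved type can then be reported as an issue rather than being rewritten to `cites` with source and target pointing the wrong way.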
@@ -327,6 +373,15 @@ const DomainMetaSchema = z.object({ entryType: z.enum(["http", "cli", "event", "cron", "manual"]).optional(), }).passthrough(); +const KnowledgeMetaSchema = z.object({ + format: z.enum(["obsidian", "logseq", "dendron", "foam", "karpathy", "zettelkasten", "plain"]).optional(), + wikilinks: z.array(z.string()).optional(), + backlinks: z.array(z.string()).optional(), + frontmatter: z.record(z.string(), z.unknown()).optional(), + sourceUrl: z.string().optional(), + confidence: z.number().min(0).max(1).optional(), +}).passthrough(); + export const GraphNodeSchema = z.object({ id: z.string(), type: z.enum([ @@ -334,6 +389,7 @@ export const GraphNodeSchema = z.object({ "config", "document", "service", "table", "endpoint", "pipeline", "schema", "resource", "domain", "flow", "step", + "article", "entity", "topic", "claim", "source", ]), name: z.string(), filePath: z.string().optional(), @@ -343,6 +399,7 @@ export const GraphNodeSchema = z.object({ complexity: z.enum(["simple", "moderate", "complex"]), languageNotes: z.string().optional(), domainMeta: DomainMetaSchema.optional(), + knowledgeMeta: KnowledgeMetaSchema.optional(), }).passthrough(); export const GraphEdgeSchema = z.object({ @@ -380,6 +437,7 @@ export const ProjectMetaSchema = z.object({ export const KnowledgeGraphSchema = z.object({ version: z.string(), + kind: z.enum(["codebase", "knowledge"]).optional(), project: ProjectMetaSchema, nodes: z.array(GraphNodeSchema), edges: z.array(GraphEdgeSchema), @@ -609,8 +667,24 @@ export function validateGraph(data: unknown): ValidationResult { } } + let validatedKind: "codebase" | "knowledge" | undefined; + if (typeof fixed.kind === "string") { + if (fixed.kind === "codebase" || fixed.kind === "knowledge") { + validatedKind = fixed.kind; + } else { + validatedKind = undefined; + issues.push({ + level: "auto-corrected", + category: "invalid-enum", + message: `kind "${fixed.kind}" is not valid — removed (must be "codebase" or "knowledge")`, + path: 
"kind", + }); + } + } + const graph = { version: typeof fixed.version === "string" ? fixed.version : "1.0.0", + kind: validatedKind, project: projectResult.data, nodes: validNodes, edges: validEdges, diff --git a/understand-anything-plugin/packages/core/src/types.ts b/understand-anything-plugin/packages/core/src/types.ts index 106f25e..95dbe45 100644 --- a/understand-anything-plugin/packages/core/src/types.ts +++ b/understand-anything-plugin/packages/core/src/types.ts @@ -1,11 +1,12 @@ -// Node types (16 total: 5 code + 8 non-code + 3 domain) +// Node types (21 total: 5 code + 8 non-code + 3 domain + 5 knowledge) export type NodeType = | "file" | "function" | "class" | "module" | "concept" | "config" | "document" | "service" | "table" | "endpoint" | "pipeline" | "schema" | "resource" - | "domain" | "flow" | "step"; + | "domain" | "flow" | "step" + | "article" | "entity" | "topic" | "claim" | "source"; -// Edge types (29 total in 7 categories: Structural, Behavioral, Data flow, Dependencies, Semantic, Infrastructure/Schema, Domain) +// Edge types (35 total in 8 categories: Structural, Behavioral, Data flow, Dependencies, Semantic, Infrastructure/Schema, Domain, Knowledge) export type EdgeType = | "imports" | "exports" | "contains" | "inherits" | "implements" // Structural | "calls" | "subscribes" | "publishes" | "middleware" // Behavioral @@ -14,7 +15,8 @@ export type EdgeType = | "related" | "similar_to" // Semantic | "deploys" | "serves" | "provisions" | "triggers" // Infrastructure | "migrates" | "documents" | "routes" | "defines_schema" // Schema/Data - | "contains_flow" | "flow_step" | "cross_domain"; // Domain + | "contains_flow" | "flow_step" | "cross_domain" // Domain + | "cites" | "contradicts" | "builds_on" | "exemplifies" | "categorized_under" | "authored_by"; // Knowledge // Optional domain metadata for domain/flow/step nodes export interface DomainMeta { @@ -25,7 +27,17 @@ export interface DomainMeta { entryType?: "http" | "cli" | "event" | "cron" | 
"manual"; } -// GraphNode with 16 types: 5 code + 8 non-code + 3 domain +// Optional knowledge metadata for article/entity/topic/claim/source nodes +export interface KnowledgeMeta { + format?: "obsidian" | "logseq" | "dendron" | "foam" | "karpathy" | "zettelkasten" | "plain"; + wikilinks?: string[]; + backlinks?: string[]; + frontmatter?: Record; + sourceUrl?: string; + confidence?: number; // 0-1, for LLM-inferred relationships +} + +// GraphNode with 21 types: 5 code + 8 non-code + 3 domain + 5 knowledge export interface GraphNode { id: string; type: NodeType; @@ -37,6 +49,7 @@ export interface GraphNode { complexity: "simple" | "moderate" | "complex"; languageNotes?: string; domainMeta?: DomainMeta; + knowledgeMeta?: KnowledgeMeta; } // GraphEdge with rich relationship modeling @@ -79,6 +92,7 @@ export interface ProjectMeta { // Root KnowledgeGraph export interface KnowledgeGraph { version: string; + kind?: "codebase" | "knowledge"; // undefined defaults to "codebase" for backward compat project: ProjectMeta; nodes: GraphNode[]; edges: GraphEdge[]; diff --git a/understand-anything-plugin/packages/dashboard/src/App.tsx b/understand-anything-plugin/packages/dashboard/src/App.tsx index c6e262e..141aee4 100644 --- a/understand-anything-plugin/packages/dashboard/src/App.tsx +++ b/understand-anything-plugin/packages/dashboard/src/App.tsx @@ -7,6 +7,8 @@ import DomainGraphView from "./components/DomainGraphView"; import CodeViewer from "./components/CodeViewer"; import SearchBar from "./components/SearchBar"; import NodeInfo from "./components/NodeInfo"; +import KnowledgeInfo from "./components/KnowledgeInfo"; +import ReadingPanel from "./components/ReadingPanel"; import LayerLegend from "./components/LayerLegend"; import DiffToggle from "./components/DiffToggle"; import FilterPanel from "./components/FilterPanel"; @@ -313,7 +315,8 @@ function Dashboard({ accessToken }: { accessToken: string }) { const isLearnMode = tourActive || persona === "junior"; const 
sidebarContent = ( <> - {selectedNodeId && } + {selectedNodeId && graph?.kind === "knowledge" && } + {selectedNodeId && graph?.kind !== "knowledge" && } {isLearnMode && } {!selectedNodeId && !isLearnMode && } @@ -376,6 +379,7 @@ function Dashboard({ accessToken }: { accessToken: string }) { { key: "infra", label: "Infra", color: "var(--color-node-service)" }, { key: "data", label: "Data", color: "var(--color-node-table)" }, { key: "domain", label: "Domain", color: "var(--color-node-concept)" }, + { key: "knowledge", label: "Knowledge", color: "var(--color-node-document)" }, ] as const).map((cat) => (
)} + + {/* Reading panel overlay for knowledge graph articles */} + {!codeViewerOpen && <ReadingPanel />} {/* Keyboard shortcuts help modal */} diff --git a/understand-anything-plugin/packages/dashboard/src/components/CustomNode.tsx b/understand-anything-plugin/packages/dashboard/src/components/CustomNode.tsx index bc06c97..01c912f 100644 --- a/understand-anything-plugin/packages/dashboard/src/components/CustomNode.tsx +++ b/understand-anything-plugin/packages/dashboard/src/components/CustomNode.tsx @@ -21,6 +21,11 @@ const typeColors: Record = { domain: "var(--color-node-concept)", flow: "var(--color-node-pipeline)", step: "var(--color-node-function)", + article: "var(--color-node-article)", + entity: "var(--color-node-entity)", + topic: "var(--color-node-topic)", + claim: "var(--color-node-claim)", + source: "var(--color-node-source)", }; const typeTextColors: Record = { @@ -40,6 +45,11 @@ const typeTextColors: Record = { domain: "text-node-concept", flow: "text-node-pipeline", step: "text-node-function", + article: "text-node-article", + entity: "text-node-entity", + topic: "text-node-topic", + claim: "text-node-claim", + source: "text-node-source", }; const complexityColors: Record = { diff --git a/understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx b/understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx index 82c4ce3..25027d6 100644 --- a/understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx +++ b/understand-anything-plugin/packages/dashboard/src/components/GraphView.tsx @@ -50,6 +50,15 @@ import type { NodeCategory } from "../store"; * Maps each NodeType to a filter category. Must be kept in sync with core NodeType. * Unknown types default to "code" with a development warning. 
*/ +const KNOWLEDGE_EDGE_STYLES: Record = { + cites: { strokeDasharray: "6 3", strokeWidth: 1.5 }, + contradicts: { stroke: "#c97070", strokeWidth: 2 }, + builds_on: { stroke: "var(--color-accent)", strokeWidth: 2 }, + categorized_under: { stroke: "rgba(150,150,150,0.5)", strokeWidth: 1 }, + authored_by: { strokeDasharray: "3 3", stroke: "var(--color-node-entity)", strokeWidth: 1.5 }, + exemplifies: { strokeDasharray: "3 3", stroke: "var(--color-node-claim)", strokeWidth: 1.5 }, +}; + const NODE_TYPE_TO_CATEGORY: Record = { file: "code", function: "code", class: "code", module: "code", concept: "code", config: "config", @@ -57,6 +66,7 @@ const NODE_TYPE_TO_CATEGORY: Record = { service: "infra", resource: "infra", pipeline: "infra", table: "data", endpoint: "data", schema: "data", domain: "domain", flow: "domain", step: "domain", + article: "knowledge", entity: "knowledge", topic: "knowledge", claim: "knowledge", source: "knowledge", } as const; // ── Helper components that must live inside ──────────────── @@ -212,6 +222,7 @@ function useOverviewGraph() { */ function useLayerDetailTopology() { const graph = useDashboardStore((s) => s.graph); + const graphKind = useDashboardStore((s) => s.graph?.kind); const activeLayerId = useDashboardStore((s) => s.activeLayerId); const selectNode = useDashboardStore((s) => s.selectNode); const persona = useDashboardStore((s) => s.persona); @@ -256,6 +267,7 @@ function useLayerDetailTopology() { "config", "document", "service", "table", "endpoint", "pipeline", "schema", "resource", "domain", "flow", "step", + "article", "entity", "topic", "claim", "source", "function", "class", ]); @@ -352,6 +364,14 @@ function useLayerDetailTopology() { edgeAnimated = edge.type === "calls"; } + // Apply knowledge-specific edge styles + if (graphKind === "knowledge" && !diffMode) { + const knowledgeStyle = KNOWLEDGE_EDGE_STYLES[edge.type]; + if (knowledgeStyle) { + edgeStyle = { ...edgeStyle, ...knowledgeStyle }; + } + } + return { id: `e-${i}`, 
source: edge.source, @@ -478,7 +498,7 @@ function useLayerDetailGraph() { return topo.edges.map((edge) => { const isSelectedEdge = edge.source === selectedNodeId || edge.target === selectedNodeId; // Don't restyle diff-impacted or portal edges - if ((edge.style as Record)?.strokeDasharray) return edge; + if (edge.target.startsWith("portal:")) return edge; if (isSelectedEdge) { return { ...edge, animated: true, style: { stroke: "rgba(212,165,116,0.8)", strokeWidth: 2.5 }, labelStyle: { fill: "#d4a574", fontSize: 11, fontWeight: 600 } }; diff --git a/understand-anything-plugin/packages/dashboard/src/components/KnowledgeInfo.tsx b/understand-anything-plugin/packages/dashboard/src/components/KnowledgeInfo.tsx new file mode 100644 index 0000000..9f5be71 --- /dev/null +++ b/understand-anything-plugin/packages/dashboard/src/components/KnowledgeInfo.tsx @@ -0,0 +1,473 @@ +import { useDashboardStore } from "../store"; +import type { GraphNode } from "@understand-anything/core/types"; + +const typeBadgeColors: Record = { + article: "text-node-article border border-node-article/30 bg-node-article/10", + entity: "text-node-entity border border-node-entity/30 bg-node-entity/10", + topic: "text-node-topic border border-node-topic/30 bg-node-topic/10", + claim: "text-node-claim border border-node-claim/30 bg-node-claim/10", + source: "text-node-source border border-node-source/30 bg-node-source/10", +}; + +export default function KnowledgeInfo() { + const graph = useDashboardStore((s) => s.graph); + const selectedNodeId = useDashboardStore((s) => s.selectedNodeId); + const nodeHistory = useDashboardStore((s) => s.nodeHistory); + const goBackNode = useDashboardStore((s) => s.goBackNode); + const navigateToNode = useDashboardStore((s) => s.navigateToNode); + const navigateToHistoryIndex = useDashboardStore((s) => s.navigateToHistoryIndex); + const setFocusNode = useDashboardStore((s) => s.setFocusNode); + const focusNodeId = useDashboardStore((s) => s.focusNodeId); + const viewMode 
= useDashboardStore((s) => s.viewMode); + const domainGraph = useDashboardStore((s) => s.domainGraph); + + const activeGraph = viewMode === "domain" && domainGraph ? domainGraph : graph; + const node = activeGraph?.nodes.find((n) => n.id === selectedNodeId) ?? null; + + const historyNodes = nodeHistory.map((id) => { + const n = activeGraph?.nodes.find((gn) => gn.id === id); + return { id, name: n?.name ?? id }; + }); + + if (!node) { + return ( +
+

Select a node to see details

+
+ ); + } + + const allEdges = activeGraph?.edges ?? []; + + // Backlinks: edges where this node is the target + const backlinks = allEdges + .filter((e) => e.target === node.id) + .map((e) => { + const sourceNode = activeGraph?.nodes.find((n) => n.id === e.source); + return { id: e.source, name: sourceNode?.name ?? e.source, type: e.type, node: sourceNode }; + }); + + // Outgoing links: edges where this node is the source + const outgoing = allEdges + .filter((e) => e.source === node.id) + .map((e) => { + const targetNode = activeGraph?.nodes.find((n) => n.id === e.target); + return { id: e.target, name: targetNode?.name ?? e.target, type: e.type, node: targetNode }; + }); + + const typeBadge = typeBadgeColors[node.type] ?? typeBadgeColors.article; + const meta = node.knowledgeMeta; + + return ( +
+ {/* Navigation history trail */} + {historyNodes.length > 0 && ( +
+ + | + {historyNodes.slice(-3).map((h, i, arr) => ( + + + {i < arr.length - 1 && ( + + )} + + ))} + + + {node.name} + +
+ )} + + {/* Type badge */} +
+ + {node.type} + + {/* Entity sub-type tags */} + {node.type === "entity" && node.tags.length > 0 && ( +
+ {node.tags + .filter((t) => ["person", "tool", "paper", "org"].includes(t.toLowerCase())) + .map((t) => ( + + {t} + + ))} +
+ )} +
+ + {/* Name */} +
+

{node.name}

+ +
+ + {/* Summary */} +

+ {node.summary} +

+ + {/* Source URL (for source nodes) */} + {node.type === "source" && meta?.sourceUrl && ( +
+

+ Source URL +

+ + {meta.sourceUrl} + +
+ )} + + {/* Confidence score bar (for claim nodes) */} + {node.type === "claim" && meta?.confidence != null && ( +
+

+ Confidence +

+
+
+
= 0.7 + ? "var(--color-node-function)" + : meta.confidence >= 0.4 + ? "var(--color-accent-dim)" + : "#c97070", + }} + /> +
+ + {Math.round(meta.confidence * 100)}% + +
+
+ )} + + {/* Frontmatter (for article nodes) */} + {node.type === "article" && meta?.frontmatter && Object.keys(meta.frontmatter).length > 0 && ( +
+

+ Frontmatter +

+
+ {Object.entries(meta.frontmatter).map(([key, value]) => ( +
+ {key}: + + {typeof value === "object" ? JSON.stringify(value) : String(value)} + +
+ ))} +
+
+ )} + + {/* Tags */} + {node.tags.length > 0 && ( +
+

+ Tags +

+
+ {node.tags.map((tag) => ( + + {tag} + + ))} +
+
+ )} + + {/* Backlinks */} + {backlinks.length > 0 && ( +
+

+ Backlinks ({backlinks.length}) +

+
+ {backlinks.map((link, i) => { + const linkTypeBadge = link.node + ? typeBadgeColors[link.node.type] ?? typeBadgeColors.article + : typeBadgeColors.article; + return ( +
navigateToNode(link.id)} + > + + {link.node && ( + + {link.node.type} + + )} + {link.name} + {link.type.replace(/_/g, " ")} +
+ ); + })} +
+
+ )} + + {/* Outgoing links */} + {outgoing.length > 0 && ( +
+

+ Outgoing Links ({outgoing.length}) +

+
+ {outgoing.map((link, i) => { + const linkTypeBadge = link.node + ? typeBadgeColors[link.node.type] ?? typeBadgeColors.article + : typeBadgeColors.article; + return ( +
navigateToNode(link.id)} + > + + {link.node && ( + + {link.node.type} + + )} + {link.name} + {link.type.replace(/_/g, " ")} +
+ ); + })} +
+
+ )} + + {/* Type-specific: articles referencing this entity */} + {node.type === "entity" && (() => { + const referencingArticles = allEdges + .filter((e) => (e.target === node.id || e.source === node.id) && e.type !== "related") + .map((e) => { + const otherId = e.source === node.id ? e.target : e.source; + return activeGraph?.nodes.find((n) => n.id === otherId); + }) + .filter((n): n is GraphNode => n !== undefined && n.type === "article"); + if (referencingArticles.length === 0) return null; + return ( +
+

+ Referenced In ({referencingArticles.length}) +

+
+ {referencingArticles.map((a) => ( +
navigateToNode(a.id)} + > + {a.name} +
+ ))} +
+
+ ); + })()} + + {/* Type-specific: related entities (for entity nodes) */} + {node.type === "entity" && (() => { + const relatedEntities = allEdges + .filter((e) => + (e.source === node.id || e.target === node.id) && + (e.type === "related" || e.type === "similar_to"), + ) + .map((e) => { + const otherId = e.source === node.id ? e.target : e.source; + return activeGraph?.nodes.find((n) => n.id === otherId); + }) + .filter((n): n is GraphNode => n !== undefined && n.type === "entity"); + if (relatedEntities.length === 0) return null; + return ( +
+

+ Related Entities ({relatedEntities.length}) +

+
+ {relatedEntities.map((e) => ( +
navigateToNode(e.id)} + > + {e.name} +
+ ))} +
+
+ ); + })()} + + {/* Type-specific: articles under topic */} + {node.type === "topic" && (() => { + const categorizedArticles = allEdges + .filter((e) => e.type === "categorized_under" && e.target === node.id) + .map((e) => activeGraph?.nodes.find((n) => n.id === e.source)) + .filter((n): n is GraphNode => n !== undefined && n.type === "article"); + if (categorizedArticles.length === 0) return null; + return ( +
+

+ Articles ({categorizedArticles.length}) +

+
+ {categorizedArticles.map((a) => ( +
navigateToNode(a.id)} + > + {a.name} +
+ ))} +
+
+ ); + })()} + + {/* Type-specific: contradicting claims */} + {node.type === "claim" && (() => { + const contradictions = allEdges + .filter((e) => + e.type === "contradicts" && + (e.source === node.id || e.target === node.id), + ) + .map((e) => { + const otherId = e.source === node.id ? e.target : e.source; + return activeGraph?.nodes.find((n) => n.id === otherId); + }) + .filter((n): n is GraphNode => n !== undefined); + if (contradictions.length === 0) return null; + return ( +
+

+ Contradicting Claims ({contradictions.length}) +

+
+ {contradictions.map((c) => ( +
navigateToNode(c.id)} + > + {c.name} +
+ ))} +
+
+ ); + })()} + + {/* Type-specific: supporting articles for claims */} + {node.type === "claim" && (() => { + const supporting = allEdges + .filter((e) => + (e.type === "cites" || e.type === "builds_on" || e.type === "exemplifies") && + (e.source === node.id || e.target === node.id), + ) + .map((e) => { + const otherId = e.source === node.id ? e.target : e.source; + return activeGraph?.nodes.find((n) => n.id === otherId); + }) + .filter((n): n is GraphNode => n !== undefined && n.type === "article"); + if (supporting.length === 0) return null; + return ( +
+

+ Supporting Articles ({supporting.length}) +

+
+ {supporting.map((a) => ( +
navigateToNode(a.id)} + > + {a.name} +
+ ))} +
+
+ ); + })()} + + {/* Type-specific: articles citing this source */} + {node.type === "source" && (() => { + const citingArticles = allEdges + .filter((e) => e.type === "cites" && e.target === node.id) + .map((e) => activeGraph?.nodes.find((n) => n.id === e.source)) + .filter((n): n is GraphNode => n !== undefined); + if (citingArticles.length === 0) return null; + return ( +
+

+ Cited By ({citingArticles.length}) +

+
+ {citingArticles.map((a) => ( +
navigateToNode(a.id)} + > + {a.name} +
+ ))} +
+
+ ); + })()} +
+ ); +} diff --git a/understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx b/understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx index d8780ad..6a516da 100644 --- a/understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx +++ b/understand-anything-plugin/packages/dashboard/src/components/NodeInfo.tsx @@ -20,6 +20,11 @@ const typeBadgeColors: Record = { domain: "text-node-concept border border-node-concept/30 bg-node-concept/10", flow: "text-node-pipeline border border-node-pipeline/30 bg-node-pipeline/10", step: "text-node-function border border-node-function/30 bg-node-function/10", + article: "text-node-article border border-node-article/30 bg-node-article/10", + entity: "text-node-entity border border-node-entity/30 bg-node-entity/10", + topic: "text-node-topic border border-node-topic/30 bg-node-topic/10", + claim: "text-node-claim border border-node-claim/30 bg-node-claim/10", + source: "text-node-source border border-node-source/30 bg-node-source/10", }; const complexityBadgeColors: Record = { @@ -29,7 +34,7 @@ const complexityBadgeColors: Record = { }; /** - * Human-readable directional labels for all 29 edge types. + * Human-readable directional labels for all 35 edge types. * Must be kept in sync with core EdgeType. 
*/ const EDGE_LABELS: Record = { @@ -62,6 +67,12 @@ const EDGE_LABELS: Record = { contains_flow: { forward: "contains flow", backward: "flow in" }, flow_step: { forward: "flow step", backward: "step of" }, cross_domain: { forward: "cross-domain to", backward: "cross-domain from" }, + cites: { forward: "cites", backward: "cited by" }, + contradicts: { forward: "contradicts", backward: "contradicted by" }, + builds_on: { forward: "builds on", backward: "built upon by" }, + exemplifies: { forward: "exemplifies", backward: "exemplified by" }, + categorized_under: { forward: "categorized under", backward: "categorizes" }, + authored_by: { forward: "authored by", backward: "authored" }, }; /** diff --git a/understand-anything-plugin/packages/dashboard/src/components/ProjectOverview.tsx b/understand-anything-plugin/packages/dashboard/src/components/ProjectOverview.tsx index 439452b..ae6ef48 100644 --- a/understand-anything-plugin/packages/dashboard/src/components/ProjectOverview.tsx +++ b/understand-anything-plugin/packages/dashboard/src/components/ProjectOverview.tsx @@ -13,6 +13,7 @@ export default function ProjectOverview() { } const { project, nodes, edges, layers } = graph; + const isKnowledge = graph.kind === "knowledge"; const hasTour = graph.tour.length > 0; // Count node types @@ -45,7 +46,23 @@ export default function ProjectOverview() { const avgConnections = nodes.length > 0 ? (edges.length * 2 / nodes.length).toFixed(1) : "0"; - // Category breakdowns + // Detect knowledge format from first node with knowledgeMeta + const knowledgeFormat = isKnowledge + ? nodes.find((n) => n.knowledgeMeta?.format)?.knowledgeMeta?.format ?? "plain" + : undefined; + + // Knowledge-specific stats + const knowledgeStats = isKnowledge + ? [ + { label: "Articles", color: "var(--color-node-article, var(--color-accent))", count: typeCounts["article"] ?? 0 }, + { label: "Entities", color: "var(--color-node-entity, #7eb8da)", count: typeCounts["entity"] ?? 
0 }, + { label: "Topics", color: "var(--color-node-topic, #a78bfa)", count: typeCounts["topic"] ?? 0 }, + { label: "Claims", color: "var(--color-node-claim, #f0abfc)", count: typeCounts["claim"] ?? 0 }, + { label: "Sources", color: "var(--color-node-source, #86efac)", count: typeCounts["source"] ?? 0 }, + ] + : []; + + // Category breakdowns (codebase mode) const categoryBreakdown = [ { label: "Code", color: "var(--color-node-file)", count: (typeCounts["file"] ?? 0) + (typeCounts["function"] ?? 0) + (typeCounts["class"] ?? 0) + (typeCounts["module"] ?? 0) + (typeCounts["concept"] ?? 0) }, { label: "Config", color: "var(--color-node-config)", count: typeCounts["config"] ?? 0 }, @@ -62,6 +79,15 @@ export default function ProjectOverview() {

{project.name}

{project.description}

+ {/* Knowledge format badge */} + {isKnowledge && knowledgeFormat && ( +
+ + {knowledgeFormat} format + +
+ )} + {/* Stats grid */}
@@ -82,8 +108,27 @@ export default function ProjectOverview() {
- {/* File Types breakdown */} - {hasNonCodeNodes && ( + {/* Knowledge-specific stats */} + {isKnowledge && ( +
+

Knowledge Breakdown

+
+ {knowledgeStats.filter((s) => s.count > 0).map((stat) => ( +
+ + {stat.label} + {stat.count} +
+ ))} +
+
+ )} + + {/* File Types breakdown (codebase only) */} + {!isKnowledge && hasNonCodeNodes && (

File Types

@@ -101,8 +146,8 @@ export default function ProjectOverview() {
)} - {/* Languages */} - {project.languages.length > 0 && ( + {/* Languages (codebase only) */} + {!isKnowledge && project.languages.length > 0 && (

Languages

@@ -115,8 +160,8 @@ export default function ProjectOverview() {
)} - {/* Frameworks */} - {project.frameworks.length > 0 && ( + {/* Frameworks (codebase only) */} + {!isKnowledge && project.frameworks.length > 0 && (

Frameworks

diff --git a/understand-anything-plugin/packages/dashboard/src/components/ReadingPanel.tsx b/understand-anything-plugin/packages/dashboard/src/components/ReadingPanel.tsx new file mode 100644 index 0000000..2d87248 --- /dev/null +++ b/understand-anything-plugin/packages/dashboard/src/components/ReadingPanel.tsx @@ -0,0 +1,142 @@ +import { useState } from "react"; +import { useDashboardStore } from "../store"; +export default function ReadingPanel() { + const graph = useDashboardStore((s) => s.graph); + const selectedNodeId = useDashboardStore((s) => s.selectedNodeId); + const navigateToNode = useDashboardStore((s) => s.navigateToNode); + const [expanded, setExpanded] = useState(false); + + // Only render for knowledge graphs with an article node selected + if (graph?.kind !== "knowledge" || !selectedNodeId) return null; + + const node = graph.nodes.find((n) => n.id === selectedNodeId) ?? null; + if (!node || node.type !== "article") return null; + + const meta = node.knowledgeMeta; + const allEdges = graph.edges; + + // Backlinks: edges where this node is the target + const backlinks = allEdges + .filter((e) => e.target === node.id) + .map((e) => { + const sourceNode = graph.nodes.find((n) => n.id === e.source); + return { id: e.source, name: sourceNode?.name ?? e.source, node: sourceNode }; + }); + + const panelHeight = expanded ? "70vh" : "45vh"; + + return ( +
+
+ {/* Header bar */} +
+ + Reading + + + {node.name} + +
+ {/* Expand/collapse toggle */} + + {/* Close button */} + +
+
+ + {/* Content area */} +
+ {/* Main content */} +
+

{node.name}

+ + {/* Tags */} + {node.tags.length > 0 && ( +
+ {node.tags.map((tag) => ( + + {tag} + + ))} +
+ )} + + {/* Summary / article content */} +
+

+ {node.summary} +

+
+ + {/* Frontmatter metadata card */} + {meta?.frontmatter && Object.keys(meta.frontmatter).length > 0 && ( +
+

+ Metadata +

+
+ {Object.entries(meta.frontmatter).map(([key, value]) => ( +
+ {key} + + {typeof value === "object" ? JSON.stringify(value) : String(value)} + +
+ ))} +
+
+ )} +
+ + {/* Right sidebar: backlinks */} + {backlinks.length > 0 && ( +
+

+ Backlinks ({backlinks.length}) +

+
+ {backlinks.map((link, i) => ( + + ))} +
+
+ )} +
+
+
+ ); +} diff --git a/understand-anything-plugin/packages/dashboard/src/index.css b/understand-anything-plugin/packages/dashboard/src/index.css index 03690b6..66f639f 100644 --- a/understand-anything-plugin/packages/dashboard/src/index.css +++ b/understand-anything-plugin/packages/dashboard/src/index.css @@ -36,6 +36,13 @@ --color-node-schema: #fcd34d; --color-node-resource: #a5b4fc; + /* Knowledge node colors */ + --color-node-article: #d4a574; + --color-node-entity: #7ba4c9; + --color-node-topic: #c9b06c; + --color-node-claim: #6fb07a; + --color-node-source: #8a8a8a; + /* Diff */ --color-diff-changed: #e05252; --color-diff-affected: #d4a030; diff --git a/understand-anything-plugin/packages/dashboard/src/store.ts b/understand-anything-plugin/packages/dashboard/src/store.ts index 17bd31e..5c12eff 100644 --- a/understand-anything-plugin/packages/dashboard/src/store.ts +++ b/understand-anything-plugin/packages/dashboard/src/store.ts @@ -9,10 +9,10 @@ import type { ReactFlowInstance } from "@xyflow/react"; export type Persona = "non-technical" | "junior" | "experienced"; export type NavigationLevel = "overview" | "layer-detail"; -export type NodeType = "file" | "function" | "class" | "module" | "concept" | "config" | "document" | "service" | "table" | "endpoint" | "pipeline" | "schema" | "resource" | "domain" | "flow" | "step"; +export type NodeType = "file" | "function" | "class" | "module" | "concept" | "config" | "document" | "service" | "table" | "endpoint" | "pipeline" | "schema" | "resource" | "domain" | "flow" | "step" | "article" | "entity" | "topic" | "claim" | "source"; export type Complexity = "simple" | "moderate" | "complex"; -export type EdgeCategory = "structural" | "behavioral" | "data-flow" | "dependencies" | "semantic" | "infrastructure" | "domain"; -export type ViewMode = "structural" | "domain"; +export type EdgeCategory = "structural" | "behavioral" | "data-flow" | "dependencies" | "semantic" | "infrastructure" | "domain" | "knowledge"; +export 
type ViewMode = "structural" | "domain" | "knowledge"; export interface FilterState { nodeTypes: Set; @@ -21,9 +21,9 @@ export interface FilterState { edgeCategories: Set; } -export const ALL_NODE_TYPES: NodeType[] = ["file", "function", "class", "module", "concept", "config", "document", "service", "table", "endpoint", "pipeline", "schema", "resource", "domain", "flow", "step"]; +export const ALL_NODE_TYPES: NodeType[] = ["file", "function", "class", "module", "concept", "config", "document", "service", "table", "endpoint", "pipeline", "schema", "resource", "domain", "flow", "step", "article", "entity", "topic", "claim", "source"]; export const ALL_COMPLEXITIES: Complexity[] = ["simple", "moderate", "complex"]; -export const ALL_EDGE_CATEGORIES: EdgeCategory[] = ["structural", "behavioral", "data-flow", "dependencies", "semantic", "infrastructure", "domain"]; +export const ALL_EDGE_CATEGORIES: EdgeCategory[] = ["structural", "behavioral", "data-flow", "dependencies", "semantic", "infrastructure", "domain", "knowledge"]; export const EDGE_CATEGORY_MAP: Record = { structural: ["imports", "exports", "contains", "inherits", "implements"], @@ -33,6 +33,7 @@ export const EDGE_CATEGORY_MAP: Record = { semantic: ["related", "similar_to"], infrastructure: ["deploys", "serves", "provisions", "triggers", "migrates", "documents", "routes", "defines_schema"], domain: ["contains_flow", "flow_step", "cross_domain"], + knowledge: ["cites", "contradicts", "builds_on", "exemplifies", "categorized_under", "authored_by"], }; export const DOMAIN_EDGE_TYPES = EDGE_CATEGORY_MAP.domain; @@ -45,7 +46,7 @@ const DEFAULT_FILTERS: FilterState = { }; /** Categories used for node type filter toggles. Single source of truth for NodeCategory. */ -export type NodeCategory = "code" | "config" | "docs" | "infra" | "data" | "domain"; +export type NodeCategory = "code" | "config" | "docs" | "infra" | "data" | "domain" | "knowledge"; /** Find which layer a node belongs to. Returns layerId or null. 
*/ function findNodeLayer(graph: KnowledgeGraph, nodeId: string): string | null { @@ -197,7 +198,7 @@ export const useDashboardStore = create()((set, get) => ({ pathFinderOpen: false, reactFlowInstance: null, - nodeTypeFilters: { code: true, config: true, docs: true, infra: true, data: true, domain: true }, + nodeTypeFilters: { code: true, config: true, docs: true, infra: true, data: true, domain: true, knowledge: true }, toggleNodeTypeFilter: (category) => set((state) => ({ @@ -223,7 +224,7 @@ export const useDashboardStore = create()((set, get) => ({ selectedNodeId: null, focusNodeId: null, nodeHistory: [], - viewMode: keepDomainView ? "domain" as const : "structural" as const, + viewMode: keepDomainView ? "domain" as const : graph.kind === "knowledge" ? "knowledge" as const : "structural" as const, activeDomainId: keepDomainView ? activeDomainId : null, }); }, diff --git a/understand-anything-plugin/skills/understand-knowledge/SKILL.md b/understand-anything-plugin/skills/understand-knowledge/SKILL.md new file mode 100644 index 0000000..fff7f73 --- /dev/null +++ b/understand-anything-plugin/skills/understand-knowledge/SKILL.md @@ -0,0 +1,420 @@ +--- +name: understand-knowledge +description: Analyze a markdown knowledge base (Obsidian, Logseq, Dendron, Foam, Karpathy-style, Zettelkasten, or plain) to produce an interactive knowledge graph with typed relationships +argument-hint: [path/to/notes] [--ingest ] +--- + +# /understand-knowledge + +Analyze a folder of markdown notes and produce a `knowledge-graph.json` file in `.understand-anything/` with `kind: "knowledge"`. This file powers the interactive dashboard for exploring a personal knowledge base's structure, topics, entities, claims, and relationships. 
+ +## Options + +- `$ARGUMENTS` may contain: + - A directory path — Point at a specific notes folder (defaults to CWD) + - `--ingest ` — Incrementally add new files to an existing knowledge graph without rescanning the entire vault + +--- + +## Phase 0 — Pre-flight + +Determine target directory and whether to run a full scan or incremental ingest. + +1. Set `TARGET_DIR`: + - If `$ARGUMENTS` contains a directory path (not prefixed with `--`), resolve it to an absolute path. + - Otherwise, use the current working directory. + +2. Get the current git commit hash (or `"no-git"` if not a git repo): + ```bash + git rev-parse HEAD 2>/dev/null || echo "no-git" + ``` + +3. Create the intermediate output directory: + ```bash + mkdir -p $TARGET_DIR/.understand-anything/intermediate + ``` + +4. **Check for `--ingest` flag:** + - If `--ingest` IS in `$ARGUMENTS`: + - Verify `$TARGET_DIR/.understand-anything/knowledge-graph.json` exists. If not, report "No existing knowledge graph found. Run `/understand-knowledge` first to create one." and STOP. + - Read the existing graph and store as `$EXISTING_GRAPH`. + - Read `$TARGET_DIR/.understand-anything/meta.json` to get `knowledgeFormat`. Store as `$KNOWN_FORMAT`. + - Resolve the ingest target path (the argument after `--ingest`). Store as `$INGEST_TARGET`. + - Set `$MODE` to `"ingest"` and skip to Phase 2 (format detection is skipped — use `$KNOWN_FORMAT`). + - If `--ingest` is NOT in `$ARGUMENTS`: + - Set `$MODE` to `"full"`. + +5. Check if `$TARGET_DIR/.understand-anything/knowledge-graph.json` exists. If it does, read it and check if `kind` is `"knowledge"`. If the existing graph is a codebase graph (`kind` is `"codebase"` or absent), warn the user: "An existing codebase graph will be replaced with a knowledge graph. Continue?" Proceed only if user confirms. + +--- + +## Phase 1 — SCAN + +Dispatch a subagent using the `knowledge-scanner` agent definition (at `agents/knowledge-scanner.md`). 
+ +Pass these parameters in the dispatch prompt: + +> Scan this directory to discover all markdown files for knowledge base analysis. +> Target directory: `$TARGET_DIR` +> Write output to: `$TARGET_DIR/.understand-anything/intermediate/knowledge-manifest.json` + +Pass the input as JSON: +```json +{ "targetDir": "$TARGET_DIR" } +``` + +After the subagent completes, read `$TARGET_DIR/.understand-anything/intermediate/knowledge-manifest.json` to get: +- Total file count +- File list with paths, sizes, first-lines previews +- Directory structure signatures (e.g., `.obsidian/`, `logseq/`, `raw/` + `wiki/`) + +Store the file list as `$FILE_LIST` and the total count as `$TOTAL_FILES`. + +Report to the user: **"Scanned $TOTAL_FILES markdown files."** + +**Gate check:** If >500 files, inform the user and suggest scoping with a subdirectory argument. Proceed only if user confirms or add guidance that this may take a while. + +--- + +## Phase 2 — FORMAT DETECTION + +> **Ingest mode:** Skip this phase entirely. Use `$KNOWN_FORMAT` from Phase 0. + +Dispatch a subagent using the `format-detector` agent definition (at `agents/format-detector.md`). + +Pass these parameters in the dispatch prompt: + +> Detect the knowledge base format for the scanned directory. +> Target directory: `$TARGET_DIR` +> Read the manifest at: `$TARGET_DIR/.understand-anything/intermediate/knowledge-manifest.json` +> Write output to: `$TARGET_DIR/.understand-anything/intermediate/format-detection.json` + +After the subagent completes, read `$TARGET_DIR/.understand-anything/intermediate/format-detection.json` to get: +- `format`: one of `"obsidian"`, `"logseq"`, `"dendron"`, `"foam"`, `"karpathy"`, `"zettelkasten"`, `"plain"` +- `confidence`: a confidence score (0-1) +- `hints`: format-specific parsing hints for downstream agents + +Store as `$DETECTED_FORMAT`, `$FORMAT_CONFIDENCE`, and `$FORMAT_HINTS`. 
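A minimal sketch of how the main session might validate `format-detection.json` before storing the three values. The field names `format`, `confidence`, and `hints` come from the list above; the `FormatDetection` type, the `parseFormatDetection` helper, and the choice to throw on invalid input are assumptions made for illustration and are not part of the skill.

```typescript
// Hypothetical helper: not part of the plugin. Shape inferred from the field list above.
const KNOWLEDGE_FORMATS = [
  "obsidian", "logseq", "dendron", "foam", "karpathy", "zettelkasten", "plain",
] as const;

type KnowledgeFormat = (typeof KNOWLEDGE_FORMATS)[number];

interface FormatDetection {
  format: KnowledgeFormat;
  confidence: number; // 0-1
  hints: unknown;     // format-specific; the skill does not specify its structure
}

function parseFormatDetection(jsonText: string): FormatDetection {
  const raw = JSON.parse(jsonText) as Record<string, unknown>;

  // Reject anything outside the seven documented format names.
  const format = raw.format;
  if (typeof format !== "string" || !(KNOWLEDGE_FORMATS as readonly string[]).includes(format)) {
    throw new Error(`Unknown knowledge format: ${String(format)}`);
  }

  // Confidence is documented as a score between 0 and 1.
  const confidence = raw.confidence;
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
    throw new Error(`Confidence must be a number in [0, 1], got ${String(confidence)}`);
  }

  return { format: format as KnowledgeFormat, confidence, hints: raw.hints ?? {} };
}
```

Failing loudly here is one possible design choice; a session could instead record a warning and continue with `"plain"`, but the skill does not say which behaviour is intended.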
+ +Report to the user: **"Detected format: $DETECTED_FORMAT (confidence: $FORMAT_CONFIDENCE)"** + +--- + +## Phase 3 — ANALYZE + +### Prepare format guide + +1. Determine the format to use: + - Full mode: use `$DETECTED_FORMAT` from Phase 2. + - Ingest mode: use `$KNOWN_FORMAT` from Phase 0. +2. Read the corresponding format guide from `skills/understand-knowledge/formats/.md` (these files are in the `formats/` subdirectory next to this SKILL.md file — use the skill directory path, not the project root). If the file does not exist, fall back to `formats/plain.md`. +3. Store the format guide content as `$FORMAT_GUIDE`. + +### Determine files to analyze + +- **Full mode:** Use `$FILE_LIST` from Phase 1. +- **Ingest mode:** Scan only the `$INGEST_TARGET` path: + - If it is a single file, the batch is just that file. + - If it is a directory, discover all `.md` files recursively within it. + - Store as `$INGEST_FILES`. + +### Batch and dispatch + +Batch the files into groups of **15-25 files each** (aim for ~20 files per batch for balanced sizes). + +**Batching strategy:** +- Group files in the same subdirectory together when possible (preserves topical locality). +- Keep daily notes / journal entries together in the same batch. +- Each file's size and preview from the manifest should be included. + +For each batch, dispatch a subagent using the `article-analyzer` agent definition (at `agents/article-analyzer.md`). Run up to **5 subagents concurrently** using parallel dispatch. Append the following additional context: + +> **Additional context from main session:** +> +> Knowledge base format: `$DETECTED_FORMAT` (or `$KNOWN_FORMAT` in ingest mode) +> +> Format guide: +> ``` +> $FORMAT_GUIDE +> ``` +> +> Format-specific hints: +> ```json +> $FORMAT_HINTS +> ``` + +Fill in batch-specific parameters below and dispatch: + +> Analyze these markdown files and produce knowledge graph nodes (article, entity, topic, claim, source) and edges. 
+> Target directory: `$TARGET_DIR` +> Batch index: `` +> Write output to: `$TARGET_DIR/.understand-anything/intermediate/article-batch-.json` +> +> Files to analyze in this batch: +> 1. `` ( lines) +> 2. `` ( lines) +> ... + +After ALL batches complete, verify that all expected `article-batch-*.json` files exist. Read each one and track any per-batch warnings. + +Report progress to the user: **"Analyzed $ANALYZED_FILES files in $BATCH_COUNT batches."** + +--- + +## Phase 4 — RELATIONSHIPS + +Dispatch a subagent using the `relationship-builder` agent definition (at `agents/relationship-builder.md`). + +### Full mode + +Pass these parameters in the dispatch prompt: + +> Discover cross-file relationships across all analyzed articles. +> Target directory: `$TARGET_DIR` +> Read all article batch files at: `$TARGET_DIR/.understand-anything/intermediate/article-batch-*.json` +> Write output to: `$TARGET_DIR/.understand-anything/intermediate/relationships.json` +> +> Knowledge base format: `$DETECTED_FORMAT` +> +> Build the following relationship types: +> - `builds_on`: article → article (extends, refines, deepens) +> - `contradicts`: claim → claim (conflicts or disagrees) +> - `categorized_under`: article/entity → topic (thematic grouping) +> - `exemplifies`: entity → concept/topic (concrete example of) +> - `cites`: article → source (references or draws from) +> - `authored_by`: article → entity (written by) +> +> Also produce: +> - Topic nodes: cluster related articles into topics +> - Layers: group nodes by thematic hierarchy (topics at top, articles in middle, entities/claims/sources at bottom) + +### Ingest mode + +Pass additional context to the relationship-builder: + +> **Existing graph context:** +> The existing knowledge graph has these nodes and edges (summary): +> - Node IDs: `[list of existing node IDs]` +> - Topics: `[list of existing topic nodes with names]` +> - Layers: `[existing layer definitions]` +> +> Find relationships between the NEW nodes from the 
ingest batch and the EXISTING nodes. Reuse existing topic nodes where appropriate rather than creating duplicates. + +After the subagent completes, read `$TARGET_DIR/.understand-anything/intermediate/relationships.json` to get: +- Cross-file edges +- Topic nodes (with `categorized_under` edges to articles) +- Layer definitions + +Store as `$RELATIONSHIPS`, `$TOPIC_NODES`, and `$LAYERS`. + +Report to the user: **"Discovered $EDGE_COUNT cross-file relationships and $TOPIC_COUNT topics."** + +--- + +## Phase 5 — ASSEMBLE + +Merge all intermediate results into the final KnowledgeGraph structure. + +1. **Collect all nodes:** + - Read all `article-batch-*.json` files and collect their nodes. + - Add topic nodes from `$TOPIC_NODES` (Phase 4). + - In ingest mode: also include all nodes from `$EXISTING_GRAPH`. + +2. **Collect all edges:** + - Read all `article-batch-*.json` files and collect their edges (explicit wikilinks, tags, frontmatter-derived edges). + - Add cross-file edges from `$RELATIONSHIPS` (Phase 4). + - In ingest mode: also include all edges from `$EXISTING_GRAPH`. + +3. **Deduplicate:** + - Nodes: deduplicate by `id` (keep last occurrence — newer analysis wins). + - Edges: deduplicate by `(source, target, type)` tuple (keep last occurrence). + +4. **Drop dangling edges:** Remove any edge whose `source` or `target` does not exist in the final node set. + +5. **Normalize layers:** + - Use layer definitions from Phase 4. + - Ensure every node is assigned to exactly one layer. + - Drop any `nodeIds` entries that do not exist in the final node set. + - Each layer must have: `id`, `name`, `description`, `nodeIds`. + +6. 
**Assemble the KnowledgeGraph object:** + + ```json + { + "version": "1.0.0", + "kind": "knowledge", + "project": { + "name": "", + "languages": ["markdown"], + "frameworks": [""], + "description": "", + "analyzedAt": "", + "gitCommitHash": "" + }, + "nodes": [], + "edges": [], + "layers": [], + "tour": [] + } + ``` + + Note: `tour` is left as an empty array — the relationship-builder does not generate tours for knowledge graphs. A future enhancement may add guided tours through knowledge bases. + +7. Write the assembled graph to `$TARGET_DIR/.understand-anything/intermediate/assembled-graph.json`. + +--- + +## Phase 6 — REVIEW + +Dispatch a subagent using the `graph-reviewer` agent definition (at `agents/graph-reviewer.md`). Append the following additional context: + +> **Additional context from main session:** +> +> This is a knowledge graph (`kind: "knowledge"`), not a codebase graph. +> +> Knowledge-specific node types: `article`, `entity`, `topic`, `claim`, `source` +> Knowledge-specific edge types: `cites`, `contradicts`, `builds_on`, `exemplifies`, `categorized_under`, `authored_by` +> +> Phase warnings/errors accumulated during analysis: +> - [list any batch failures, skipped files, or warnings from Phases 1-5] +> +> Validate the following knowledge-specific constraints: +> - Every `article` node should have at least one edge (no orphan articles) +> - Every `topic` node should have at least one `categorized_under` edge pointing to it +> - Entity names should be consistent (no duplicates like "Obsidian" vs "obsidian") +> - Edge weights follow the conventions below + +Pass these parameters in the dispatch prompt: + +> Validate the knowledge graph at `$TARGET_DIR/.understand-anything/intermediate/assembled-graph.json`. +> Project root: `$TARGET_DIR` +> Read the file and validate it for completeness and correctness. 
+> Write output to: `$TARGET_DIR/.understand-anything/intermediate/review.json` + +After the subagent completes, read `$TARGET_DIR/.understand-anything/intermediate/review.json`. + +**If `issues` array is non-empty:** +- Review the `issues` list. +- Apply automated fixes where possible: + - Remove edges with dangling references. + - Merge duplicate entity nodes (keep the one with more edges). + - Fill missing required fields with sensible defaults (e.g., empty `tags` -> `["untagged"]`, empty `summary` -> `"No summary available"`). +- Re-run validation after automated fixes. +- If critical issues remain after one fix attempt, save the graph anyway but include the warnings in the final report and mark dashboard auto-launch as skipped. + +**If `issues` array is empty:** Proceed to Phase 7. + +--- + +## Phase 7 — SAVE + +1. Write the final knowledge graph to `$TARGET_DIR/.understand-anything/knowledge-graph.json`. + +2. Write metadata to `$TARGET_DIR/.understand-anything/meta.json`: + ```json + { + "lastAnalyzedAt": "", + "gitCommitHash": "", + "version": "1.0.0", + "analyzedFiles": , + "knowledgeFormat": "" + } + ``` + +3. Clean up intermediate files: + ```bash + rm -rf $TARGET_DIR/.understand-anything/intermediate + ``` + +4. Report a summary to the user containing: + - Knowledge base name and detected format + - Files analyzed / total files + - Nodes created (broken down by type: article, entity, topic, claim, source) + - Edges created (broken down by type: cites, contradicts, builds_on, exemplifies, categorized_under, authored_by, plus any explicit link edges) + - Layers identified (with names) + - Any warnings from the reviewer + - Path to the output file: `$TARGET_DIR/.understand-anything/knowledge-graph.json` + +--- + +## Phase 8 — DASHBOARD + +Only automatically launch the dashboard by invoking the `/understand-dashboard` skill if final graph validation passed after review fixes. 
+
+If final validation did not pass, report that the graph was saved with warnings and dashboard launch was skipped.
+
+---
+
+## Incremental Mode (`--ingest`) — Abbreviated Pipeline
+
+When `--ingest` is specified, the pipeline runs an abbreviated flow:
+
+| Phase | Full Mode | Ingest Mode |
+|-------|-----------|-------------|
+| Phase 0 — Pre-flight | Determine target, create dirs | Verify existing graph, load format, resolve ingest target |
+| Phase 1 — SCAN | Scan entire directory | Scan only `$INGEST_TARGET` (single file or folder) |
+| Phase 2 — FORMAT DETECTION | Detect format from scratch | **SKIPPED** — use `knowledgeFormat` from `meta.json` |
+| Phase 3 — ANALYZE | Analyze all files | Analyze only new/changed files from `$INGEST_TARGET` |
+| Phase 4 — RELATIONSHIPS | Build relationships across all nodes | Build relationships between NEW nodes and EXISTING graph |
+| Phase 5 — ASSEMBLE | Merge all intermediate results | Merge new results INTO existing graph (preserve existing nodes/edges) |
+| Phase 6 — REVIEW | Full validation | Full validation on merged graph |
+| Phase 7 — SAVE | Write graph + meta | Write merged graph + updated meta |
+| Phase 8 — DASHBOARD | Auto-trigger dashboard | Auto-trigger dashboard |
+
+---
+
+## Error Handling
+
+- If any subagent dispatch fails, retry **once** with the same prompt plus additional context about the failure.
+- If the retried dispatch also fails, skip that phase and continue with partial results.
+- Track all warnings and errors from each phase in a `$PHASE_WARNINGS` list. Pass this list to the graph-reviewer in Phase 6.
+- ALWAYS save partial results — a partial graph is better than no graph.
+- Report any skipped phases or errors in the final summary so the user knows what happened.
+- NEVER silently drop errors. Every failure must be visible in the final report.
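The retry policy in the Error Handling list above could be expressed roughly as follows. This is a sketch under assumptions: `runPhaseWithRetry`, `PhaseOutcome`, and the `dispatch` callback are hypothetical names, and a real subagent dispatch would be asynchronous rather than a plain function call.

```typescript
// Hypothetical sketch of the retry-once policy; all names are illustrative only.
interface PhaseOutcome<T> {
  result: T | null;   // null means the phase was skipped after two failures
  warnings: string[]; // entries to append to $PHASE_WARNINGS
}

function runPhaseWithRetry<T>(
  phaseName: string,
  dispatch: (failureContext?: string) => T,
): PhaseOutcome<T> {
  const warnings: string[] = [];
  try {
    // First attempt: no failure context.
    return { result: dispatch(), warnings };
  } catch (firstError) {
    const context = firstError instanceof Error ? firstError.message : String(firstError);
    warnings.push(`${phaseName}: first attempt failed (${context}); retrying once`);
    try {
      // Second attempt: same prompt, plus context about the failure.
      return { result: dispatch(context), warnings };
    } catch (secondError) {
      const detail = secondError instanceof Error ? secondError.message : String(secondError);
      // Never drop the error silently: record it and continue with partial results.
      warnings.push(`${phaseName}: retry failed (${detail}); phase skipped`);
      return { result: null, warnings };
    }
  }
}
```

Returning the warnings alongside the result, instead of logging them, keeps every failure available for the final report and for the graph-reviewer in Phase 6.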
+
+---
+
+## Reference: Knowledge Graph Schema
+
+### Knowledge Node Types (5)
+
+| Type | Description | ID Convention |
+|---|---|---|
+| `article` | A wiki/note page — the primary content unit | `article:` |
+| `entity` | A named thing: person, tool, paper, org, project | `entity:` |
+| `topic` | A thematic cluster grouping related articles | `topic:` |
+| `claim` | A specific assertion, insight, or takeaway | `claim::` |
+| `source` | Raw/reference material that articles cite | `source:` |
+
+### Knowledge Edge Types (6)
+
+| Type | Direction | Weight | Meaning |
+|---|---|---|---|
+| `cites` | article → source | 0.7 | References or draws from |
+| `contradicts` | claim → claim | 0.9 | Conflicts or disagrees with |
+| `builds_on` | article → article | 0.8 | Extends, refines, or deepens |
+| `exemplifies` | entity → concept/topic | 0.6 | Is a concrete example of |
+| `categorized_under` | article/entity → topic | 0.5 | Belongs to this theme |
+| `authored_by` | article → entity | 0.5 | Written or created by |
+
+### Shared Edge Types (also used)
+
+| Type | Weight | Usage in Knowledge Graphs |
+|---|---|---|
+| `contains` | 1.0 | Topic contains subtopics |
+| `related` | 0.5 | General semantic similarity between articles |
+| `similar_to` | 0.5 | Near-duplicate or highly overlapping content |
+| `documents` | 0.5 | Article documents an entity |
+
+### KnowledgeMeta (on GraphNode)
+
+```typescript
+interface KnowledgeMeta {
+  format?: "obsidian" | "logseq" | "dendron" | "foam" | "karpathy" | "zettelkasten" | "plain";
+  wikilinks?: string[]; // outgoing [[wikilinks]] found in this file
+  backlinks?: string[]; // files that link TO this file
+  frontmatter?: Record<string, unknown>; // parsed YAML frontmatter
+  sourceUrl?: string; // external URL for source nodes
+  confidence?: number; // 0-1, for LLM-inferred relationships
+}
+```
diff --git a/understand-anything-plugin/skills/understand-knowledge/formats/dendron.md
b/understand-anything-plugin/skills/understand-knowledge/formats/dendron.md new file mode 100644 index 0000000..1ac5afa --- /dev/null +++ b/understand-anything-plugin/skills/understand-knowledge/formats/dendron.md @@ -0,0 +1,205 @@ +# Dendron Format Guide + +## Detection + +Identify a Dendron workspace by the presence of a `dendron.yml` configuration file at the workspace root. Dendron uses a **dot-delimited hierarchy** for note names, which is its most distinctive feature. + +**Directory signatures:** +- `dendron.yml` — workspace configuration (required) +- `dendron.code-workspace` — VS Code workspace file +- `*.schema.yml` — schema definition files +- Notes named with dot-delimited paths: `project.backend.api.md` +- `vault/` or custom vault directories + +## Link Syntax + +### Wikilinks +``` +[[note.name]] +``` +The target is the dot-delimited note hierarchy path (filename without `.md`). + +### Wikilinks with Alias +``` +[[Display Text|note.name]] +``` +Note: Dendron places the alias **before** the pipe, opposite to Obsidian. + +### Link to Heading +``` +[[note.name#heading-text]] +[[Display Text|note.name#heading-text]] +``` + +### Cross-Vault Links +Used in multi-vault workspaces to link across vaults: +``` +[[dendron://vault-name/note.name]] +[[Display Text|dendron://vault-name/note.name]] +[[Display Text|dendron://vault-name/note.name#heading]] +``` +The prefix `dendron://$vaultName/` converts any regular link into a cross-vault link. 
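The link forms above could be parsed along these lines. This is a sketch only: `parseDendronLink` and the `DendronLink` shape are hypothetical names, and the function covers just the plain, aliased, heading, and cross-vault forms listed so far.

```typescript
// Hypothetical parser for the Dendron link forms described above.
interface DendronLink {
  alias: string | null;   // display text before the pipe, if any
  vault: string | null;   // vault name from a dendron:// prefix, if any
  note: string;           // dot-delimited hierarchy path
  heading: string | null; // text after '#', if any
}

function parseDendronLink(raw: string): DendronLink | null {
  const match = /^\[\[([^\]]+)\]\]$/.exec(raw.trim());
  if (!match) return null;
  let body: string = match[1] ?? "";

  // Dendron puts the alias BEFORE the pipe: [[Display Text|note.name]]
  let alias: string | null = null;
  const pipe = body.indexOf("|");
  if (pipe !== -1) {
    alias = body.slice(0, pipe);
    body = body.slice(pipe + 1);
  }

  // Cross-vault prefix: dendron://vault-name/note.name
  let vault: string | null = null;
  const prefix = "dendron://";
  if (body.startsWith(prefix)) {
    const rest = body.slice(prefix.length);
    const slash = rest.indexOf("/");
    if (slash === -1) return null; // malformed: vault given without a note
    vault = rest.slice(0, slash);
    body = rest.slice(slash + 1);
  }

  // Optional heading anchor after '#'
  let heading: string | null = null;
  const hash = body.indexOf("#");
  if (hash !== -1) {
    heading = body.slice(hash + 1);
    body = body.slice(0, hash);
  }

  return body.length > 0 ? { alias, vault, note: body, heading } : null;
}
```

Splitting on the pipe first matters because the alias comes before it in Dendron; an Obsidian-style parser that treats the text after the pipe as the alias would swap the two fields.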
+ +### Note References (Embeds) +Embed content from another note: +``` +![[note.name]] +![[dendron://vault-name/note.name]] +![[note.name#heading]] +``` + +### Block Anchors +Target specific blocks using `^anchor-id`: +``` +![[note.name#^anchor-id]] +``` + +## Metadata + +### YAML Frontmatter +Every Dendron note has mandatory YAML frontmatter with auto-generated fields: + +```yaml +--- +id: 7f2a3b4c-5d6e-4f8a-9b0c-1d2e3f4a5b6c +title: My Note Title +desc: A brief description of the note +updated: 1636492098692 +created: 1636492098692 +tags: + - my.tag.name +--- +``` + +**Required fields:** +- `id` — UUID uniquely identifying the note (stable across renames) +- `title` — display title of the note +- `desc` — description (can be empty string) +- `updated` — Unix timestamp in milliseconds of last update +- `created` — Unix timestamp in milliseconds of creation + +**Optional fields:** +- `tags` — array of dot-delimited tag names +- `nav_order` — numeric order for navigation +- `nav_exclude` — exclude from navigation +- `config` — per-note configuration overrides + +### Tags as Hierarchy +Tags in Dendron are themselves notes in the hierarchy: +```yaml +tags: my.example +``` +This links the note to the page `tags.my.example` in the hierarchy. + +## Folder Structure + +Dendron's folder structure is driven by its **dot-delimited hierarchy**. All notes live flat in the vault root directory — hierarchy is encoded in the filename, not in subdirectories. 
+ +``` +workspace-root/ + dendron.yml + dendron.code-workspace + vault/ + root.md # vault root note + project.md # "project" node + project.backend.md # "project > backend" node + project.backend.api.md # "project > backend > api" node + project.frontend.md # "project > frontend" node + project.frontend.ui.md # "project > frontend > ui" node + root.schema.yml # root schema + project.schema.yml # schema for project hierarchy +``` + +**Hierarchy rules:** +- `a.b.c.md` is a child of `a.b.md` which is a child of `a.md` +- Each dot represents one level of depth +- Parent notes need not exist — Dendron creates **stubs** (empty placeholder notes) for missing parents +- `root.md` is the root of every vault + +### Multi-Vault Workspaces +``` +workspace-root/ + dendron.yml # lists all vaults + vault-personal/ + daily.2026.04.10.md + vault-work/ + project.alpha.md +``` + +## Tags + +Tags in Dendron are **notes in the hierarchy** under the `tags.` prefix: + +```yaml +--- +tags: + - my-tag + - project.active +--- +``` + +- The tag `my-tag` corresponds to the note `tags.my-tag.md` +- Tags can be hierarchical: `project.active` maps to `tags.project.active.md` +- Tags are listed in frontmatter, not inline with `#` syntax +- You can also use `#my-tag` inline, which is equivalent to `[[tags.my-tag]]` + +## Special Files + +| Path | Purpose | Action | +|------|---------|--------| +| `dendron.yml` | Workspace configuration | **Parse** — lists vaults, settings | +| `dendron.code-workspace` | VS Code workspace | Ignore | +| `*.schema.yml` | Schema definitions | **Parse** — defines hierarchy structure and templates | +| `root.md` | Vault root note | Parse as a content node | +| `root.schema.yml` | Root schema | Parse for hierarchy rules | +| `.dendron.cache.json` | Note metadata cache | Can parse for performance | +| `pods/` | Import/export configurations | Ignore | + +### Schema Files +Schema files (`*.schema.yml`) define the expected structure of note hierarchies: +```yaml +version: 1 
+schemas: + - id: project + title: Project + children: + - id: backend + title: Backend + - id: frontend + title: Frontend +``` +Schemas provide autocomplete hints and enforce consistency but do not prevent creating notes outside the schema. + +## Parsing Instructions for LLM + +1. **Detect the workspace**: Look for `dendron.yml` at the root. If found, confirm Dendron format. + +2. **Parse dendron.yml**: Extract vault configurations — names, paths, and settings. Identify all vault directories. + +3. **Enumerate notes**: For each vault directory, find all `.md` files. Every `.md` file (except schema files) is a note. + +4. **Parse the hierarchy**: For each note filename (without `.md`): + - Split on `.` to determine hierarchy depth + - `a.b.c` means: root > a > b > c + - Create implicit parent nodes for any missing intermediate levels (stubs) + - The hierarchy itself defines the primary graph structure + +5. **Parse frontmatter**: Extract the mandatory `id`, `title`, `desc`, `created`, `updated` fields. The `id` field is the stable identifier — use it for edge targets when possible. + +6. **Extract wikilinks**: Scan note body for `[[...]]` patterns. Parse into: + - `[[target.note]]` — link by hierarchy path + - `[[Display|target.note]]` — aliased link + - `[[target.note#heading]]` — heading link + - `[[dendron://vault/target.note]]` — cross-vault link + - `![[target.note]]` — embed + +7. **Parse schemas**: Read `*.schema.yml` files to understand the intended hierarchy structure. Use this to categorize notes and identify whether a note follows an expected pattern. + +8. **Identify stubs**: Notes that exist only as hierarchy placeholders (created by Dendron when a child exists but parent does not) typically have minimal or empty content with auto-generated frontmatter. Flag these in the graph. + +9. 
**Build the graph**: Create three types of edges: + - **Hierarchy edges**: parent-child relationships from the dot-delimited naming (e.g., `project` -> `project.backend`) + - **Reference edges**: wikilinks and embeds between notes + - **Tag edges**: frontmatter tags linking to `tags.*` hierarchy nodes + +10. **Handle multi-vault**: If multiple vaults exist, namespace nodes by vault to avoid collisions. Cross-vault links (`dendron://`) explicitly specify the vault. diff --git a/understand-anything-plugin/skills/understand-knowledge/formats/foam.md b/understand-anything-plugin/skills/understand-knowledge/formats/foam.md new file mode 100644 index 0000000..d88094a --- /dev/null +++ b/understand-anything-plugin/skills/understand-knowledge/formats/foam.md @@ -0,0 +1,190 @@ +# Foam Format Guide + +## Detection + +Identify a Foam workspace by the presence of Foam-specific VS Code configuration. Foam is a VS Code extension that works on top of standard Markdown files, so detection relies on configuration files rather than note format. + +**Directory signatures:** +- `.vscode/extensions.json` containing `"foam.foam-vscode"` in recommendations +- `.vscode/settings.json` containing `foam.*` settings +- `.foam/` directory (older versions) +- Link reference definitions at the bottom of `.md` files (distinctive Foam feature) + +**Secondary indicators:** +- A `.vscode/` directory with Foam-related settings +- Markdown files with auto-generated reference definition blocks at the bottom + +## Link Syntax + +### Wikilinks +``` +[[note-name]] +``` +Links to `note-name.md` in the workspace. Foam uses filenames (without extension) as identifiers. + +### Section Links +``` +[[note-name#Section Title]] +``` +Links to a specific heading within a note. Autocomplete is available for section titles. + +### Block Links +``` +[[note-name#^block-id]] +``` +Links to a specific block. The target block must have a `^block-id` anchor at its end. 
+ +### Embeds +``` +![[note-name]] +![[note-name#Section Title]] +``` +Embeds the content of another note or section inline. + +### Directory Links +``` +[[projects]] +``` +Linking to a folder name opens the folder's index file (`projects/index.md` or `projects/README.md`). + +### Standard Markdown Links +Also fully supported: +``` +[Display Text](other-file.md) +[Display Text](other-file.md#section-name) +``` + +### Placeholder Links +Wikilinks to non-existent notes are displayed as **placeholders** (visually distinct in VS Code). Clicking a placeholder creates the note. These indicate intended-but-not-yet-written content. + +## Metadata + +### YAML Frontmatter +Foam supports standard YAML frontmatter: +```yaml +--- +title: My Note +type: concept +tags: + - research + - machine-learning +--- +``` + +There are no Foam-specific required fields. The frontmatter is freeform and follows standard YAML conventions. + +### Link Reference Definitions +Foam's most distinctive feature. The extension auto-generates standard Markdown link reference definitions at the bottom of each file: + +```markdown +# My Note + +This relates to [[Data Science]] and [[Statistics]]. + + + +[Data Science]: data-science.md "Data Science" +[Statistics]: statistics.md "Statistics" +``` + +**Key characteristics:** +- Generated automatically on file save +- Placed at the end of the file, separated by a blank line +- Make wikilinks compatible with standard Markdown processors +- Format: `[Note Name]: relative-path.md "Note Title"` + +**Configuration options** (in `.vscode/settings.json`): +- `"foam.edit.linkReferenceDefinitions": "off"` — disabled +- `"foam.edit.linkReferenceDefinitions": "withoutExtensions"` — paths without `.md` +- `"foam.edit.linkReferenceDefinitions": "withExtensions"` — paths with `.md` + +## Folder Structure + +Foam imposes no required folder structure. Users organize notes freely. 
Common patterns: + +``` +workspace-root/ + .vscode/ + extensions.json # Recommends foam.foam-vscode + settings.json # Foam configuration + docs/ + topic-a.md + topic-b.md + journal/ + 2026-04-10.md # Daily notes (if configured) + attachments/ + image.png + readme.md +``` + +Foam supports templates for note creation, typically stored in a `.foam/templates/` directory: +``` +.foam/ + templates/ + new-note.md + daily-note.md +``` + +## Tags + +Foam supports tags in two ways: + +### Frontmatter Tags +```yaml +--- +tags: + - research + - ai +--- +``` + +### Inline Tags +``` +This is about #machine-learning and #research. +``` + +Tags are searchable through VS Code's tag explorer panel. There is no special hierarchy or nesting convention for tags in Foam. + +## Special Files + +| Path | Purpose | Action | +|------|---------|--------| +| `.vscode/settings.json` | Foam configuration | **Parse** — contains Foam settings | +| `.vscode/extensions.json` | Extension recommendations | **Check** — confirms Foam format | +| `.foam/` | Foam workspace data | Ignore | +| `.foam/templates/` | Note templates | **Identify but skip** — scaffolding, not knowledge | +| `_layouts/`, `_site/` | Static site generator output | **Skip entirely** | +| `readme.md` / `index.md` | Workspace root document | Parse as a content node | + +## Parsing Instructions for LLM + +1. **Detect the workspace**: Look for `.vscode/extensions.json` containing `foam.foam-vscode`, or `.vscode/settings.json` with `foam.*` keys. If found, confirm Foam format. + +2. **Read configuration**: Parse `.vscode/settings.json` for Foam-specific settings: + - `foam.edit.linkReferenceDefinitions` — whether reference definitions are generated + - `foam.files.ignore` — patterns for files to skip + - `foam.graph.style` — graph display settings + +3. **Enumerate notes**: Find all `.md` files. Exclude `.vscode/`, `.foam/templates/`, `node_modules/`, `_site/`, and any paths in `foam.files.ignore`. + +4. 
**Parse frontmatter**: Extract YAML frontmatter if present. Foam does not require frontmatter, so many notes may lack it. + +5. **Extract wikilinks**: Scan note body for `[[...]]` patterns: + - `[[target]]` — link to note + - `[[target#heading]]` — link to heading + - `[[target#^block-id]]` — link to block + - `![[target]]` — embed + +6. **Handle link reference definitions**: At the bottom of files, look for lines matching: + ``` + [Note Name]: relative-path.md "Title" + ``` + These are auto-generated by Foam. They provide a mapping from wikilink display names to file paths. Use these to resolve wikilink targets when present, but note they may be stale if the user disabled auto-update. + +7. **Identify placeholders**: Wikilinks that do not resolve to any existing file are placeholders — notes the author intends to write. Include these as stub nodes in the graph to show intended structure. + +8. **Extract standard Markdown links**: Also parse `[text](path.md)` links, as Foam users may mix both syntaxes. + +9. **Build the graph**: Create nodes for each note file and each placeholder. Create directed edges for all wikilinks, standard links, and embeds. The link reference definitions can serve as a cross-reference for validating link resolution. + +10. **Note on compatibility**: Foam notes are designed to be valid standard Markdown. The link reference definitions ensure that even without Foam, the links resolve correctly in any Markdown renderer. This means the notes are also parseable as plain Markdown if Foam detection fails. 
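
The wikilink extraction and link reference definition steps above can be sketched in Python. This is a minimal illustration, not Foam's own implementation: the names `extract_wikilinks` and `parse_reference_definitions` and both regular expressions are assumptions made for this example. The patterns cover only the syntaxes listed in this guide and assume reference paths contain no spaces.

```python
import re

# Matches [[target]], [[target#heading]], [[target#^block-id]] and the
# embed form ![[target]]. Illustrative pattern, not taken from Foam.
WIKILINK_RE = re.compile(r"(!?)\[\[([^\[\]#|]+)(?:#([^\[\]|]+))?\]\]")

# Matches a link reference definition such as:
#   [Data Science]: data-science.md "Data Science"
REF_DEF_RE = re.compile(
    r'^\[([^\]]+)\]:[ \t]+(\S+)(?:[ \t]+"([^"]*)")?[ \t]*$',
    re.MULTILINE,
)


def extract_wikilinks(body: str) -> list[dict]:
    """Return one dict per wikilink found in a note body."""
    links = []
    for embed, target, anchor in WIKILINK_RE.findall(body):
        if anchor.startswith("^"):
            kind, anchor_value = "block", anchor[1:]
        elif anchor:
            kind, anchor_value = "heading", anchor
        else:
            kind, anchor_value = "note", None
        links.append({
            "target": target.strip(),
            "kind": kind,
            "anchor": anchor_value,
            "embed": embed == "!",
        })
    return links


def parse_reference_definitions(text: str) -> dict[str, str]:
    """Map each reference label to the relative path it points at."""
    return {label: path for label, path, _title in REF_DEF_RE.findall(text)}
```

A caller might look up each wikilink target in the reference map first and fall back to filename matching when the label is absent, since the definitions can be stale.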
diff --git a/understand-anything-plugin/skills/understand-knowledge/formats/karpathy.md b/understand-anything-plugin/skills/understand-knowledge/formats/karpathy.md new file mode 100644 index 0000000..837d0e9 --- /dev/null +++ b/understand-anything-plugin/skills/understand-knowledge/formats/karpathy.md @@ -0,0 +1,196 @@ +# Karpathy LLM Wiki Format Guide + +## Detection + +Identify a Karpathy-style LLM wiki by the presence of the characteristic three-layer directory structure with `raw/` and `wiki/` directories, and a schema document (typically `CLAUDE.md` or similar) at the root. + +**Directory signatures:** +- `raw/` — immutable source documents +- `wiki/` — LLM-generated and maintained markdown pages +- `wiki/index.md` — master catalog of all wiki pages +- `wiki/log.md` — append-only operational history +- `_meta/` — optional state and metadata directory +- `CLAUDE.md` or similar schema file at root — defines wiki conventions + +**Secondary indicators:** +- Markdown files with consistent cross-reference patterns maintained by an LLM +- `index.md` containing a structured catalog with one-line summaries +- `log.md` with timestamped operation entries + +## Link Syntax + +The Karpathy wiki uses **standard Markdown links** (not wikilinks), as the wiki is designed to be readable in any Markdown renderer including Obsidian. 
+ +### Standard Markdown Links +``` +[Page Title](page-name.md) +[Page Title](subfolder/page-name.md) +``` + +### Links with Section Anchors +``` +[Section Name](page-name.md#section-heading) +``` + +### Cross-References +LLM-maintained cross-references between related pages: +``` +## See Also +- [Related Concept A](concept-a.md) +- [Related Concept B](concept-b.md) +- [Source Summary](summaries/source-title.md) +``` + +### Source Citations +Links back to raw source documents: +``` +[Source: Original Article](../raw/article-title.md) +``` + +**Note:** Since an LLM maintains all links, the linking style may vary based on the schema document's conventions. The LLM ensures consistency within a given wiki instance. + +## Metadata + +### YAML Frontmatter +Wiki pages typically include frontmatter with metadata about the page's provenance and status: + +```yaml +--- +title: Concept Name +type: concept | entity | summary | synthesis | comparison +sources: + - raw/article-1.md + - raw/article-2.md +created: 2026-04-10 +updated: 2026-04-10 +tags: + - machine-learning + - transformers +--- +``` + +**Common fields:** +- `title` — page display title +- `type` — page category (entity, concept, summary, synthesis, comparison) +- `sources` — list of raw source documents that informed this page +- `created` / `updated` — timestamps +- `tags` — topic tags + +### Dataview Compatibility +If Obsidian is the reading interface, pages may include Dataview-compatible frontmatter for dynamic queries. 
+ +## Folder Structure + +The three-layer architecture: + +``` +wiki-root/ + CLAUDE.md # Schema: conventions, workflows, page types + raw/ # Layer 1: Immutable source documents + article-title.md # Clipped articles, papers, notes + paper-summary.md + assets/ # Downloaded images from sources + figure-1.png + wiki/ # Layer 2: LLM-maintained knowledge pages + index.md # Master catalog of all pages + log.md # Append-only operation history + entities/ # Entity pages (people, orgs, tools) + transformer.md + attention-mechanism.md + concepts/ # Concept pages (ideas, theories) + scaling-laws.md + summaries/ # Source summaries + article-title-summary.md + syntheses/ # Cross-source synthesis + comparison-x-vs-y.md + _meta/ # Optional: state tracking + ingest-queue.md + review-status.json +``` + +**Layer roles:** +- **raw/** — read-only source material. The LLM reads but never modifies these files. +- **wiki/** — the durable artifact. Knowledge compounds here through repeated LLM updates. +- **Schema** (CLAUDE.md) — the configuration layer defining how the LLM should behave. + +### index.md Structure +A content-oriented catalog organized by category: +```markdown +# Wiki Index + +## Entities +- [Transformer](entities/transformer.md) — Neural network architecture based on self-attention +- [GPT-4](entities/gpt-4.md) — OpenAI's large language model (3 sources) + +## Concepts +- [Scaling Laws](concepts/scaling-laws.md) — Empirical relationships between model size and performance + +## Summaries +- [Attention Is All You Need](summaries/attention-paper.md) — Original transformer paper summary +``` + +### log.md Structure +Append-only with consistent prefixes: +```markdown +## [2026-04-10] ingest | Attention Is All You Need +- Wrote summary: wiki/summaries/attention-paper.md +- Created entity: wiki/entities/transformer.md +- Updated 3 existing pages with cross-references + +## [2026-04-09] query | How do scaling laws affect training cost? 
+- Synthesized answer from 4 wiki pages +- Filed new page: wiki/syntheses/scaling-cost-analysis.md +``` + +## Tags + +Tags are stored in YAML frontmatter and maintained by the LLM: +```yaml +tags: + - machine-learning + - nlp + - transformers +``` + +There is no special tag syntax beyond frontmatter. The LLM ensures consistent tag usage across pages during ingestion and lint passes. + +## Special Files + +| Path | Purpose | Action | +|------|---------|--------| +| `CLAUDE.md` (or schema file) | Wiki schema and conventions | **Parse first** — defines all conventions for this wiki instance | +| `wiki/index.md` | Master catalog | **Parse** — provides the complete page inventory with summaries | +| `wiki/log.md` | Operation history | **Parse** — shows wiki evolution and recent activity | +| `raw/` | Source documents | **Parse as source nodes** — immutable inputs | +| `raw/assets/` | Source images | Ignore | +| `_meta/` | State tracking | **Parse if present** — may contain queue and status info | + +## Parsing Instructions for LLM + +1. **Detect the wiki**: Look for `wiki/index.md` and `raw/` directory together. If a schema file (CLAUDE.md, README.md with wiki conventions) exists at root, this confirms the Karpathy pattern. + +2. **Parse the schema first**: Read the root schema document (CLAUDE.md or equivalent). This defines the specific conventions for this wiki instance — page types, naming rules, workflow descriptions. The schema overrides any generic assumptions. + +3. **Parse index.md**: This is the master catalog. Extract all listed pages with their titles, paths, summaries, and categories. This provides the complete graph node inventory without scanning the filesystem. + +4. **Parse log.md**: Extract operation entries to understand wiki history. Each entry follows `## [DATE] operation | Title` format. This reveals temporal relationships and provenance chains. + +5. 
**Enumerate wiki pages**: Scan `wiki/` for all `.md` files (excluding `index.md` and `log.md` which are structural). Parse frontmatter for metadata, especially `type` and `sources` fields. + +6. **Enumerate raw sources**: Scan `raw/` for all source documents. These are leaf nodes — they have no outgoing links within the wiki but are referenced by wiki pages via `sources` frontmatter. + +7. **Extract links**: Parse standard Markdown links `[text](path.md)` from each wiki page. Resolve relative paths to identify link targets. + +8. **Build the graph with typed edges**: + - **Source edges**: wiki page -> raw source (from `sources` frontmatter) + - **Cross-reference edges**: wiki page -> wiki page (from inline links) + - **Category edges**: wiki page -> category (from `type` frontmatter or index.md grouping) + - **Temporal edges**: entries in log.md show which pages were created/updated together + +9. **Identify page types**: Categorize nodes by their `type` frontmatter or by their directory location: + - Entity pages — describe specific things (people, tools, orgs) + - Concept pages — describe ideas and theories + - Summary pages — distill individual sources + - Synthesis pages — combine insights across sources + +10. **Check for staleness**: Cross-reference `wiki/index.md` against actual files. Orphaned pages (exist on disk but not in index) or missing pages (in index but not on disk) indicate maintenance gaps. diff --git a/understand-anything-plugin/skills/understand-knowledge/formats/logseq.md b/understand-anything-plugin/skills/understand-knowledge/formats/logseq.md new file mode 100644 index 0000000..c4b0450 --- /dev/null +++ b/understand-anything-plugin/skills/understand-knowledge/formats/logseq.md @@ -0,0 +1,163 @@ +# Logseq Format Guide + +## Detection + +Identify a Logseq graph by the presence of a `logseq/` directory containing `config.edn` at the graph root. Logseq graphs also have characteristic `journals/` and `pages/` directories. 
+ +**Directory signatures:** +- `logseq/config.edn` — primary configuration file (EDN format) +- `logseq/custom.css` — custom styling +- `logseq/custom.js` — custom scripts (desktop only) +- `journals/` — daily journal entries +- `pages/` — named pages +- `assets/` — attached files +- `draws/` — Excalidraw drawings + +## Link Syntax + +Logseq is a **block-based outliner** — every line (bullet) is a block with a unique UUID. + +### Page References +``` +[[Page Name]] +``` +Creates a link to a page. If the page does not exist, it will be created on click. + +### Page Reference with Alias +``` +[Display Text]([[Page Name]]) +``` + +### Block References +``` +((block-uuid)) +``` +References a specific block by its UUID. The referenced block's content is displayed inline. UUIDs look like: `64a5f9b2-3c1e-4d8f-a9b7-1234abcd5678`. + +### Block Reference with Alias +``` +[Display Text](((block-uuid))) +``` + +### Block Embeds +``` +{{embed ((block-uuid))}} +{{embed [[Page Name]]}} +``` +Embeds the full content of a block or page inline. + +### Hashtag References +``` +#tag +#[[Multi Word Tag]] +``` +Tags are equivalent to page references in Logseq — `#tag` and `[[tag]]` both link to the same page. 
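
The reference forms in this section can be pulled out of a block's text with a few regular expressions. The sketch below is illustrative only: `extract_references` and the patterns are assumptions for this example rather than Logseq's own parser, and the block reference pattern assumes the 8-4-4-4-12 hexadecimal UUID layout shown above.

```python
import re

# [[Page Name]], but not the #[[Multi Word Tag]] form (handled as a tag).
PAGE_REF_RE = re.compile(r"(?<!#)\[\[([^\[\]]+)\]\]")

# ((uuid)) with a standard 8-4-4-4-12 hexadecimal UUID.
UUID = (
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
BLOCK_REF_RE = re.compile(r"\(\((" + UUID + r")\)\)")

# #tag or #[[Multi Word Tag]]; a bare '#' followed by a space is ignored.
TAG_RE = re.compile(r"(?<![\w#])#(?:\[\[([^\[\]]+)\]\]|([\w\-/]+))")


def extract_references(block_text: str) -> dict[str, list[str]]:
    """Collect page references, block references, and tags from one block."""
    pages = PAGE_REF_RE.findall(block_text)
    blocks = BLOCK_REF_RE.findall(block_text)
    # Each match fills exactly one of the two tag groups.
    tags = [multi or single for multi, single in TAG_RE.findall(block_text)]
    return {"pages": pages, "blocks": blocks, "tags": tags}
```

Because tags and page references point at the same pages in Logseq, a graph builder could merge the `tags` list into `pages` and keep only an edge label to distinguish them.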
+ +## Metadata + +### Page Properties +Properties on the **first block** of a page are page-level properties: +```markdown +type:: article +author:: John Doe +date:: 2026-04-10 +tags:: #research, #ai +``` + +### Block Properties +Properties on any non-first block are block-level properties: +```markdown +- This is a task block + priority:: high + deadline:: 2026-04-15 +``` + +### Property Syntax Rules +- Use `property-name:: value` (double colon, space) +- Property names are case-insensitive +- Values can be: text, numbers, page references `[[...]]`, tags `#...` +- Multiple values separated by commas +- Built-in properties: `type`, `tags`, `alias`, `title`, `icon`, `template`, `template-including-parent` + +## Folder Structure + +Logseq enforces a specific directory layout: + +``` +graph-root/ + logseq/ + config.edn + custom.css + pages-metadata.edn + journals/ + 2026_04_10.md # Daily journal (YYYY_MM_DD format) + 2026_04_09.md + pages/ + My Page.md # Named pages + Project Alpha.md + assets/ + image.png # Attachments + draws/ + drawing.excalidraw # Excalidraw files +``` + +**Journal naming**: Files in `journals/` follow the pattern `YYYY_MM_DD.md` by default (configurable in `config.edn` via `:journal/file-name-format`). + +**Page naming**: Files in `pages/` use the page title as filename. Namespaces use `/` in the page name which maps to `%2F` or nested directories depending on configuration. + +## Tags + +Tags in Logseq are **page references** — there is no separate tag system. 
+ +``` +#tag +#[[Multi Word Tag]] +``` + +- `#machine-learning` creates/links to a page called "machine-learning" +- `#[[Machine Learning]]` creates/links to a page called "Machine Learning" +- Tags can appear inline in any block or as property values +- All tags are queryable as page references + +## Special Files + +| Path | Purpose | Action | +|------|---------|--------| +| `logseq/config.edn` | Graph configuration | **Parse** — contains format settings, journal config | +| `logseq/custom.css` | Custom CSS | Ignore | +| `logseq/custom.js` | Custom scripts | Ignore | +| `logseq/pages-metadata.edn` | Page metadata cache | Can parse for supplemental metadata | +| `logseq/bak/` | Backup files | **Skip entirely** | +| `.recycle/` | Deleted pages | **Skip entirely** | +| `draws/` | Excalidraw drawings | Ignore or parse separately | + +## Parsing Instructions for LLM + +1. **Detect the graph**: Look for `logseq/config.edn`. If found, confirm Logseq format. + +2. **Read config**: Parse `logseq/config.edn` to determine: + - `:editor/preferred-format` — `"markdown"` or `"org"` (default markdown) + - `:journal/file-name-format` — journal filename pattern + - `:pages-directory` and `:journals-directory` — custom directory names if overridden + +3. **Enumerate notes**: Collect all `.md` (or `.org`) files from `journals/` and `pages/`. Skip `logseq/`, `.recycle/`, `draws/`. + +4. **Parse block structure**: Logseq files are outlines. Each line starting with `- ` is a block. Indentation (two spaces per level) denotes nesting: + ``` + - Parent block + - Child block + - Grandchild block + ``` + Every block implicitly has a UUID (stored in Logseq's internal database, not always visible in the file). + +5. **Extract page properties**: The first block's properties (lines matching `key:: value` before any sub-blocks) are page-level metadata. Parse these as key-value pairs. + +6. **Extract page references**: Find all `[[Page Name]]` patterns in block content. 
These are directed edges to other pages. + +7. **Extract block references**: Find all `((uuid))` patterns. These reference specific blocks by UUID. Note: resolving block UUIDs to their source page requires scanning all files for matching block IDs (look for `id:: uuid` properties on blocks). + +8. **Extract tags**: Find `#tag` and `#[[Multi Word Tag]]` patterns. Map each tag to a page reference (tags are pages). + +9. **Identify journals**: Files in `journals/` represent daily entries. Parse the date from the filename (`YYYY_MM_DD`). Journal pages often serve as entry points linking to topic pages. + +10. **Build the graph**: Create page nodes for every file and every referenced-but-nonexistent page (these are effectively stubs). Create edges for page references, block references, and tag references. Label journal nodes distinctly. diff --git a/understand-anything-plugin/skills/understand-knowledge/formats/obsidian.md b/understand-anything-plugin/skills/understand-knowledge/formats/obsidian.md new file mode 100644 index 0000000..552bb1d --- /dev/null +++ b/understand-anything-plugin/skills/understand-knowledge/formats/obsidian.md @@ -0,0 +1,186 @@ +# Obsidian Format Guide + +## Detection + +Identify an Obsidian vault by the presence of a `.obsidian/` directory at the vault root. This directory contains configuration files such as `app.json`, `appearance.json`, `core-plugins.json`, `community-plugins.json`, and `workspace.json`. All notes are stored as `.md` files. Canvas files (`.canvas`) may also be present. + +**Directory signatures:** +- `.obsidian/` directory exists +- `.obsidian/app.json` — core app settings +- `.obsidian/plugins/` — installed community plugins +- `.obsidian/themes/` — installed themes + +## Link Syntax + +Obsidian uses **wikilink** syntax by default (can be configured to use standard Markdown links). 
+ +### Basic Wikilinks +``` +[[Note Name]] +``` + +### Wikilink with Alias (Custom Display Text) +``` +[[Note Name|Display Text]] +``` +Renders as "Display Text" but links to "Note Name". + +### Link to Heading +``` +[[Note Name#Heading]] +[[Note Name#Heading|Display Text]] +``` +Multiple heading levels use multiple `#`: +``` +[[Note Name#Heading#Subheading]] +``` + +### Link to Block +``` +[[Note Name#^block-id]] +``` +The target block must have a `^block-id` anchor appended at the end of the line. + +### Embeds (Transclusion) +Prefix any link with `!` to embed the content inline: +``` +![[Note Name]] +![[Note Name#Heading]] +![[Note Name#^block-id]] +![[image.png]] +![[document.pdf]] +``` + +### Standard Markdown Links +Also supported (and used when wikilinks are disabled in settings): +``` +[Display Text](Note%20Name.md) +[Display Text](Note%20Name.md#heading) +``` + +## Metadata + +### YAML Frontmatter (Properties) +Metadata is stored in YAML frontmatter at the top of each note: +```yaml +--- +title: My Note +date: 2026-04-10 +tags: + - project + - research +aliases: + - alternative name + - another alias +cssclasses: + - wide-page +publish: true +--- +``` + +**Property types supported:** Text, List, Number, Checkbox, Date, Date & time. + +**Default properties with special meaning:** +- `tags` — note tags (array) +- `aliases` — alternative names for the note (used in link autocomplete) +- `cssclasses` — CSS classes applied to the note in reading mode +- `publish` — whether the note is published via Obsidian Publish + +### Dataview Inline Fields +The popular Dataview plugin introduces inline metadata with `key:: value` syntax: + +**Line-based (own line):** +``` +Rating:: 9 +Status:: In Progress +Author:: Jane Doe +``` + +**Bracketed (inline within text):** +``` +I would rate this a [rating:: 9] out of 10. +``` + +**Parenthesis (hidden key in reading mode):** +``` +This was published (year:: 2024) recently. 
+``` + +Note: Inline fields use `::` (double colon), while YAML frontmatter uses `:` (single colon). + +## Folder Structure + +Obsidian imposes no required folder structure. Users organize freely with directories and subdirectories. Folder hierarchy is purely organizational and user-defined. Some common conventions: +- `Templates/` — note templates +- `Attachments/` or `Assets/` — images and files +- `Daily Notes/` or `Journal/` — daily notes + +## Tags + +### Inline Tags +Prefixed with `#` in the note body: +``` +This is about #machine-learning and #python. +``` + +**Tag syntax rules:** +- Must contain at least one non-numeric character +- Can contain: letters, numbers, underscores `_`, hyphens `-`, forward slashes `/` +- Cannot contain spaces +- Case-insensitive for matching + +### Nested Tags +Use `/` to create tag hierarchies: +``` +#project/active +#project/archived +#reading/books/fiction +``` +Searching for `#project` also finds `#project/active` and `#project/archived`. + +### Frontmatter Tags +```yaml +--- +tags: + - machine-learning + - python +--- +``` +Frontmatter tags do NOT include the `#` prefix. + +## Special Files + +| Path | Purpose | Action | +|------|---------|--------| +| `.obsidian/` | Configuration directory | **Skip entirely** — not user content | +| `.obsidian/app.json` | App settings | Ignore | +| `.obsidian/workspace.json` | UI layout state | Ignore | +| `.obsidian/plugins/` | Community plugin configs | Ignore | +| `.trash/` | Obsidian's trash | **Skip entirely** | +| `*.canvas` | Canvas files (JSON) | Parse separately — contains nodes and edges for visual boards | +| Templates directory | Template files | **Identify but deprioritize** — they are scaffolding, not knowledge | + +## Parsing Instructions for LLM + +1. **Detect the vault**: Look for `.obsidian/` at the root. If found, confirm Obsidian format. + +2. **Enumerate notes**: Recursively find all `.md` files. 
Exclude `.obsidian/`, `.trash/`, and any configured attachment directories. + +3. **Parse frontmatter**: For each note, extract YAML between `---` delimiters at the file start. Capture `tags`, `aliases`, and all custom properties. + +4. **Extract wikilinks**: Scan note body for `[[...]]` patterns. Parse into components: + - `[[Target]]` — link to note + - `[[Target|Alias]]` — link with display text + - `[[Target#Heading]]` — link to heading + - `[[Target#^block-id]]` — link to block + - `![[Target]]` — embed (note the `!` prefix) + +5. **Extract inline tags**: Find all `#tag` patterns in the body (not inside code blocks). Include nested tags with `/` separators. + +6. **Extract Dataview fields**: If Dataview is likely in use (check `.obsidian/plugins/dataview/`), scan for `key:: value` patterns on their own lines and `[key:: value]` or `(key:: value)` inline. + +7. **Resolve links**: Note names in wikilinks match filenames without the `.md` extension. If multiple notes share the same name in different folders, Obsidian uses the shortest path. Aliases from frontmatter can also be link targets. + +8. **Build edges**: Each wikilink creates a directed edge from the source note to the target note. Embeds (`![[...]]`) should be typed as "embeds" rather than plain links. + +9. **Handle canvas files**: `.canvas` files are JSON with `nodes` (cards, notes, links, groups) and `edges` (connections). Parse these as separate visual graph structures. diff --git a/understand-anything-plugin/skills/understand-knowledge/formats/plain.md b/understand-anything-plugin/skills/understand-knowledge/formats/plain.md new file mode 100644 index 0000000..d6f3cc0 --- /dev/null +++ b/understand-anything-plugin/skills/understand-knowledge/formats/plain.md @@ -0,0 +1,195 @@ +# Plain Markdown Format Guide + +## Detection + +Plain Markdown is the **fallback format** when no specific knowledge management tool is detected. 
Use this guide when the directory does not match any other format's detection criteria. + +**Positive signals (generic Markdown collection):** +- `.md` files present without any tool-specific configuration directories +- No `.obsidian/`, `logseq/`, `dendron.yml`, `.vscode/foam.*`, or `wiki/index.md` +- Standard folder hierarchy used for organization +- Standard Markdown links (not wikilinks) + +**This format covers:** +- Personal notes directories +- Documentation folders +- GitHub wikis (without specific tool configuration) +- Any unstructured collection of Markdown files + +## Link Syntax + +Plain Markdown uses **standard Markdown links** only. No wikilink syntax. + +### Inline Links +``` +[Display Text](other-file.md) +[Display Text](subfolder/other-file.md) +[Display Text](../parent-folder/file.md) +``` + +### Links with Anchors +``` +[Section Name](file.md#heading-slug) +``` +Heading slugs are lowercase, spaces replaced with hyphens, special characters removed: +- `# My Heading Title` -> `#my-heading-title` +- `# API v2.0 (Beta)` -> `#api-v20-beta` + +### Reference-Style Links +``` +Read the [introduction][intro] first. + +[intro]: introduction.md "Introduction to the Project" +``` + +### Image Links +``` +![Alt text](images/diagram.png) +![Alt text](images/diagram.png "Title") +``` + +### Autolinks +``` + +``` + +### Relative Path Resolution +Links are relative to the current file's location: +- `[link](sibling.md)` — same directory +- `[link](sub/child.md)` — subdirectory +- `[link](../other.md)` — parent directory + +## Metadata + +### YAML Frontmatter (Optional) +Some plain Markdown files include frontmatter, but it is not required or standardized: + +```yaml +--- +title: My Document +author: Jane Doe +date: 2026-04-10 +tags: + - tutorial + - getting-started +--- +``` + +Without frontmatter, the document title is inferred from: +1. The first `# Heading` in the file +2. 
The filename (without extension) + +### No Inline Metadata Convention +Plain Markdown has no `key:: value` or property syntax. Any metadata must be in frontmatter or inferred from content. + +## Folder Structure + +In plain Markdown collections, **folder hierarchy serves as the primary organizational structure**. Directories represent categories or topics. + +``` +notes/ + README.md # Root overview + getting-started/ + installation.md + configuration.md + concepts/ + architecture.md + data-model.md + guides/ + deployment.md + troubleshooting.md + reference/ + api.md + cli.md +``` + +**Interpretation rules:** +- Each directory represents a category or topic area +- `README.md` or `index.md` in a directory is that category's overview +- Directory depth indicates topic specificity +- Sibling files within a directory are related by topic + +## Tags + +Plain Markdown has **no native tag syntax**. Tags may appear in: + +### Frontmatter Tags +```yaml +--- +tags: + - tutorial + - advanced +--- +``` + +### Informal Inline Tags +Some authors use hashtag conventions even without tool support: +``` +Topics: #architecture #microservices +``` +These have no standard behavior and should be treated as hints rather than reliable metadata. 
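
One way to collect informal hashtags while keeping them marked as hints is sketched below. The function name `extract_informal_tags`, the `reliable` flag, and the pattern itself are assumptions for illustration; the pattern skips Markdown headings, URL fragments, and fenced code blocks, but it is a heuristic rather than a standard.

```python
import re

# A '#' followed by a letter and not preceded by a word character, '/',
# or another '#', so "# Title" and "page.md#section" are not matched.
INFORMAL_TAG_RE = re.compile(r"(?<![\w/#])#([A-Za-z][\w\-/]*)")


def extract_informal_tags(body: str) -> list[dict]:
    """Return informal inline hashtags, flagged as unreliable hints."""
    tags = []
    in_fence = False
    for line in body.splitlines():
        # Toggle on fence markers so code samples are not scanned.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        for name in INFORMAL_TAG_RE.findall(line):
            tags.append({"tag": name.lower(), "source": "inline", "reliable": False})
    return tags
```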
+ +### LLM-Inferred Tags +When no explicit tags exist, the LLM should infer topic tags from: +- The folder path (e.g., `guides/deployment.md` suggests tags: "guide", "deployment") +- The document title and headings +- Key terms in the content + +## Special Files + +| Path | Purpose | Action | +|------|---------|--------| +| `README.md` | Directory overview / project root | **Parse** — often the entry point | +| `index.md` | Directory index | **Parse** — alternative to README | +| `CHANGELOG.md` | Version history | **Deprioritize** — not knowledge content | +| `LICENSE.md` | License text | **Skip** — not knowledge content | +| `CONTRIBUTING.md` | Contribution guidelines | **Deprioritize** — process, not knowledge | +| `_drafts/` | Draft documents | Include but flag as draft | +| `.github/` | GitHub configuration | **Skip** | +| `node_modules/`, `vendor/` | Dependencies | **Skip entirely** | +| `*.min.md` | Minified/generated | **Skip** | + +## Parsing Instructions for LLM + +1. **Confirm fallback**: Verify that no other format's detection criteria match. If unsure, default to this guide. + +2. **Enumerate files**: Recursively find all `.md` files. Exclude common non-content directories: `.git/`, `node_modules/`, `vendor/`, `.github/`, `_site/`, `build/`, `dist/`. + +3. **Infer structure from folders**: The directory tree is the primary organizational signal: + - Map each directory to a category/topic node + - Files within a directory are grouped under that category + - `README.md` / `index.md` files represent the category itself + +4. **Parse frontmatter**: If YAML frontmatter exists (between `---` delimiters), extract all fields. Common: `title`, `date`, `tags`, `author`, `description`. + +5. **Extract title**: In order of priority: + - `title` from frontmatter + - First `# Heading` in the file + - Filename without extension (convert hyphens/underscores to spaces, title-case) + +6. **Extract Markdown links**: Find all `[text](target)` patterns. 
Resolve relative paths against the file's directory to determine the target: + - Internal links: target is another `.md` file in the collection + - External links: target starts with `http://` or `https://` + - Anchor links: target starts with `#` (same-file heading) + - Image links: prefixed with `!` — `![alt](path)` + +7. **Extract reference-style links**: Find link definitions at the bottom of files: + ``` + [label]: url "title" + ``` + Match these to `[text][label]` references in the body. + +8. **Extract headings**: Parse the heading structure (`#`, `##`, `###`, etc.) to understand the document's internal organization. Use headings to generate summaries and identify subtopics. + +9. **Infer relationships**: Without explicit linking conventions, relationships must be inferred: + - **Explicit links**: Markdown links between files + - **Folder co-location**: Files in the same directory are related + - **Heading similarity**: Notes with similar headings may cover related topics + - **Content overlap**: The LLM should identify topical connections even without links + +10. **Build the graph**: + - **File nodes**: Each `.md` file is a node + - **Category nodes**: Each directory is a grouping node + - **Link edges**: Standard Markdown links between files + - **Hierarchy edges**: File -> parent directory category + - **Inferred edges**: LLM-identified topical relationships (labeled as inferred) diff --git a/understand-anything-plugin/skills/understand-knowledge/formats/zettelkasten.md b/understand-anything-plugin/skills/understand-knowledge/formats/zettelkasten.md new file mode 100644 index 0000000..d005cde --- /dev/null +++ b/understand-anything-plugin/skills/understand-knowledge/formats/zettelkasten.md @@ -0,0 +1,207 @@ +# Zettelkasten Format Guide + +## Detection + +Zettelkasten is a **method**, not a specific tool — implementations vary widely. 
Identify a Zettelkasten-style knowledge base by its characteristic patterns: atomic notes with unique ID prefixes, a flat or near-flat directory structure, and explicit semantic links between notes. + +**Detection heuristics (match 2+ for high confidence):** +- Note filenames begin with timestamp IDs (e.g., `202604091234 Note Title.md`) +- Notes live in a predominantly flat directory (few or no subdirectories) +- Notes are short and focused (typically 100-500 words) +- Notes contain explicit typed links with relationship context +- Frontmatter includes `id`, `type`, or `zettel-type` fields +- A structure note or index note exists linking to other notes by category + +**Common ID formats:** +- `YYYYMMDDHHmm` (timestamp): `202604091234` +- `YYYYMMDDHHmmss` (full timestamp): `20260409123456` +- Luhmann-style (hierarchical): `1`, `1a`, `1a1`, `1b` +- Incremental (sequential): `0001`, `0002`, `0003` + +## Link Syntax + +Zettelkasten notes use whichever link syntax their host tool supports. The method itself does not prescribe a syntax but emphasizes **context with every link**. + +### Common Patterns + +**Wikilinks (Obsidian, Foam):** +``` +[[202604091234 Note Title]] +[[202604091234 Note Title|short name]] +``` + +**Standard Markdown links:** +``` +[Note Title](202604091234-note-title.md) +``` + +**ID-only references:** +``` +[[202604091234]] +``` + +### Typed / Semantic Links +The defining feature of Zettelkasten linking is that **every link should explain why the connection exists**. Common patterns: + +**Inline context:** +``` +This builds on the concept of attention mechanisms ([[202604081100 Attention Mechanisms]]). 
+``` + +**Explicit relationship labels:** +``` +- supports:: [[202604091234 Scaling Laws]] +- contradicts:: [[202604081100 Diminishing Returns]] +- extends:: [[202604071500 Original Transformer]] +- example-of:: [[202604061200 Neural Architecture]] +``` + +**Prose context (preferred by purists):** +``` +The findings in [[202604091234]] directly contradict the earlier claims +about diminishing returns described in [[202604081100]], specifically +regarding the relationship between model size and downstream performance. +``` + +### Structure Notes (Hub Notes) +Special notes that serve as entry points or tables of contents: +```markdown +# Machine Learning Concepts + +## Foundations +- [[202604011200 Linear Algebra Basics]] — prerequisite math +- [[202604021300 Gradient Descent]] — core optimization method + +## Architectures +- [[202604031400 Convolutional Networks]] — spatial feature extraction +- [[202604041500 Transformers]] — attention-based architecture + +## Training +- [[202604051600 Backpropagation]] — how networks learn +- [[202604061700 Regularization]] — preventing overfitting +``` + +## Metadata + +### YAML Frontmatter +Zettelkasten notes commonly use frontmatter for metadata: + +```yaml +--- +id: "202604091234" +title: Attention Mechanisms in Transformers +type: permanent # or: fleeting, literature, permanent, structure +created: 2026-04-09T12:34:00 +modified: 2026-04-10T08:15:00 +source: "Vaswani et al. 
2017" +tags: + - machine-learning + - attention + - transformers +status: mature # or: seedling, growing, mature +--- +``` + +**Common Zettelkasten-specific fields:** +- `id` — unique identifier (often the timestamp) +- `type` — note type following Zettelkasten conventions: + - `fleeting` — quick captures, to be processed + - `literature` — notes on a specific source + - `permanent` — fully developed atomic ideas + - `structure` — hub/index notes organizing other notes +- `source` — bibliographic reference for literature notes +- `status` — maturity level of the note + +## Folder Structure + +Pure Zettelkasten uses a **flat structure** — all notes in a single directory. Hierarchy is expressed through links, not folders. + +``` +zettelkasten/ + 202604011200 Linear Algebra Basics.md + 202604021300 Gradient Descent.md + 202604031400 Convolutional Networks.md + 202604041500 Transformers.md + 202604051600 Backpropagation.md + 202604091234 Attention Mechanisms.md + index.md # Optional master structure note + assets/ # Optional attachments + diagram-1.png +``` + +**Variations:** +- Some implementations use minimal subdirectories: `fleeting/`, `literature/`, `permanent/` +- Some use `inbox/` for unprocessed notes +- The key principle: structure lives in links, not folders + +## Tags + +Tags supplement but do not replace links. They are used for broad categorization: + +### Frontmatter Tags +```yaml +tags: + - machine-learning + - attention +``` + +### Inline Tags +``` +#machine-learning #attention +``` + +**Zettelkasten tagging principles:** +- Tags should answer: "In what context do I want to find this note?" 
+- Prefer fewer, broader tags over many specific ones +- Tags are for retrieval, links are for connection +- Avoid tags that duplicate what links already express + +## Special Files + +| Path | Purpose | Action | +|------|---------|--------| +| Index / Structure notes | Hub notes organizing topic areas | **Parse** — these define the high-level knowledge architecture | +| Fleeting notes | Quick captures, unprocessed | **Include but flag** — low maturity | +| Literature notes | Source-specific notes | **Parse** — link to sources | +| Permanent notes | Fully developed ideas | **Parse** — core knowledge nodes | +| `inbox/` or `fleeting/` | Unprocessed captures | Include but deprioritize | + +## Parsing Instructions for LLM + +1. **Detect Zettelkasten**: Look for the characteristic pattern: files with timestamp/ID prefixes, flat directory structure, short focused notes. Check frontmatter for `type` or `zettel-type` fields. + +2. **Identify the ID scheme**: Examine filenames to determine the ID format: + - Timestamp prefix: `202604091234 Title.md` or `202604091234-title.md` + - Luhmann IDs: `1a2b Title.md` + - Numeric: `0042 Title.md` + Extract the ID as the note's stable identifier. + +3. **Classify note types**: From frontmatter `type` field or directory location: + - **Fleeting** — raw captures, lowest priority + - **Literature** — notes about specific sources + - **Permanent** — atomic ideas, highest value + - **Structure** — hub/index notes organizing others + +4. **Parse frontmatter**: Extract `id`, `title`, `type`, `source`, `tags`, `status`, and any custom fields. + +5. **Extract links**: Parse both wikilinks `[[...]]` and standard Markdown links `[text](path)`. For each link: + - Resolve the target note (by ID, filename, or path) + - Capture surrounding context (the sentence or paragraph containing the link) — this context often describes the semantic relationship + +6. 
**Extract typed relationships**: Look for explicit relationship patterns: + - `relationship:: [[target]]` in properties + - Labels before links: "supports:", "contradicts:", "extends:" + - Contextual prose around links + +7. **Identify structure notes**: Notes that primarily consist of organized lists of links to other notes are structure notes. These define the knowledge architecture and should be parsed as category/grouping nodes. + +8. **Assess atomicity**: True Zettelkasten notes express one idea each. Notes that are unusually long or cover multiple topics may indicate deviation from the method. Flag these for the user. + +9. **Build the graph**: + - **Primary nodes**: Each note is a node, typed by its Zettelkasten category + - **Semantic edges**: Links between notes, labeled with relationship type when available + - **Structure edges**: Links from structure notes to their organized notes (hierarchical) + - **Source edges**: Literature notes linking to their source references + - **Tag edges**: Shared tags create implicit connections + +10. **Prioritize by maturity**: If `status` metadata is present, use it to indicate note maturity in the graph. Permanent/mature notes are the core knowledge; fleeting/seedling notes are peripheral.
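
Steps 2, 5, and 6 above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the names `split_note_id` and `extract_links`, the regular expressions, and the returned dictionary shape are all hypothetical. The ID patterns follow the formats listed under "Common ID formats"; a bare Luhmann ID such as `1` is indistinguishable from a short numeric ID and is reported as `numeric` here.

```python
import re
from typing import Optional, Tuple

# Assumed ID shapes. Each ID must be followed by a separator or the end of the name.
ID_PATTERNS = [
    ("timestamp", re.compile(r"^(\d{14}|\d{12})(?=[ \-_.]|$)")),
    ("numeric", re.compile(r"^(\d{1,6})(?=[ \-_.]|$)")),
    ("luhmann", re.compile(r"^(\d+(?:[a-z]+\d+)*[a-z]*)(?=[ \-_.]|$)")),
]

# [[target]] or [[target|alias]]
WIKILINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")
# Optional list marker, then a "label::" prefix such as "supports::"
TYPED_RE = re.compile(r"^\s*(?:[-*]\s+)?([A-Za-z][\w-]*)::")


def split_note_id(filename: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (scheme, note_id) for a filename, or (None, None) if no ID prefix."""
    stem = filename[:-3] if filename.endswith(".md") else filename
    for scheme, pattern in ID_PATTERNS:
        match = pattern.match(stem)
        if match:
            return scheme, match.group(1)
    return None, None


def extract_links(text: str) -> list:
    """Collect wikilinks line by line, keeping the line as relationship context."""
    links = []
    for line in text.splitlines():
        typed = TYPED_RE.match(line)
        relation = typed.group(1) if typed else None
        for match in WIKILINK_RE.finditer(line):
            links.append({
                "target": match.group(1).strip(),
                "relation": relation,      # None when only prose context exists
                "context": line.strip(),   # left for the LLM to interpret
            })
    return links
```

Calling `split_note_id` on each link target gives a stable identifier for resolving edges, while the `context` field preserves the surrounding sentence for links whose relationship is expressed only in prose.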