Commit 89d4d37

colbymchenry and claude authored
feat(npm): restore programmatic/embedded SDK API (#354) (#603)
The 0.9.x thin-installer turned @colbymchenry/codegraph into a bin-only shim: require("@colbymchenry/codegraph") threw MODULE_NOT_FOUND and no types shipped, breaking embedded library consumers (e.g. Electron apps) upgrading from 0.8.0.

Restore programmatic use without re-bloating the thin shim or duplicating the ~49 MB of grammars the per-platform bundle already carries:

- main -> npm-sdk.js re-exports the installed per-platform bundle's compiled library (lib/dist/index.js) at runtime, reusing that bundle's own deps; it falls back to a self-healed cache bundle, else throws an actionable error.
- types -> ship the .d.ts tree only (~590 KB) in the main package, built from the same release so it can never skew from the runtime it re-exports.
- exports map resolves the `types` condition (nodenext) and the default entry.
- DatabaseConnection + QueryBuilder are now top-level exports, so embedded callers get the building blocks from the package entry instead of deep dist/ imports (which the shim no longer ships).

The CLI/MCP `bin` keeps execing the bundled Node; only library consumers run on their own runtime, which must be Node 22.5+ for the built-in node:sqlite.

Validated end-to-end: built a real darwin-arm64 bundle, packed the npm packages, installed them into a throwaway consumer, and confirmed require() plus a full init/indexAll/searchNodes round-trip and the low-level DatabaseConnection/QueryBuilder path all work on the host Node; types resolve under both nodenext and classic node resolution; and the CLI shim still launches.

New hermetic tests cover npm-sdk resolution, cache fallback, and the missing-bundle error.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 3a1ddf4 commit 89d4d37

6 files changed

Lines changed: 235 additions & 2 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - Swift deferred-validation flows (and similar "handler array" patterns) now connect end-to-end in `codegraph_trace` and `codegraph_explore` — following a request's lifecycle reaches the validators registered with `.validate { … }` instead of dead-ending where the framework runs them by iterating a stored list of closures. Any pattern where closures are appended to a collection and later invoked by looping over it is now traced.
 - `codegraph_explore` now spells out the dynamic-dispatch relationships of the symbols you ask about — e.g. "the closures registered here are run by `didCompleteTask`" — so the indirect hops you'd otherwise grep to reconstruct are listed alongside the call flow.
 - `codegraph_explore` answers multi-phase questions that span a large "god file" far more completely. For a flow like "build, send, and validate a request" — where one big file holds the build chain and the validate logic lives in others — it now keeps every method *on the flow path* in full, collapses the file's off-path methods to one-line signatures, and guarantees each phase's defining file is shown (instead of truncating at a fixed size and dropping whichever phase came last, which sent you to read it by hand). Incidental files that merely name-drop the flow are still trimmed, so the response stays focused on the code that answers the question.
+- CodeGraph is usable as an embedded library again: `require("@colbymchenry/codegraph")` and `import` now resolve the programmatic API — the `CodeGraph` class plus building blocks like `DatabaseConnection`, `QueryBuilder`, `initGrammars`, and `FileLock` — so you can drive the graph directly from your own app (for example an Electron process) instead of only through the CLI or MCP server. Embedding runs on your own runtime, so it needs Node 22.5+ for the built-in SQLite. (#354)

 ### Fixes

README.md

Lines changed: 21 additions & 0 deletions
@@ -500,8 +500,14 @@ When running as an MCP server, CodeGraph exposes these tools to Claude Code:

 ## Library Usage

+CodeGraph can be embedded directly. The npm package re-exports its programmatic
+API, so both `import` and `require` resolve the `CodeGraph` class in your own
+process — handy for embedding it in an app (e.g. an Electron main process).
+
 ```typescript
 import CodeGraph from '@colbymchenry/codegraph';
+// CommonJS works too:
+// const { CodeGraph } = require('@colbymchenry/codegraph');

 const cg = await CodeGraph.init('/path/to/project');
 // Or: const cg = await CodeGraph.open('/path/to/project');
@@ -520,6 +526,21 @@ cg.unwatch(); // stop watching
 cg.close();
 ```

+Lower-level building blocks are exported from the same entry point for callers
+that drive the graph directly: `DatabaseConnection`, `QueryBuilder`,
+`getDatabasePath`, `initGrammars` / `loadGrammarsForLanguages`, and `FileLock`.
+
+**Embedding requirements**
+
+- Install from npm (`npm i @colbymchenry/codegraph`) so the matching
+  per-platform package — which carries the compiled library and its
+  dependencies — is fetched alongside the shim.
+- The API runs on **your** runtime, so it needs **Node 22.5+** for the built-in
+  `node:sqlite` (Electron qualifies when its bundled Node is 22.5+). The CLI and
+  MCP server are unaffected — they run on the self-contained bundled runtime.
+- TypeScript types ship with the package. As with any Node-targeting library,
+  keep `@types/node` available and `skipLibCheck: true` (the common default).
+
 ---

 ## Configuration
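
The Node 22.5+ requirement in the README's embedding notes is something a host application can check up front, before it tries to open a graph. The following is a minimal sketch of such a check, not part of the CodeGraph API: `supportsNodeSqlite` and `assertEmbeddable` are hypothetical helper names, and the version string is passed in as a parameter so the logic stays a pure function.

```typescript
// Hypothetical helper (not a CodeGraph export): reports whether a Node version
// string is new enough for the built-in node:sqlite, i.e. 22.5 or later.
function supportsNodeSqlite(nodeVersion: string): boolean {
  // Accept "22.5.0" as well as "v22.5.0"; anything after major.minor is ignored.
  const match = /^v?(\d+)\.(\d+)/.exec(nodeVersion);
  if (!match) return false;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major > 22 || (major === 22 && minor >= 5);
}

// Hypothetical guard an embedding host could call before opening a graph.
function assertEmbeddable(nodeVersion: string): void {
  if (!supportsNodeSqlite(nodeVersion)) {
    throw new Error(
      `Embedding CodeGraph needs Node 22.5+ for node:sqlite; this runtime reports ${nodeVersion}.`
    );
  }
}
```

In a real host the argument would typically be `process.versions.node`, which in Electron reports the Node version bundled with that Electron release.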

__tests__/npm-sdk.test.ts

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
+/**
+ * Programmatic/embedded SDK entry (`scripts/npm-sdk.js`) tests (issue #354).
+ *
+ * The published main package is a thin shim: the CLI `bin` (npm-shim.js) execs
+ * the bundled Node, while `main` (npm-sdk.js) lets embedded consumers
+ * `require("@colbymchenry/codegraph")` on their OWN Node by re-exporting the
+ * compiled library that ships inside the per-platform optionalDependency
+ * (@colbymchenry/codegraph-<target>/lib/dist/index.js).
+ *
+ * These tests stand up a temp main-package dir with a fake platform package as a
+ * resolvable sibling, then require the SDK in a child process — so resolution,
+ * the self-heal cache fallback, and the missing-bundle error are exercised
+ * hermetically with no real bundle, network, or registry.
+ */
+
+import { describe, it, expect } from 'vitest';
+import { spawnSync } from 'child_process';
+import * as fs from 'fs';
+import * as os from 'os';
+import * as path from 'path';
+
+const SDK_SRC = path.join(__dirname, '..', 'scripts', 'npm-sdk.js');
+const target = `${process.platform}-${process.arch}`;
+const VERSION = '9.9.9-test';
+
+function mkTmp(label: string): string {
+  return fs.mkdtempSync(path.join(os.tmpdir(), `cg-sdk-${label}-`));
+}
+
+// A temp node_modules with the main package (npm-sdk.js + package.json). The
+// fake platform package, when present, is written as a resolvable sibling so the
+// SDK's `require.resolve('@colbymchenry/codegraph-<target>/...')` walks to it.
+function makeConsumer(): { root: string; mainPkg: string } {
+  const root = mkTmp('consumer');
+  const mainPkg = path.join(root, 'node_modules', '@colbymchenry', 'codegraph');
+  fs.mkdirSync(mainPkg, { recursive: true });
+  fs.copyFileSync(SDK_SRC, path.join(mainPkg, 'npm-sdk.js'));
+  fs.writeFileSync(
+    path.join(mainPkg, 'package.json'),
+    JSON.stringify({ name: '@colbymchenry/codegraph', version: VERSION, main: 'npm-sdk.js' }) + '\n'
+  );
+  return { root, mainPkg };
+}
+
+// Write a fake compiled library that exports a sentinel, at the given lib/dist
+// root (used both for the platform package and the self-heal cache bundle).
+function writeFakeLib(libDistDir: string, sentinel: string): void {
+  fs.mkdirSync(libDistDir, { recursive: true });
+  fs.writeFileSync(
+    path.join(libDistDir, 'index.js'),
+    `module.exports = { SENTINEL: ${JSON.stringify(sentinel)}, CodeGraph: function CodeGraph() {} };\n`
+  );
+}
+
+function installPlatformPackage(root: string, sentinel: string): void {
+  const pkgRoot = path.join(root, 'node_modules', '@colbymchenry', `codegraph-${target}`);
+  writeFakeLib(path.join(pkgRoot, 'lib', 'dist'), sentinel);
+  fs.writeFileSync(
+    path.join(pkgRoot, 'package.json'),
+    JSON.stringify({ name: `@colbymchenry/codegraph-${target}`, version: VERSION }) + '\n'
+  );
+}
+
+// require() the SDK in a child process so each case gets a fresh module cache.
+function requireSdk(mainPkg: string, env: Record<string, string> = {}) {
+  const code =
+    `try { const m = require(${JSON.stringify(path.join(mainPkg, 'npm-sdk.js'))});` +
+    ` process.stdout.write(JSON.stringify({ sentinel: m.SENTINEL, cg: typeof m.CodeGraph })); }` +
+    ` catch (e) { process.stderr.write(String(e && e.message || e)); process.exit(7); }`;
+  const r = spawnSync(process.execPath, ['-e', code], {
+    encoding: 'utf8',
+    env: { ...process.env, ...env },
+  });
+  return { status: r.status, stdout: r.stdout, stderr: r.stderr };
+}
+
+describe('npm-sdk programmatic entry', () => {
+  it('re-exports the installed platform bundle library', () => {
+    const { root, mainPkg } = makeConsumer();
+    installPlatformPackage(root, 'platform-lib');
+    // Isolate from any real self-healed cache on this machine.
+    const r = requireSdk(mainPkg, { CODEGRAPH_INSTALL_DIR: path.join(root, '.empty-cache') });
+    expect(r.status).toBe(0);
+    expect(JSON.parse(r.stdout)).toEqual({ sentinel: 'platform-lib', cg: 'function' });
+  });
+
+  it('falls back to a self-healed cache bundle when the optional dep is absent', () => {
+    const { root, mainPkg } = makeConsumer(); // no platform package installed
+    const cacheDir = path.join(root, 'cache');
+    writeFakeLib(
+      path.join(cacheDir, 'bundles', `${target}-${VERSION}`, 'lib', 'dist'),
+      'cache-lib'
+    );
+    const r = requireSdk(mainPkg, { CODEGRAPH_INSTALL_DIR: cacheDir });
+    expect(r.status).toBe(0);
+    expect(JSON.parse(r.stdout)).toEqual({ sentinel: 'cache-lib', cg: 'function' });
+  });
+
+  it('throws an actionable error when no bundle is installed or cached', () => {
+    const { root, mainPkg } = makeConsumer(); // no platform package, empty cache
+    const r = requireSdk(mainPkg, { CODEGRAPH_INSTALL_DIR: path.join(root, '.empty-cache') });
+    expect(r.status).toBe(7);
+    expect(r.stderr).toContain(`@colbymchenry/codegraph-${target}`);
+    expect(r.stderr).toContain('not installed');
+    expect(r.stderr).toContain('registry.npmjs.org');
+  });
+});
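
The cache-fallback test above writes its fake library to `bundles/<target>-<version>/lib/dist/index.js` under `CODEGRAPH_INSTALL_DIR`. As a rough illustration of that layout, here is a pure sketch of how the path is assembled. `CacheLookup` and `cachedLibraryPath` are hypothetical names, and the sketch joins with `/` for simplicity where the real script uses `path.join`, so treat it as a POSIX-only approximation.

```typescript
// Hypothetical inputs for the sketch; none of these names come from the package.
interface CacheLookup {
  installDir?: string; // value of CODEGRAPH_INSTALL_DIR, if set
  homeDir: string;     // used only when installDir is unset or empty
  platform: string;    // e.g. "darwin"
  arch: string;        // e.g. "arm64"
  version: string;     // version of the main package
}

// Builds the cache path the SDK entry would probe. It only computes the string;
// checking that the file exists is left to the caller.
function cachedLibraryPath(lookup: CacheLookup): string {
  const base = lookup.installDir || `${lookup.homeDir}/.codegraph`;
  const target = `${lookup.platform}-${lookup.arch}`;
  return [base, "bundles", `${target}-${lookup.version}`, "lib", "dist", "index.js"].join("/");
}
```

Because the version is part of the directory name, a cached bundle from a different release is never picked up by a newer or older main package.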

scripts/npm-sdk.js

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+'use strict';
+//
+// Programmatic / embedded SDK entry for @colbymchenry/codegraph (issue #354).
+//
+// The CLI/MCP `bin` (npm-shim.js) execs the per-platform bundle's OWN Node 24 so
+// the tool never depends on the user's runtime. Embedded library consumers are
+// the opposite case: they already run their own Node and just want the compiled
+// API — `require("@colbymchenry/codegraph")` returning the CodeGraph class et al.
+//
+// The compiled library + its production dependencies (web-tree-sitter,
+// tree-sitter-wasms, …) ship INSIDE the per-platform bundle, at
+//   @colbymchenry/codegraph-<platform>-<arch>/lib/dist/index.js
+// (with the deps in the sibling lib/node_modules). Re-exporting that bundle keeps
+// the main package thin — no second 50 MB copy of the grammars — while making the
+// SDK work in the consumer's process. Types are a separate concern: the main
+// package ships its own dist/**/*.d.ts tree (pointed at by `types`), built from
+// the same release so it can never skew from the runtime it re-exports.
+//
+// node:sqlite (Node >= 22.5) is required to OPEN a graph, but only lazily inside
+// the SQLite adapter — so loading this module is safe on older Node, and the
+// node:sqlite requirement surfaces with an actionable error only when a DB is
+// actually opened. Heavy extraction additionally wants the bundled launcher's
+// --liftoff-only flag (the WASM Zone-OOM guard, issues #293/#298); an embedded
+// host that drives large indexing should pass that flag to its own Node.
+
+var path = require('path');
+var os = require('os');
+var fs = require('fs');
+
+var target = process.platform + '-' + process.arch; // e.g. darwin-arm64, linux-x64
+var pkg = '@colbymchenry/codegraph-' + target;
+
+module.exports = require(resolveLibrary());
+
+// Locate the compiled library entry inside the installed per-platform bundle.
+// Throws an actionable error (rather than a bare MODULE_NOT_FOUND) when no bundle
+// is present, so an embedded consumer knows exactly what to install.
+function resolveLibrary() {
+  // 1) The npm-installed optional dependency — the normal case.
+  try {
+    return require.resolve(pkg + '/lib/dist/index.js');
+  } catch (e) {
+    /* fall through to the self-healed cache */
+  }
+
+  // 2) A bundle the CLI shim self-healed from GitHub Releases into the cache
+  //    (issue #303). Same node/lib/bin layout as the npm package. We only REUSE a
+  //    cached bundle here — unlike the CLI shim we never trigger a network
+  //    download from inside require(), which must stay synchronous and cheap.
+  var cached = cachedLibrary();
+  if (cached) return cached;
+
+  throw new Error(
+    'codegraph: the programmatic API is unavailable because the platform bundle\n' +
+    '(' + pkg + ') is not installed.\n' +
+    'The compiled library ships inside that per-platform optional dependency.\n' +
+    'Fixes:\n' +
+    ' - install from the official npm registry so the matching bundle is fetched:\n' +
+    ' npm i @colbymchenry/codegraph --registry=https://registry.npmjs.org\n' +
+    ' - or run the CLI once (e.g. `npx @colbymchenry/codegraph status`) to\n' +
+    ' self-heal the bundle into ~/.codegraph, then require() will find it.'
+  );
+}
+
+function cachedLibrary() {
+  try {
+    var version = require(path.join(__dirname, 'package.json')).version;
+    var base = process.env.CODEGRAPH_INSTALL_DIR || path.join(os.homedir(), '.codegraph');
+    var lib = path.join(base, 'bundles', target + '-' + version, 'lib', 'dist', 'index.js');
+    if (fs.existsSync(lib)) return lib;
+  } catch (e) {
+    /* no readable cache → caller reports the install guidance */
+  }
+  return null;
+}
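
The three outcomes of `resolveLibrary()` above (installed platform package, cached bundle, actionable error) can be modelled without touching the file system. The sketch below is only an illustration of that ordering: `resolveLibraryModel` and the `Lookup` type are hypothetical, and each lookup returns `null` on a miss, whereas the real script relies on `require.resolve` throwing.

```typescript
// A lookup returns an absolute path on a hit and null on a miss (an assumption
// made for this sketch; the real code catches require.resolve errors instead).
type Lookup = () => string | null;

// Hypothetical model of the fallback order: platform package first, then the
// self-healed cache, then an error that names the missing package.
function resolveLibraryModel(pkg: string, fromPlatformPackage: Lookup, fromCache: Lookup): string {
  const installed = fromPlatformPackage();
  if (installed) return installed;

  // The cache is only consulted once the installed package has missed.
  const cached = fromCache();
  if (cached) return cached;

  throw new Error(`${pkg} is not installed and no cached bundle was found`);
}
```

Keeping the lookups injectable mirrors how the tests in this commit isolate each case, with the difference that those tests exercise the real script in a child process.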

scripts/pack-npm.sh

Lines changed: 25 additions & 1 deletion
@@ -72,8 +72,26 @@ for archive in "${archives[@]}"; do
 done

 # Main shim package.
+#   npm-shim.js  CLI/MCP launcher (execs the bundled Node) — the `bin`.
+#   npm-sdk.js   programmatic/embedded entry (#354): re-exports the installed
+#                platform bundle's compiled library — the `main`.
+#   dist/        the .d.ts tree only (types). The runtime .js stays in the
+#                per-platform bundle so its deps aren't duplicated here.
 cp "$ROOT/scripts/npm-shim.js" "$NPM/main/npm-shim.js"
+cp "$ROOT/scripts/npm-sdk.js" "$NPM/main/npm-sdk.js"
 [ -f "$ROOT/README.md" ] && cp "$ROOT/README.md" "$NPM/main/README.md"
+
+# Ship the type declarations so `types`/`exports.types` resolve. Built from this
+# same release, so they can't skew from the runtime npm-sdk.js re-exports.
+[ -f "$ROOT/dist/index.d.ts" ] || ( echo "[pack-npm] building dist for .d.ts" >&2 && cd "$ROOT" && npm run build >/dev/null )
+ROOT="$ROOT" DEST="$NPM/main" node -e '
+  const fs=require("fs"), path=require("path");
+  const src=path.join(process.env.ROOT,"dist"), dest=path.join(process.env.DEST,"dist");
+  fs.cpSync(src, dest, { recursive:true, filter(s){
+    try { return fs.statSync(s).isDirectory() || s.endsWith(".d.ts"); } catch (e) { return false; }
+  }});
+'
+
 VERSION="$VERSION" SCOPE="$SCOPE" TARGETS="${targets[*]}" \
   node -e '
     const fs=require("fs");
@@ -85,8 +103,14 @@ VERSION="$VERSION" SCOPE="$SCOPE" TARGETS="${targets[*]}" \
       version: process.env.VERSION,
       description: "Local-first code intelligence for AI agents (MCP). Self-contained — bundles its own runtime.",
       bin: { codegraph: "npm-shim.js" },
+      main: "npm-sdk.js",
+      types: "dist/index.d.ts",
+      exports: {
+        ".": { types: "./dist/index.d.ts", default: "./npm-sdk.js" },
+        "./package.json": "./package.json"
+      },
       optionalDependencies: opt,
-      files: ["npm-shim.js","README.md"],
+      files: ["npm-shim.js","npm-sdk.js","dist","README.md"],
       license: "MIT"
     }, null, 2) + "\n");
 ' "$NPM/main/package.json"
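
The `exports` entry added above lists `types` before `default`. Conditional exports are matched in key order and the first matching condition wins, so that ordering is what lets a TypeScript resolver reach the `.d.ts` file while a runtime falls through to `npm-sdk.js`. The following is a simplified sketch of that matching rule under those assumptions. `resolveConditions` is a hypothetical name, and real resolvers also handle nesting, subpath patterns, and more conditions.

```typescript
// One conditional entry of an exports map, e.g. the value under ".".
type ConditionalEntry = { [condition: string]: string };

// Simplified model: walk the keys in declaration order and return the first
// target whose condition is active. "default" always matches.
function resolveConditions(entry: ConditionalEntry, active: readonly string[]): string | undefined {
  for (const condition of Object.keys(entry)) {
    if (condition === "default" || active.indexOf(condition) !== -1) {
      return entry[condition];
    }
  }
  return undefined;
}

// The "." entry from the generated package.json in this commit.
const dotEntry: ConditionalEntry = { types: "./dist/index.d.ts", default: "./npm-sdk.js" };

// A type checker activates "types"; a plain runtime does not.
const forTypeChecker = resolveConditions(dotEntry, ["types", "import"]);
const forRuntime = resolveConditions(dotEntry, ["node", "require"]);
```

If `default` were listed first, it would match before `types` was ever considered, which is why the order in the generated `package.json` matters.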

src/index.ts

Lines changed: 6 additions & 1 deletion
@@ -50,7 +50,12 @@ import { FileWatcher, WatchOptions, PendingFile, LockUnavailableError } from './

 // Re-export types for consumers
 export * from './types';
-export { getDatabasePath } from './db';
+// Storage building blocks for embedded/SDK consumers that drive the graph
+// directly (open a DB, run prepared queries) rather than through the CodeGraph
+// facade. Exposed from the package entry so they no longer require deep imports
+// into dist/ (issue #354).
+export { getDatabasePath, DatabaseConnection } from './db';
+export { QueryBuilder } from './db/queries';
 export {
   getCodeGraphDir,
   isInitialized,
