ContextCloak — Project Plan

A Typst package extending Promptyst with Gordian Envelope cryptographic primitives.

Granular privacy, selective disclosure, and verifiable elision — scoped down to the character level.


Problem

Promptyst produces deterministic, structured AI prompts as Markdown. Every field is fully visible to every consumer. There is no mechanism to:

  • Redact sensitive portions (API keys, internal identifiers, user PII) while keeping the prompt structurally valid and verifiable.
  • Selectively disclose context to different agents or pipelines — one agent sees the full prompt, another sees only the role + schema.
  • Prove that a redacted prompt was derived from a signed original without revealing the redacted content.

These are exactly the problems Blockchain Commons' Gordian Envelope specification solves at the data-structure level.


What Is a Gordian Envelope?

A smart document format from Blockchain Commons with the following properties:

| Property | Description |
|----------|-------------|
| Merkle-like digest tree | Every element (subject, predicate, object, assertion) is individually hashed with SHA-256. The tree of digests is the envelope's identity. |
| Deterministic CBOR (dCBOR) | One semantic meaning → one exact byte encoding. Eliminates ambiguity for hashing and verification. |
| Holder-initiated elision | The holder, not just the issuer, can selectively remove any element. The digest remains, so signatures stay valid. |
| Progressive trust | Reveal more data over time as trust increases. Start with a skeleton; fill in specifics per audience. |
| Cryptographic agnosticism | Core is SHA-256 + dCBOR. Extensions support symmetric encryption, digital signatures, SSKR (Sharded Secret Key Reconstruction), and more. |

The key innovation: elision granularity is limited only by how you structure the envelope. If each character is its own assertion, you can elide down to a single character.


ContextCloak's Role in the Stack

promptyst (core DSL)             ← 5 primitives, deterministic Markdown
  → ContextCloak (this package)  ← Gordian Envelope adapter layer
    → runtime / pipeline         ← external consumers

ContextCloak sits as a layer on top of Promptyst, following Promptyst's Boundary.md contract:

  • Imports only Promptyst's public API (10 core symbols + adapters).
  • Never imports src/*.typ directly.
  • Defines its own dictionary types with its own _type namespace.
  • Does not mutate, shadow, or extend Promptyst's core dicts.

Core Concepts

1. Envelope Wrapping

Every Promptyst primitive (context, schema, checkpoint, prompt) can be wrapped in an envelope. Wrapping adds:

  • A digest (hash) for the element.
  • An assertion set — metadata like authorship, timestamps, provenance.
  • A salt (optional) that ensures distinct digests for identical content.

#let sealed = cc-wrap(my-prompt)
// sealed.digest = "sha256:ab3f..."
// sealed.assertions = (...)
// sealed.payload = my-prompt

2. Elision (the core feature)

Given a wrapped prompt, elide any field by replacing it with its digest:

#let redacted = cc-elide(sealed, fields: ("ctx", "constraints"))
// redacted still has the same top-level digest
// redacted.ctx = ELIDED("sha256:7d2c...")
// redacted.constraints = ELIDED("sha256:e1f0...")
// redacted.role, .steps, .schema = unchanged
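
One way cc-elide could work is sketched below. The envelope layout (a payload dict next to a field-digests dict), the cc-elided marker type, and the name cc-elide-sketch are assumptions made for illustration, not the package's settled design. The digest strings are placeholders rather than real hashes.

```typst
// Hypothetical envelope layout: the payload plus one stored digest per field.
// The digest strings are placeholders, not real SHA-256 values.
#let sealed = (
  _type: "cc-envelope",
  digest: "sha256:ROOT",
  field-digests: (role: "sha256:ROLE", ctx: "sha256:CTX", constraints: "sha256:CONSTRAINTS"),
  payload: (role: "You are a deployment orchestrator.", ctx: "cluster: prod", constraints: "no downtime"),
)

// Swap each named field for a marker that carries only the stored digest.
#let cc-elide-sketch(envelope, fields: ()) = {
  let payload = envelope.payload
  for name in fields {
    assert(name in payload, message: "unknown field: " + name)
    payload.insert(name, (_type: "cc-elided", digest: envelope.field-digests.at(name)))
  }
  // The root digest is carried over untouched: it is derived from the field
  // digests, which elision does not change.
  envelope + (payload: payload)
}

#let redacted = cc-elide-sketch(sealed, fields: ("ctx", "constraints"))
#assert.eq(redacted.digest, sealed.digest)
#assert.eq(redacted.payload.ctx, (_type: "cc-elided", digest: "sha256:CTX"))
#assert.eq(redacted.payload.role, sealed.payload.role)
```

Failing loudly on an unknown field name is a deliberate choice in this sketch: a typo in a redaction list should stop the build rather than silently leave a field visible.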

Character-level elision: For maximum granularity, ContextCloak can decompose a string value into an envelope tree where each character (or substring) is an individually addressable assertion. This enables elision scoped down to a single character:

#let fine = cc-granular-wrap(my-prompt.role, grain: "char")
// Each character of the role string becomes its own envelope node
// Elide any subset: cc-elide(fine, indices: (0, 5, 12))
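
Index-based elision on a granular tree might look like the sketch below. It uses word grain for readability, but the same code applies to character grain. The tree literal, the placeholder digests, and the helper names elide-indices and render-granular are hypothetical.

```typst
// Hypothetical granular tree; d0..d2 stand in for real leaf digests.
#let fine = (
  _type: "cc-granular",
  grain: "word",
  digest: "sha256:ROOT",
  nodes: (
    (_type: "cc-leaf", value: "under", digest: "d0"),
    (_type: "cc-leaf", value: "500", digest: "d1"),
    (_type: "cc-leaf", value: "words", digest: "d2"),
  ),
)

// Replace the leaves at the given positions with digest-only markers.
#let elide-indices(tree, indices: ()) = {
  let nodes = tree.nodes.enumerate().map(pair => {
    let (i, node) = pair
    if i in indices { (_type: "cc-elided", digest: node.digest) } else { node }
  })
  tree + (nodes: nodes)
}

// Render visible leaves as text and elided ones as a block glyph.
#let render-granular(tree) = {
  let sep = if tree.grain == "word" { " " } else if tree.grain == "line" { "\n" } else { "" }
  tree.nodes.map(n => if n._type == "cc-elided" { "███" } else { n.value }).join(sep)
}

#let masked = elide-indices(fine, indices: (1,))
#assert.eq(render-granular(masked), "under ███ words")
#assert.eq(masked.digest, fine.digest)
```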

3. Selective Disclosure

Build views — named subsets of an envelope that disclose only specific fields:

#let public-view = cc-view(sealed, disclose: ("id", "version", "role", "schema"))
#let internal-view = cc-view(sealed, disclose: ("id", "version", "role", "ctx", "schema", "steps"))

Both views verify against the same root digest. The holder controls what each consumer sees.
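
A view can be seen as elision driven by an allow-list. The sketch below assumes a hypothetical envelope layout with a field-digests dict and placeholder digests; cc-view-sketch is an illustrative name, not the final API.

```typst
// Hypothetical envelope with placeholder digests.
#let sealed = (
  _type: "cc-envelope",
  digest: "sha256:ROOT",
  field-digests: (id: "sha256:ID", role: "sha256:ROLE", ctx: "sha256:CTX"),
  payload: (id: "deploy-service", role: "You are a deployment orchestrator.", ctx: "cluster: prod"),
)

// Keep disclosed fields as-is; replace every other field with its digest.
#let cc-view-sketch(envelope, disclose: ()) = {
  let payload = (:)
  for (name, value) in envelope.payload {
    let shown = if name in disclose { value } else {
      (_type: "cc-elided", digest: envelope.field-digests.at(name))
    }
    payload.insert(name, shown)
  }
  (_type: "cc-view", digest: envelope.digest, payload: payload)
}

#let public-view = cc-view-sketch(sealed, disclose: ("id", "role"))
#let internal-view = cc-view-sketch(sealed, disclose: ("id", "role", "ctx"))

#assert.eq(public-view.digest, internal-view.digest)
#assert.eq(public-view.payload.ctx._type, "cc-elided")
#assert.eq(internal-view.payload.ctx, "cluster: prod")
```

Because the list names what to disclose rather than what to hide, a field added to the prompt later stays hidden in existing views until someone opts it in.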

4. Signing and Verification

Attach digital signatures to an envelope. Signatures bind to the digest tree, so they remain valid even after elision:

#let signed = cc-sign(sealed, key: my-signing-key)
#let valid = cc-verify(signed, pubkey: my-public-key)
// valid == true even if fields were later elided
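
Why a signature over the root digest survives elision can be shown without any signing code. The sketch below assumes the digestify API described later in this plan (sha256 over bytes, bytes-to-hex) and hashes plain strings rather than dCBOR. The helper names and the "name=digest" root construction are hypothetical.

```typst
#import "@preview/digestify:0.1.0": sha256, bytes-to-hex

// Hypothetical helper: hex SHA-256 of a string.
#let digest(s) = bytes-to-hex(sha256(bytes(s)))

// Digest of one field: recomputed when visible, taken as stored when elided.
#let field-digest(value) = {
  let elided = type(value) == dictionary and value.at("_type", default: none) == "cc-elided"
  if elided { value.digest } else { digest(value) }
}

// Root digest over "name=digest" pairs in sorted key order.
#let root-digest(payload) = digest(
  payload.keys().sorted().map(k => k + "=" + field-digest(payload.at(k))).join(";"),
)

#let payload = (role: "You are a deployment orchestrator.", ctx: "cluster: prod-eu")
#let root = root-digest(payload)

// Elide ctx by swapping its value for its digest.
#let redacted = payload + (ctx: (_type: "cc-elided", digest: digest(payload.ctx)))

// A verifier who only sees `redacted` still reproduces the signed root.
#assert.eq(root-digest(redacted), root)

// Changing a visible field changes the root, so the signature would fail.
#assert.ne(root-digest(redacted + (role: "You are root.")), root)
```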

5. Deterministic Rendering

ContextCloak extends Promptyst's renderers. A cloaked prompt renders as valid Markdown with elided fields shown as digest placeholders:

# Prompt: deploy-service
**Version:** 1.0.0

## Role
You are a deployment orchestrator.

## Context: deploy-ctx
`[ELIDED sha256:7d2c…f3a1]`

## Constraints
`[ELIDED sha256:e1f0…bc42]`

## Steps
1. Validate the deployment manifest.
2. Run pre-flight checks.
3. Execute rolling deploy.

## Output Schema: deploy-output
| Field | Type | Description |
|-------|------|-------------|
| status | string | Deployment outcome |
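
A renderer that produces placeholders like the ones above could be sketched as follows. The helper names, the cc-elided marker, and the sample digest are hypothetical; only the placeholder format is taken from the sample output.

```typst
// Shorten "sha256:<hex>" to "sha256:<first 4>…<last 4>".
#let short-digest(d) = {
  let (algo, hex) = d.split(":")
  if hex.len() <= 8 { d } else { algo + ":" + hex.slice(0, 4) + "…" + hex.slice(-4) }
}

// Render one section: the value itself, or a placeholder if it was elided.
#let render-section(title, value) = {
  let elided = type(value) == dictionary and value.at("_type", default: none) == "cc-elided"
  let body = if elided { "`[ELIDED " + short-digest(value.digest) + "]`" } else { value }
  "## " + title + "\n" + body + "\n"
}

// Made-up digest, shortened here only to keep the example readable.
#let sections = (
  Role: "You are a deployment orchestrator.",
  Constraints: (_type: "cc-elided", digest: "sha256:e1f0aaaabbbbccccbc42"),
)
#let markdown = sections.pairs().map(p => render-section(p.at(0), p.at(1))).join("\n")

// Show the generated Markdown verbatim in the compiled document.
#raw(markdown, block: true, lang: "md")
```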

Proposed Primitives

| Constructor | Parameters | Returns |
|-------------|------------|---------|
| cc-wrap(dict) | Any Promptyst dict | envelope dict |
| cc-elide(envelope, fields) | envelope + field names or indices | envelope with elided fields |
| cc-granular-wrap(string, grain) | string + "char" / "word" / "line" | character-level envelope tree |
| cc-view(envelope, disclose) | envelope + field name list | view dict (selective disclosure) |
| cc-sign(envelope, key) | envelope + signing key | signed envelope dict |
| cc-verify(envelope, pubkey) | envelope + public key | boolean |
| cc-digest(value) | any value | hex digest string |
| cc-salt(envelope) | envelope | salted envelope dict |

Renderers

| Function | Input | Output |
|----------|-------|--------|
| render-envelope(e) | envelope dict | Markdown with elision placeholders |
| render-digest-tree(e) | envelope dict | visual digest tree (Markdown) |
| render-view(v) | view dict | Markdown (disclosed fields only) |

Proposed File Structure

ContextCloak/
├── typst.toml                 # package manifest
├── lib.typ                    # public API entrypoint
├── src/
│   ├── envelope.typ           # cc-wrap, cc-salt, cc-digest
│   ├── elision.typ            # cc-elide, cc-granular-wrap
│   ├── disclosure.typ         # cc-view
│   ├── signing.typ            # cc-sign, cc-verify
│   ├── render.typ             # render-envelope, render-digest-tree, render-view
│   └── validate.typ           # internal validation helpers
├── tests/
│   ├── test-wrap.typ
│   ├── test-elision.typ
│   ├── test-disclosure.typ
│   ├── test-signing.typ
│   ├── test-render.typ
│   └── fixtures/
│       └── sample-prompts.toml
├── examples/
│   ├── basic-elision.typ
│   ├── selective-disclosure.typ
│   └── signed-prompt.typ
├── docs/
│   ├── boundary.md            # export boundary contract (like Promptyst's)
│   └── gordian-mapping.md     # how Promptyst primitives map to envelope structures
├── flake.nix
├── LICENSE
└── README.md

Boundary Contract (Draft)

Following Promptyst's layer model:

┌───────────────────────────────────┐
│  Pipeline / Runtime               │  consumes Markdown or CBOR
└────────────────┬──────────────────┘
                 │
┌────────────────▼──────────────────┐
│  ContextCloak (this package)      │  imports promptyst public API
│  lib.typ → src/envelope.typ       │
│          → src/elision.typ        │
│          → src/disclosure.typ     │
│          → src/signing.typ        │
│          → src/render.typ         │
└────────────────┬──────────────────┘
                 │ imports only public symbols
┌────────────────▼──────────────────┐
│  promptyst (core DSL)             │
│  10 immutable symbols + adapters  │
└───────────────────────────────────┘

Rules:

  1. ContextCloak imports via #import "@preview/promptyst:0.2.0": * only.
  2. ContextCloak dict types use _type values prefixed with "cc-" ("cc-envelope", "cc-view", "cc-signed").
  3. ContextCloak never reads or writes Promptyst's _type tags.
  4. Promptyst knows nothing about ContextCloak — the dependency is strictly one-way.

Mapping: Promptyst Primitives → Gordian Envelope Concepts

| Promptyst Concept | Gordian Envelope Equivalent |
|-------------------|-----------------------------|
| prompt dict | Envelope subject |
| context entries | Assertions (key-value) |
| schema fields | Assertions (structural) |
| constraints | Assertions (governance) |
| steps | Ordered assertion sequence |
| checkpoints | Conditional assertions |
| _type tag | Envelope case tag (leaf, node, assertion) |
| render-prompt output | Encoded envelope → Markdown view |
| Field omission | Elision (digest placeholder) |
| TOML ingestion | Envelope encoding (dCBOR) |

Character-Level Scope ("Down-to-One-Character")

The defining feature of ContextCloak is that elision granularity is bounded only by the envelope tree structure, not by the field boundaries of the prompt DSL.

Standard elision operates at the Promptyst field level — you elide ctx, constraints, or steps as whole units.

Granular elision decomposes any string value into a Merkle tree of individual characters (or words, or lines). This enables:

  • Redact a single token in a constraint: "Keep responses under ███ words"
  • Mask specific context values while preserving keys
  • Prove a role description contains a keyword without revealing the full text

Implementation in Typst:

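#let cc-granular-wrap(value, grain: "char") = {
  // Reject unknown grains instead of silently falling back to line splitting.
  assert(grain in ("char", "word", "line"), message: "grain must be char, word, or line")

  // Split the string into addressable units; clusters() yields grapheme clusters.
  let units = if grain == "char" {
    value.clusters()
  } else if grain == "word" {
    value.split(" ")
  } else {
    value.split("\n")
  }

  // One leaf per unit, each with its own digest.
  let nodes = units.map(u => (
    _type: "cc-leaf",
    value: u,
    digest: cc-digest(u),
  ))

  (
    _type: "cc-granular",
    nodes: nodes,
    // Root digest: hash of the leaf digests concatenated in order.
    digest: cc-digest(nodes.map(n => n.digest).join("")),
    grain: grain,
  )
}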
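
One caveat for character grain: an unsalted digest of a single character can be reversed with a lookup table, and equal characters share a digest. A possible mitigation, sketched here under the assumption that salts are supplied by the caller (Typst has no random number source) and that digestify behaves as described later in this plan, is to hash the salt together with the value. The names salted-leaf and salted-leaves are hypothetical.

```typst
#import "@preview/digestify:0.1.0": sha256, bytes-to-hex

// Hypothetical helper: hex SHA-256 of a string.
#let digest(s) = bytes-to-hex(sha256(bytes(s)))

// The digest covers salt + value, so equal characters no longer share a digest.
// When a leaf is elided, its salt must be dropped along with its value.
#let salted-leaf(value, salt) = (
  _type: "cc-leaf",
  value: value,
  salt: salt,
  digest: digest(salt + ":" + value),
)

// Salts come from the caller, for example generated by the surrounding pipeline.
#let salted-leaves(value, salts) = {
  let units = value.clusters()
  assert.eq(units.len(), salts.len(), message: "need one salt per unit")
  units.zip(salts).map(pair => salted-leaf(pair.at(0), pair.at(1)))
}

#let leaves = salted-leaves("aa", ("s1", "s2"))
#assert.ne(leaves.at(0).digest, leaves.at(1).digest)
```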

Dependency Package Analysis

Priority axis: security paramount, speed second.


Q1: Hashing — digestify vs jumble

| | digestify | jumble |
|---|-----------|--------|
| Implementation | WASM plugin (compiled Rust) | Pure Typst |
| Algorithms | MD4, MD5, SHA-1, SHA-224, SHA-256, SHA-384, SHA-512 | MD4, MD5, SHA-1, HMAC, NTLM, TOTP, Base32 |
| SHA-256 support | ✅ Yes | ❌ No |
| Speed | Fast (compiled WASM) | Slow (interpreted Typst loops) |
| API | sha256(bytes(text)) → bytes, bytes-to-hex() | md5(text) → hex string |
| Security | Rust crypto crates, compiled to WASM sandbox | Hashing implemented in Typst, limited to MD5/SHA-1 |
| Auditability | WASM binary: need to trust the build or audit the Rust source | Fully readable pure Typst source |

Verdict: digestify

Gordian Envelope mandates SHA-256. jumble doesn't support it. Even if jumble added SHA-256, a pure Typst implementation is inherently slower and harder to verify against reference test vectors. digestify's WASM approach runs the crypto in a sandbox with battle-tested Rust implementations. It's both more secure (correct algorithm, proven library) and faster.

jumble's HMAC and TOTP are interesting but not needed for envelope digests. If ContextCloak later needs HMAC (e.g., for key-derived salting), digestify's SHA-256 can be used to build HMAC from primitives.
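
Building HMAC from digestify's SHA-256 could look like the sketch below. It follows the standard HMAC construction (RFC 2104) and assumes, as stated in the comparison above, that sha256 takes and returns bytes. The function name hmac-sha256 is hypothetical and this is not audited code.

```typst
#import "@preview/digestify:0.1.0": sha256, bytes-to-hex

// HMAC-SHA256 per RFC 2104, built only from sha256 and byte operations.
// `key` and `msg` are bytes; the result is the 32-byte tag as bytes.
#let hmac-sha256(key, msg) = {
  let block-size = 64
  // Keys longer than one block are hashed first.
  let k = if key.len() > block-size { sha256(key) } else { key }
  // Right-pad the key with zero bytes to a full block.
  let padded = array(k) + (0,) * (block-size - k.len())
  let ipad = bytes(padded.map(b => b.bit-xor(0x36)))
  let opad = bytes(padded.map(b => b.bit-xor(0x5c)))
  sha256(opad + sha256(ipad + msg))
}

// Example call with a key and message taken from the RFC 4231 test vectors.
#let tag = bytes-to-hex(hmac-sha256(bytes("Jefe"), bytes("what do ya want for nothing?")))
```

Before relying on a helper like this for key-derived salting, it should be checked against the published HMAC-SHA-256 test vectors.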

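// ContextCloak internal helper
#import "@preview/digestify:0.1.0": sha256, bytes-to-hex

// Hash the repr() of a value and return the hex digest.
// Caveat: Typst documents repr() as a debugging aid whose output may change
// between compiler versions, so these digests are only stable per Typst version.
#let cc-digest(value) = {
  bytes-to-hex(sha256(bytes(repr(value))))
}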
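
Hashing repr(value) ties digests to Typst's debug formatting and to dictionary insertion order. A small canonical encoding avoids both, in the spirit of dCBOR's "one meaning, one encoding" rule. The sketch below is a hypothetical text format covering only a few types; canon is an illustrative name and the format is not dCBOR.

```typst
// Hypothetical canonical text encoding: every value gets a type tag and,
// where needed, a length prefix; dictionary keys are emitted in sorted order.
#let canon(value) = {
  if value == none {
    "n"
  } else if type(value) == bool {
    if value { "t" } else { "f" }
  } else if type(value) == int {
    "i" + str(value) + ";"
  } else if type(value) == str {
    "s" + str(value.len()) + ":" + value
  } else if type(value) == array {
    "a" + str(value.len()) + "[" + value.map(canon).fold("", (acc, s) => acc + s) + "]"
  } else if type(value) == dictionary {
    let body = value.keys().sorted().map(k => canon(k) + canon(value.at(k)))
    "d" + str(value.len()) + "{" + body.fold("", (acc, s) => acc + s) + "}"
  } else {
    panic("unsupported type: " + repr(type(value)))
  }
}

// Key order no longer matters, and length prefixes keep item boundaries distinct.
#assert.eq(canon((b: 1, a: "x")), canon((a: "x", b: 1)))
#assert.ne(canon(("ab", "c")), canon(("a", "bc")))
```

With such a helper, cc-digest could hash canon(value) instead of repr(value). Unsupported types fail loudly, which suits the security-first axis better than hashing an unstable representation.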

Q2: CBOR / Serialization — sertyp vs yats vs typwire

These three packages serve different purposes in the stack. Here's the breakdown:

What each does

| | sertyp | yats | typwire |
|---|--------|------|---------|
| Purpose | Serialize/deserialize any Typst type, including content, labels, regex, styles | Serialize basic Typst types (dict, array, string, int, datetime) | CBOR encoder for passing data to WASM plugins |
| Format | Intermediate repr → CBOR (via serialize-cbor) | Custom binary format | CBOR |
| Direction | Typst ↔ Typst, Typst → WASM | Typst ↔ Typst | Typst → WASM (Rust crate companion) |
| Type coverage | Extensive (content, function, module, gradient, stroke, etc.) | Basic (none, bool, int, float, string, bytes, array, dict, datetime, regex) | Good (int, float, str, bool, bytes, array, dict, color, angle, datetime, etc.) |
| CBOR output | serialize-cbor(value) → bytes | ❌ Custom format | cbor.encode(value) → bytes |
| Rust backend | sertyp Rust crate for WASM plugin development | ❌ Pure Typst | typwire Rust crate |
| Implementation | Pure Typst (serialize) + optional WASM via Rust | Pure Typst | Pure Typst (encode) + Rust WASM (decode) |

The critical tradeoff: sertyp uses eval() for deserialization

⚠️ Security note from sertyp docs: "Deserialization uses eval() internally. Deserializing untrusted values may therefore lead to arbitrary code execution. Only deserialize trusted data."

This is a showstopper for the security-first axis. If ContextCloak ever deserializes an envelope received from an external source (e.g., verifying a signed prompt from another party), eval() deserialization opens code injection. This defeats the purpose of cryptographic verification.

How they fit together

The three packages aren't competitors — they solve different layers of the problem:

┌─────────────────────────────────────────────────────────────┐
│  typwire                                                     │
│  "I encode Typst values into CBOR bytes for WASM plugins"   │
│  Used to SEND data TO a WASM plugin                         │
└──────────────────────┬──────────────────────────────────────┘
                       │ CBOR bytes
┌──────────────────────▼──────────────────────────────────────┐
│  WASM plugin (e.g., a Rust envelope processor)              │
│  Receives CBOR, does crypto (sign, verify, elide),          │
│  returns CBOR                                               │
└──────────────────────┬──────────────────────────────────────┘
                       │ CBOR bytes
┌──────────────────────▼──────────────────────────────────────┐
│  sertyp (or typwire)                                        │
│  "I decode CBOR bytes back into Typst values"               │
│  ⚠️ sertyp uses eval() — security risk                    │
└─────────────────────────────────────────────────────────────┘

And yats exists independently as a pure-Typst serialization format for Typst-to-Typst round-tripping (no WASM involvement).

Recommendation for ContextCloak

Option A: typwire for encoding + custom WASM plugin (recommended end state)

Use typwire to encode Promptyst dicts into CBOR, send them to a custom Rust WASM plugin that handles:

  • Envelope wrapping (digest tree construction)
  • Elision (replace nodes with digests)
  • Signing (Ed25519 or similar)
  • Verification

The WASM plugin returns CBOR, which ContextCloak decodes back. This keeps all crypto in Rust (secure, fast, auditable) and keeps Typst as the orchestrator.

#import "@preview/typwire:0.1.0"
#import "@preview/digestify:0.1.0": sha256, bytes-to-hex

#let envelope-plugin = plugin("cc-envelope.wasm")

#let cc-wrap(prompt-dict) = {
  let encoded = typwire.cbor.encode(prompt-dict)
  let result = envelope-plugin.wrap(encoded)
  // decode result back...
}

Note

typwire's decode path is still listed as a missing feature. We may need sertyp's deserialize-cbor for the return path initially, accepting the eval() risk for self-generated data (not untrusted input). Or we build minimal CBOR decoding into ContextCloak directly.

Option B — digestify only, structural envelopes in pure Typst

Skip CBOR entirely. Use digestify for SHA-256 and implement the envelope tree as pure Typst dictionaries. No WASM plugin beyond digestify. Rendering produces Markdown with digest placeholders.

This is simpler, has fewer dependencies, and avoids the eval() issue entirely — but it means ContextCloak's envelopes are structurally equivalent to Gordian Envelopes, not byte-level compatible. You couldn't exchange them with other Gordian Envelope implementations directly.

Option C — Hybrid (phased)

Start with Option B (digestify + pure Typst envelope trees). Add the WASM plugin (Option A) in a later phase when byte-level dCBOR compatibility or real signing is needed. The cc- API stays the same either way — the implementation behind cc-wrap and cc-sign changes, but the consumer interface doesn't.

| | Option A | Option B | Option C |
|---|----------|----------|----------|
| Security | ✅ Best (Rust crypto) | ✅ Good (digestify WASM) | ✅ Good → Best |
| Speed | ✅ Fast | ✅ Fast | ✅ Fast |
| Complexity | High (custom WASM plugin) | Low (pure Typst + 1 dep) | Low → High over time |
| Gordian compat | ✅ Byte-level | ⚠️ Structural only | ⚠️ → ✅ |
| Time to v0.1 | Weeks (Rust plugin dev) | Days | Days → Weeks |

My recommendation: Option C (start B, graduate to A). Get a working envelope system fast with digestify, prove the API design, then add the Rust WASM backend when the pipeline needs real dCBOR or external interop.


Q3: Key Management

Given Option C phasing:

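Phase B (pure Typst): No real signing. cc-sign produces a structural signature placeholder: a digest of the entire envelope dict. This detects accidental changes, but it is NOT a cryptographic signature: no key is involved, so anyone who alters the envelope can recompute the digest. Because it hashes the whole dict rather than the root digest, the value also changes when fields are elided. The code is explicit about being a placeholder: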

#let cc-sign(envelope, key: none) = {
  // Phase B: structural integrity check (not cryptographic)
  let integrity = cc-digest(envelope)
  envelope + (_signed: true, _integrity: integrity)
}
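
A matching Phase B check could be sketched as follows. It assumes the cc-digest and cc-sign definitions shown in this plan and the digestify API described earlier; cc-verify-sketch is a hypothetical name. Like the placeholder itself, it offers no protection against a party who recomputes the digest.

```typst
#import "@preview/digestify:0.1.0": sha256, bytes-to-hex

#let cc-digest(value) = bytes-to-hex(sha256(bytes(repr(value))))

// Phase B placeholder from this plan: integrity digest over the whole dict.
#let cc-sign(envelope, key: none) = {
  envelope + (_signed: true, _integrity: cc-digest(envelope))
}

// Strip the two marker fields, re-hash what is left, and compare.
#let cc-verify-sketch(signed) = {
  if not signed.at("_signed", default: false) { return false }
  let bare = (:)
  for (k, v) in signed {
    // Rebuilding in iteration order keeps the original key order,
    // which matters because repr() reflects insertion order.
    if k not in ("_signed", "_integrity") { bare.insert(k, v) }
  }
  cc-digest(bare) == signed._integrity
}

#let env = (_type: "cc-envelope", digest: "sha256:ROOT", payload: (role: "x"))
#let signed = cc-sign(env)
#assert(cc-verify-sketch(signed))
#assert(not cc-verify-sketch(signed + (payload: (role: "y"))))
```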

Phase A (WASM plugin): Real Ed25519 signing in Rust. Keys provided at compile time via the Nix pipeline:

agenix secret → /run/agenix/cc-signing-key → TYPST_ROOT env → read() in Typst → pass to WASM plugin

This follows the existing determinate-OCD pattern: secrets by path, never inline.


Q4: Promptyst Version ✅ Resolved

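Track 0.2.0 (latest). Typst has no dependency table in typst.toml and does not resolve version ranges; a package's dependencies are pinned by the exact version in each import. ContextCloak's lib.typ will therefore declare:

#import "@preview/promptyst:0.2.0": *

Moving to a newer Promptyst release means bumping this pin.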

Q5: Naming ✅ Resolved

cc- prefix for all ContextCloak symbols. Matches Promptyst's p- convention.


Integration with determinate-OCD

ContextCloak fits into the existing pipeline:

den .description (TOML)
  → nix eval
  → nuenv/Nushell
  → Promptyst/Typst  (compile prompt)
  → ContextCloak     (wrap + elide for audience)
  → agent-native output (Markdown with elision markers)

Different agents receive different views of the same prompt:

  • Internal agents (OpenClaw): full prompt, no elision
  • External agents (CI bots, third-party tools): elided prompt with only role + schema + steps visible
  • Audit trail: signed envelope with full digest tree for verification

Phased Delivery

| Phase | Scope | Depends On |
|-------|-------|------------|
| 0. Skeleton | typst.toml, lib.typ, flake.nix, empty src/*.typ | |
| 1. Digest + Wrap | cc-digest, cc-wrap, cc-salt | Phase 0, SHA-256 decision |
| 2. Field-level Elision | cc-elide, render-envelope | Phase 1 |
| 3. Granular Elision | cc-granular-wrap, character-level trees | Phase 2 |
| 4. Selective Disclosure | cc-view, render-view | Phase 2 |
| 5. Signing | cc-sign, cc-verify | Phase 1, key management decision |
| 6. Pipeline Integration | Nushell bridge, agenix key injection, determinate-OCD wiring | Phases 2-5 |

Non-Goals

  • Runtime envelope processing: ContextCloak is compile-time. It produces static artifacts.
  • Full CBOR wire format: V1 targets Markdown-native envelopes. Byte-level dCBOR is a future format target.
  • Key generation or rotation: Key lifecycle is managed by the infrastructure (agenix, Nix), not by the Typst package.
  • Network transport: Envelopes are files. Transmission is the pipeline's concern.
  • Modifying Promptyst: ContextCloak is a pure consumer of Promptyst's public API. Zero changes to upstream.