diff --git a/.cspell-repo-terms.txt b/.cspell-repo-terms.txt
index eb7189759..dbbeb868a 100644
--- a/.cspell-repo-terms.txt
+++ b/.cspell-repo-terms.txt
@@ -152,6 +152,14 @@ Nanvix
reqwest
requireapproval
uncallable
+Falco
+SIEM
+siem
+fanout
+dataschema
+backpressure
+Backpressure
+dataclasses
endswith
rfind
lstrip
diff --git a/docs/proposals/GOVERNANCE-EVENT-SINK-SPI.md b/docs/proposals/GOVERNANCE-EVENT-SINK-SPI.md
new file mode 100644
index 000000000..c4072e685
--- /dev/null
+++ b/docs/proposals/GOVERNANCE-EVENT-SINK-SPI.md
@@ -0,0 +1,321 @@
+
+
+# Governance Event Sink SPI
+
+**Status:** Draft proposal
+**Related issues:** #1999 (this design), #1793 (closed — OS-level enforcement, rejected)
+**Related PR:** #1987 (Copilot-generated reference implementation, under review)
+
+## Summary
+
+Generalize the existing `SpanSink` Protocol pattern from `agent-hypervisor` into
+a first-class **`GovernanceEventSink`** SPI in `agent-os`. AGT becomes a
+structured, signed event producer; enforcement and observability backends
+(Defender, Sentinel, Splunk, Datadog, Falco, Tetragon) plug in as sinks.
+A policy can require a sink class and fail closed if no healthy sink is
+attached, making sink presence an enforceable governance control.
+
+## Goals
+
+- One canonical interface for emitting governance events.
+- Canonical signed event schema (OpenTelemetry semantic conventions inside a
+ CloudEvents 1.0 envelope, with HMAC or Ed25519 signature and monotonic
+ sequence number).
+- Two reference sinks shipped in-tree: `OtlpEventSink` (covers every major
+ SIEM/XDR via OTLP) and `StdoutEventSink` (dev/CI).
+- Policy can require a sink class (`required_sinks` with `class: siem`) and
+  fail closed.
+- Vendor-native sinks live as separate optional packages — `agent-os` core
+ takes no vendor SDK dependency.
+
+## Non-goals
+
+- Kernel-level enforcement (eBPF, WFP, kernel drivers) — see #1793.
+- Replacing existing SIEM/EDR tooling.
+- Inventing a new wire format — we adopt OTel + CloudEvents as-is.
+
+## High-level design
+
+```mermaid
+flowchart LR
+    subgraph AGT["AGT runtime"]
+        K[agent-os kernel<br/>policy / identity / audit]
+        H[agent-hypervisor<br/>sandbox / saga]
+        E[Event emitter<br/>SignedGovernanceEvent]
+        K --> E
+        H --> E
+    end
+
+    E -->|emit| I{{"GovernanceEventSink<br/>(SPI)"}}
+
+    I --> O[OtlpEventSink<br/>in-tree]
+    I --> S[StdoutEventSink<br/>in-tree]
+    I --> V[Vendor sinks<br/>separate packages]
+
+    O --> C[OpenTelemetry Collector<br/>or direct OTLP endpoint]
+    C --> D[Datadog / Honeycomb /<br/>Dynatrace / Splunk<br/>native OTLP]
+    C --> M[Sentinel / Defender /<br/>Azure Monitor / CloudWatch<br/>via Collector exporter]
+    V --> X[Direct vendor connectors<br/>e.g. Falco, Tetragon, custom]
+
+    P[Policy<br/>required_sinks: siem] -. enforces .-> I
+```
+
+Event flow:
+
+1. Kernel and hypervisor emit governance events through a single emitter.
+2. The emitter wraps each event in a CloudEvents envelope, signs it, and
+ attaches a monotonic sequence number.
+3. The configured sink(s) receive the signed event and forward it to the
+ downstream backend.
+4. Policy evaluates sink presence and health at startup and at runtime. If a
+   `required_sinks` constraint is unmet at startup, the agent fails closed; at
+   runtime the response is policy-controlled.
+
+## Event categories
+
+| Category | Emitted on |
+|-----------------------|-----------------------------------------------|
+| `policy.decision` | Every allow/deny decision |
+| `policy.breach` | Runtime policy violation |
+| `identity.assertion` | Agent identity issuance, token exchange |
+| `tool.invocation` | Tool or MCP call attempted, with result |
+| `sandbox.event` | Sandbox lifecycle, resource limit, escape |
+| `audit.chain` | Append to the hash-chained audit log |
+
+## Envelope
+
+CloudEvents 1.0 envelope; payload follows OTel semantic conventions. AGT
+extension attributes:
+
+| Field | Purpose |
+|------------------|-----------------------------------------------------------|
+| `sequence` | Monotonic per `(agent_id, sink)`. Gap = tamper or loss. |
+| `signature` | HMAC-SHA256 (v1) or Ed25519 (v2) over canonical payload. |
+| `prev_hash` | Hash of the previous event — chains the audit stream. |
+| `agent_id` | DID of the emitting agent. |
+| `tenant_id` | Tenant scope. |
+| `policy_version` | Version of the policy bundle in force. |
+
+Why HMAC for v1: it needs no new dependencies and is sufficient for
+tamper-evidence when the signing key is held by AGT and the sink is operated by
+the customer's SOC. Because HMAC is symmetric, a verifier must hold the same
+secret key, so Ed25519 follows as v2 for cross-party verification.
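
A minimal sketch of the v1 signing step, using only the Python standard library. The proposal does not define the canonical form of the payload, so the sorted-key JSON serialization here is an assumption, as are the helper names (`canonical_bytes`, `sign_event`, `event_hash`, `verify_event`) and the simplified field layout, which is not a full CloudEvents envelope.

```python
import hashlib
import hmac
import json


def canonical_bytes(fields: dict) -> bytes:
    """Deterministic serialization (assumed): sorted keys, compact separators."""
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_event(key: bytes, payload: dict, *, agent_id: str,
               sequence: int, prev_hash: str) -> dict:
    """Build an envelope with sequence and prev_hash, then sign it with HMAC-SHA256."""
    envelope = {
        "agent_id": agent_id,
        "sequence": sequence,
        "prev_hash": prev_hash,
        "data": payload,
    }
    # The signature covers every field except the signature itself.
    envelope["signature"] = hmac.new(
        key, canonical_bytes(envelope), hashlib.sha256
    ).hexdigest()
    return envelope


def event_hash(envelope: dict) -> str:
    """SHA-256 of the full signed envelope; used as the next event's prev_hash."""
    return hashlib.sha256(canonical_bytes(envelope)).hexdigest()


def verify_event(key: bytes, envelope: dict) -> bool:
    """Recompute the HMAC over the unsigned fields and compare in constant time."""
    unsigned = {k: v for k, v in envelope.items() if k != "signature"}
    expected = hmac.new(key, canonical_bytes(unsigned), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope.get("signature", ""))
```

Whoever verifies must hold the same `key`, which is the limitation that motivates an asymmetric scheme for cross-party verification.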
+
+## Policy integration
+
+```yaml
+governance:
+  required_sinks:
+    - class: siem          # any sink advertising the siem capability
+      health: required     # fail closed if unhealthy
+    - class: audit
+      health: required
+```
+
+If no sink of the required class is attached and healthy at startup, the
+kernel refuses to start. If a required sink becomes unhealthy at runtime,
+behavior is policy-controlled (degrade, fail closed, alert only).
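
The startup check described above could look roughly like the following. This is a sketch under assumptions: `StubSink`, `unmet_requirements` and `start_kernel` are hypothetical names, health is reduced to a boolean, and the `required_sinks` entries are taken as already-parsed dictionaries.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class StubSink:
    """Stand-in for a real sink; carries only what the check needs."""
    name: str
    classes: frozenset[str]
    healthy: bool = True

    async def health(self) -> bool:
        return self.healthy


async def unmet_requirements(required: list[dict], sinks: list[StubSink]) -> list[str]:
    """Return the required classes that no attached (and healthy) sink satisfies."""
    unmet = []
    for req in required:
        candidates = [s for s in sinks if req["class"] in s.classes]
        if req.get("health") == "required":
            # Probe all candidate sinks concurrently and keep the healthy ones.
            results = await asyncio.gather(*(s.health() for s in candidates))
            candidates = [s for s, ok in zip(candidates, results) if ok]
        if not candidates:
            unmet.append(req["class"])
    return unmet


async def start_kernel(required: list[dict], sinks: list[StubSink]) -> None:
    """Fail closed: refuse to start when any required class is unmet."""
    unmet = await unmet_requirements(required, sinks)
    if unmet:
        raise RuntimeError(f"refusing to start: no healthy sink for {unmet}")
```

The same `unmet_requirements` call could be repeated at runtime, with the policy deciding whether a non-empty result degrades, fails closed, or only alerts.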
+
+## Bypass-resistance
+
+The sink is in-process, so a fully compromised runtime can in principle skip
+emission. Two mitigations make tampering observable:
+
+1. The downstream SIEM expects a steady heartbeat of events. Silence is
+ itself a high-severity signal (standard EDR pattern).
+2. The signed, sequence-numbered, hash-chained envelope means any gap, replay
+ or alteration breaks verification at the sink.
+
+Stronger out-of-process enforcement (Falco, Tetragon, Defender, EDR) is
+delegated to the customer's existing backend, which is exactly the layer it
+belongs in.
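
To illustrate the second mitigation, here is a hedged sketch of a downstream check that flags gaps, replays and chain breaks in a received stream. The function names and the dictionary event shape are hypothetical, the hash is computed over a sorted-key JSON dump (an assumption), and signature verification is deliberately left out.

```python
import hashlib
import json


def event_hash(event: dict) -> str:
    """SHA-256 over a sorted-key JSON dump of the event (assumed canonical form)."""
    return hashlib.sha256(json.dumps(event, sort_keys=True).encode("utf-8")).hexdigest()


def check_stream(events: list[dict]) -> list[str]:
    """Walk events in arrival order and report gaps, replays and chain breaks."""
    findings = []
    last_seq = None
    last_hash = None
    for ev in events:
        seq = ev["sequence"]
        if last_seq is not None:
            if seq <= last_seq:
                # Already-seen or out-of-order sequence number.
                findings.append(f"replay or reorder at sequence {seq}")
                continue
            if seq != last_seq + 1:
                findings.append(f"gap between {last_seq} and {seq}")
            elif ev["prev_hash"] != last_hash:
                # Consecutive number, but the predecessor was altered or swapped.
                findings.append(f"chain break at sequence {seq}")
        last_seq, last_hash = seq, event_hash(ev)
    return findings
```

An empty result means the stream is consecutive and correctly linked; it says nothing about silence after the last event, which is what the heartbeat expectation covers.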
+
+## End-to-end flow and crash recovery
+
+```mermaid
+flowchart LR
+    subgraph A["AGT (we ship this)"]
+        K[kernel + hypervisor] --> D[in-process<br/>dispatcher]
+        D --> SO[StdoutEventSink]
+        D --> OE["OtlpEventSink<br/>queue + local spool"]
+    end
+
+    SO --> J[stdout / journald]
+    OE -->|OTLP gRPC/HTTP| C
+
+    subgraph CH["OpenTelemetry Collector"]
+        C[otlp receiver] --> B[batch +<br/>persistent_queue]
+        B --> EX[vendor exporters<br/>configured by customer]
+    end
+
+    EX --> V["Customer's chosen backend(s)<br/>Datadog / Splunk / Sentinel /<br/>Defender / Honeycomb / Dynatrace / ..."]
+
+    classDef agt fill:#e8f0ff,stroke:#3366cc;
+    classDef customer fill:#fff5e6,stroke:#cc7a00;
+    class A agt;
+    class CH,V customer;
+```
+
+**Scope boundaries.** Only the left-hand box is AGT's responsibility — the
+kernel, the dispatcher, and the sinks. Once `OtlpEventSink` pushes OTLP over
+the wire, everything to the right is **customer-operated standard
+OpenTelemetry infrastructure**. AGT does not ship, configure, or operate the
+Collector or any vendor exporter; the customer (or their platform team)
+deploys the Collector as a sidecar, DaemonSet, or remote gateway and points
+its exporters at whichever SIEM/XDR/observability backend they already run.
+
+This gives three clean ownership boundaries: AGT owns event production and
+the SPI; the customer owns Collector deployment and vendor routing; the
+vendor owns the backend.
+
+Two invariants make the pipeline safe across crashes:
+
+1. **At-least-once + idempotent on `(agent_id, sequence)`** — duplicates from
+ retries are harmless.
+2. **Signed, sequence-numbered, hash-chained envelope** — any gap, replay or
+ alteration is detectable at the sink.
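
The first invariant can be sketched as a thin wrapper that makes delivery idempotent on `(agent_id, sequence)`. `DedupingSink` and its `deliver` callable are hypothetical, and the in-memory set is an illustration only; a real sink would need a bounded or persistent record of seen keys.

```python
import asyncio


class DedupingSink:
    """Skips events whose (agent_id, sequence) key was already delivered."""

    def __init__(self, deliver):
        self._deliver = deliver                      # async callable taking one event
        self._seen: set[tuple[str, int]] = set()     # unbounded here; illustration only

    async def emit(self, event: dict) -> bool:
        """Deliver the event once; return False for a duplicate."""
        key = (event["agent_id"], event["sequence"])
        if key in self._seen:
            return False
        await self._deliver(event)
        # Record the key only after delivery succeeded, so a failed attempt
        # can be retried by the emitter without being mistaken for a duplicate.
        self._seen.add(key)
        return True
```

With this in place, a retry that re-sends an event after a lost acknowledgement is absorbed instead of producing a second record downstream.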
+
+Failure modes:
+
+| Failure | Behavior |
+|-------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|
+| Kernel crash | Sequence number is persisted with kernel state; emission resumes from the last persisted sequence number on restart. SIEM detects gaps via the hash chain. |
+| Sink queue full | `audit` / `siem` block the emitter (fail-closed); `observability` / `debug` drop oldest and emit a `policy.breach` so the drop is seen. |
+| Network blip to Collector | OTLP client retries with exponential backoff; on exhaustion spills to a local spool and replays on reconnect. |
+| Collector crash | The `persistent_queue` extension survives restart; queued events flush on recovery. |
+| Vendor backend down | Per-exporter retry queue in the Collector absorbs the outage; other exporters are unaffected. |
+| Agent host dies entirely | Local spool is gone. SIEM sees a sequence gap and missing heartbeat — high-severity alert (standard EDR silence-as-signal pattern). |
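
The "Sink queue full" row can be illustrated with a bounded `asyncio.Queue` that either blocks or drops the oldest entry depending on sink class. `SinkQueue` is a hypothetical name, and the sketch only counts drops; emitting the count as a `policy.breach` event is not modeled.

```python
import asyncio

# Classes that block the emitter when the queue is full (fail-closed).
BLOCKING_CLASSES = {"audit", "siem"}


class SinkQueue:
    """Bounded per-sink queue. Construct it inside a running event loop
    on Python versions before 3.10."""

    def __init__(self, sink_class: str, maxsize: int = 10_000):
        self.sink_class = sink_class
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
        self.dropped = 0   # would be reported via a policy.breach event

    async def put(self, event: dict) -> None:
        if self.sink_class in BLOCKING_CLASSES:
            # Waits until the consumer frees a slot, so the emitter is blocked.
            await self._queue.put(event)
            return
        if self._queue.full():
            self._queue.get_nowait()   # discard the oldest queued event
            self.dropped += 1
        self._queue.put_nowait(event)

    async def get(self) -> dict:
        return await self._queue.get()
```

The asymmetry is the point of the design: losing an `audit` or `siem` event is treated as worse than stalling the agent, while `observability` and `debug` traffic is allowed to shed load.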
+
+## Where it lives
+
+- **Interface and envelope:** `agent-os` (kernel-level concern).
+- **Reference sinks:** `agent-os` for `StdoutEventSink`; `agent-sre` for
+ `OtlpEventSink` (keeps OTel optional in core). `OtlpEventSink` emits
+ OpenTelemetry Protocol over gRPC or HTTP. Backends with native OTLP ingest
+ (Datadog, Honeycomb, Dynatrace, Splunk Observability) receive events
+ directly; backends without native OTLP (Sentinel, Defender, Azure Monitor,
+ CloudWatch, Elastic) are reached via the OpenTelemetry Collector with the
+ appropriate vendor exporter. Vendor fan-out is the Collector's job, not
+ AGT's. Sensors that produce rather than consume events (Falco, Tetragon)
+ sit alongside the OTLP path as separate vendor sinks.
+- **Vendor sinks:** separate optional packages, e.g. `agt-sink-defender`,
+ `agt-sink-sentinel`.
+- **Hypervisor integration:** `agent-hypervisor` adapts its existing
+ `SpanSink` to bridge into the new event sink so saga and sandbox spans
+ flow through the same pipeline.
+
+## Interface sketches
+
+Python is the canonical shape. Other SDKs mirror it.
+
+### Python (`agent-os`)
+
+```python
+from dataclasses import dataclass
+from enum import StrEnum  # requires Python 3.11+
+from typing import Protocol, runtime_checkable
+
+class SinkClass(StrEnum):
+    SIEM = "siem"
+    OBSERVABILITY = "observability"
+    AUDIT = "audit"
+    DEBUG = "debug"
+
+@dataclass(frozen=True)
+class SinkHealth:
+    healthy: bool
+    detail: str | None = None  # optional human-readable reason
+
+@runtime_checkable
+class GovernanceEventSink(Protocol):
+    name: str
+    classes: frozenset[SinkClass]  # capabilities matched against required_sinks
+
+    # SignedGovernanceEvent is the signed envelope type, defined elsewhere.
+    async def emit(self, event: "SignedGovernanceEvent") -> None: ...
+    async def health(self) -> SinkHealth: ...
+```
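
As a rough illustration of how a sink satisfies the Protocol structurally, the sketch below implements a stdout-style sink and checks it with `isinstance`. Several things are assumptions: the event is a plain `dict` rather than `SignedGovernanceEvent`, the sink advertises the `debug` class, the output is one JSON line per event, and the Protocol is repeated in compact form only so the block runs on its own.

```python
import asyncio
import json
import sys
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Protocol, TextIO, runtime_checkable


class SinkClass(str, Enum):   # (str, Enum) so the sketch also runs before Python 3.11
    DEBUG = "debug"


@dataclass(frozen=True)
class SinkHealth:
    healthy: bool
    detail: Optional[str] = None


@runtime_checkable
class GovernanceEventSink(Protocol):   # compact copy of the Protocol above
    name: str
    classes: frozenset

    async def emit(self, event: dict) -> None: ...
    async def health(self) -> SinkHealth: ...


class StdoutEventSink:
    """Hypothetical dev/CI sink: writes one JSON line per event."""

    def __init__(self, stream: TextIO = sys.stdout) -> None:
        self.name = "stdout"
        self.classes = frozenset({SinkClass.DEBUG})
        self._stream = stream

    async def emit(self, event: dict) -> None:
        self._stream.write(json.dumps(event, sort_keys=True) + "\n")

    async def health(self) -> SinkHealth:
        # Healthy as long as the underlying stream is still open.
        return SinkHealth(healthy=not self._stream.closed)
```

`StdoutEventSink` never inherits from `GovernanceEventSink`; the `runtime_checkable` check only confirms that the expected members exist, not that their signatures match.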
+
+### .NET (`agent-governance-dotnet`)
+
+```csharp
+public interface IGovernanceEventSink
+{
+    string Name { get; }
+    IReadOnlySet<SinkClass> Classes { get; }
+
+    Task EmitAsync(SignedGovernanceEvent evt, CancellationToken ct = default);
+    Task<SinkHealth> HealthAsync(CancellationToken ct = default);
+}
+```
+
+### Rust (`agent-governance-rust`)
+
+```rust
+#[async_trait::async_trait]
+pub trait GovernanceEventSink: Send + Sync {
+    fn name(&self) -> &str;
+    fn classes(&self) -> &HashSet<SinkClass>;
+
+    async fn emit(&self, event: &SignedGovernanceEvent) -> Result<(), SinkError>;
+    async fn health(&self) -> SinkHealth;
+}
+```
+
+### TypeScript (`agent-governance-typescript`)
+
+```ts
+export interface GovernanceEventSink {
+    readonly name: string;
+    readonly classes: ReadonlySet<SinkClass>;
+
+    emit(event: SignedGovernanceEvent): Promise<void>;
+    health(): Promise<SinkHealth>;
+}
+```
+
+### Go (`agent-governance-golang`)
+
+```go
+type GovernanceEventSink interface {
+	Name() string
+	Classes() map[SinkClass]struct{}
+
+	Emit(ctx context.Context, evt SignedGovernanceEvent) error
+	Health(ctx context.Context) SinkHealth
+}
+```
+
+## Decisions
+
+- **Delivery semantics:** at-least-once. Sinks must be idempotent on
+  `(agent_id, sequence)`. The emitter retries with bounded exponential backoff;
+  once retries are exhausted, the event is written to a local spool and
+  replayed on reconnect.
+- **Multi-sink fanout:** parallel. The emitter calls every attached sink
+ concurrently. One sink failing does not block the others. Per-sink failures
+ surface through `health()` and are evaluated by policy.
+- **Signing key management:** bring-your-own. The signing key is supplied via
+ configuration and may be backed by any KMS (Azure Key Vault, AWS KMS, HSM,
+ file). AGT does not generate or rotate keys itself. Key identifier is
+ carried in the envelope so verifiers can resolve the correct key.
+- **Audit log subsystem:** the existing audit log becomes a sink
+ (`AuditChainSink`) that implements the same interface and writes
+ `audit.chain` events to the hash-chained store. The audit log stops being a
+ parallel pipeline and becomes one consumer of the unified event stream.
+- **Schema versioning:** the CloudEvents `dataschema` attribute carries a
+  semver URI (e.g. `https://agt.dev/schemas/governance-event/1.0`). Sinks
+  must accept any minor version of a major version they recognize and ignore
+  unknown extension attributes. Breaking changes bump the major.
+- **Backpressure:** bounded in-memory queue per sink (default 10k events).
+ When full, behavior is policy-controlled per sink class — `audit` and
+ `siem` block the emitter (fail-closed semantics); `observability` and
+ `debug` drop oldest with a counter event. The drop counter is itself
+ emitted as a `policy.breach` so a SIEM sees it.
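
The parallel fan-out decision maps naturally onto `asyncio.gather` with `return_exceptions=True`. The sketch below is an assumption about shape, not the emitter's actual code: `fan_out` is a hypothetical helper, and sinks are modeled as a mapping from name to an async `emit` callable.

```python
import asyncio


async def fan_out(sinks: dict, event: dict) -> dict:
    """Call every sink concurrently; return {sink_name: exception} for failures."""
    names = list(sinks)
    results = await asyncio.gather(
        *(sinks[name](event) for name in names),
        return_exceptions=True,   # a failing sink must not cancel the others
    )
    # gather preserves argument order, so results line up with names.
    return {
        name: result
        for name, result in zip(names, results)
        if isinstance(result, BaseException)
    }
```

The returned mapping is what a caller would feed into per-sink health tracking, leaving the decision about a failed sink to policy rather than to the fan-out itself.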
+
+## Next steps
+
+1. Directional review of this proposal by the AGT team.
+2. Resolve open questions and finalize schema.
+3. Break implementation into tickets, using #1987 as the reference branch
+ (after addressing review feedback): Python Protocol + schema → reference
+ sinks → policy integration → SDK ports → docs and examples.