This document is the authoritative L0 definition of Subject, Header, and Payload, together with the reliability and guardrail constraints.
- Turn: A complete delivery closure (from one Enqueue/user input to the final `task.deliverable` completion).
  - Aligned with OpenAI's `user input -> assistant final response` layer, but allows non-user triggers (for example, internal orchestration/derivation), so it is broader.
- Step: A single reasoning/tool iteration within a Turn (one iteration of ReAct), producing cards and events such as `agent.thought` / `tool.call` / `tool.result` / `task.deliverable`.
  - Aligned with OpenAI's `step/iteration`, but here a Step only describes one reasoning iteration and does not equal a full delivery.

`cg.{ver}.{project_id}.{channel_id}.{category}.{component}.{target}.{suffix}`

- `ver`: protocol version (aligned with `core.config.PROTOCOL_VERSION`)
- `project_id`: tenant/project isolation
- `channel_id`: visibility/auth domain (for example, `public` / `u_{user}` / `t_{thread}`); not a CardBox storage isolation dimension (cards/boxes are still accessed by `project_id`)
- `category`: `cmd` / `evt` / `str`
- `component`: `agent` / `sys` / `user` / `tool`
- `target`: routing target (capability group or specific `agent_id` / tool provider)
- `suffix`: action/type (for example, `wakeup` / `turn` / `task` / `chunk`)

Unified placeholder example: `cmd.agent.{target}.wakeup`.
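
To make the subject layout concrete, the following is a minimal sketch of a builder and parser for the seven fields. The `Subject` class, `parse_subject`, and the sample values (`v1`, `proj_a`, `worker_default`) are hypothetical and are not part of the L0 API. The sketch also assumes that any extra dot-separated tokens belong to `suffix`.

```python
from dataclasses import dataclass

CATEGORIES = {"cmd", "evt", "str"}
COMPONENTS = {"agent", "sys", "user", "tool"}


@dataclass(frozen=True)
class Subject:
    """Hypothetical helper; field names mirror the subject template."""
    ver: str
    project_id: str
    channel_id: str
    category: str
    component: str
    target: str
    suffix: str

    def __post_init__(self) -> None:
        # Only the enumerated categories and components are accepted.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.component not in COMPONENTS:
            raise ValueError(f"unknown component: {self.component!r}")

    def render(self) -> str:
        return ".".join((
            "cg", self.ver, self.project_id, self.channel_id,
            self.category, self.component, self.target, self.suffix,
        ))


def parse_subject(subject: str) -> Subject:
    """Split a concrete subject; tokens past the 7th dot stay in `suffix` (assumption)."""
    parts = subject.split(".", 7)
    if len(parts) != 8 or parts[0] != "cg":
        raise ValueError(f"not a cg subject: {subject!r}")
    return Subject(*parts[1:])


wakeup = Subject("v1", "proj_a", "public", "cmd", "agent", "worker_default", "wakeup")
print(wakeup.render())
```

Validating `category` and `component` at construction time keeps malformed subjects from reaching the publish path, which is one possible way to enforce the enumerations listed above.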

- `cmd` and `evt` are transported via JetStream (reliable delivery).
  - Default streams:
    - `cg_cmd_{PROTOCOL_VERSION}`: `cg.{PROTOCOL_VERSION}.*.*.cmd.>` (retain 24h)
    - `cg_evt_{PROTOCOL_VERSION}`: `cg.{PROTOCOL_VERSION}.*.*.evt.>` (retain 7d)
  - The scope generated by `subject_pattern()` matches the above (`target` / `suffix` support wildcards; `suffix` is commonly used for `wakeup` / `task` / `step`, etc.).
  - The true source for semantic progression and replay is still PG/CardBox; JetStream only carries edge events.
- `str` uses Core NATS (low latency, lossy).
  - Used only for streaming output and observability, not for state advancement.
  - `core.nats_client.publish_core` / `subscribe_core` use Core NATS directly, without JetStream, `queue_group`, or persistence; `traceparent` is not required.
  - Consumers of `str` should not use `str` as a state-change trigger.
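
The stream bindings above rely on NATS subject wildcards, where `*` matches exactly one token and `>` matches one or more trailing tokens. The sketch below re-implements that matching in plain Python to show which stream a subject would land in. `subject_matches` and `stream_for` are hypothetical names, and `v1` is an assumed value for `PROTOCOL_VERSION`.

```python
from typing import Optional


def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if `subject` matches a NATS-style wildcard `pattern`."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # `>` must be the last pattern token and needs at least one token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


def stream_for(subject: str, ver: str) -> Optional[str]:
    """Map a subject to its default JetStream stream name, or None for `str`."""
    for category in ("cmd", "evt"):
        if subject_matches(f"cg.{ver}.*.*.{category}.>", subject):
            return f"cg_{category}_{ver}"
    return None  # `str` subjects travel over Core NATS and belong to no stream


print(stream_for("cg.v1.proj_a.public.cmd.agent.worker_default.wakeup", "v1"))
print(stream_for("cg.v1.proj_a.public.str.agent.worker_default.chunk", "v1"))
```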

Aside from `str.*`, `publish_event` applies the following behaviors automatically for `cmd` / `evt`:

- `traceparent` is mandatory; if missing in input, it is filled.
- Optional passthrough/fill: `tracestate`.
- Auto-injected: `CG-Timestamp` / `CG-Version` / `CG-Msg-Type` / `CG-Sender` (only filled when missing).
- `CG-Recursion-Depth` remains the core L0 damping constraint; the call chain/entry normalization performs centralized validation and fillback. Missing, non-integer, or negative values are treated as `ProtocolViolationError` when strict checking is required.

Optional extension: `CG-User-ID` / `CG-Channel-ID` / `CG-Context-ID` (the current implementation does not have unified semantic consumption).

Constraint: avoid placing keys with the `CG-*` prefix in business payloads; the current implementation only emits warnings in `publish_event` and does not hard-reject.
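
The fill-when-missing behavior can be sketched as follows. This is not the actual `publish_event` code: `fill_headers`, `new_traceparent`, and `reserved_payload_keys` are hypothetical helpers, and the ISO 8601 timestamp format is an assumption. The `traceparent` layout follows the W3C Trace Context format (`version-traceid-parentid-flags`).

```python
import secrets
from datetime import datetime, timezone
from typing import Dict, List, Optional


def new_traceparent() -> str:
    """Generate a W3C-style traceparent with random trace and span ids."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def fill_headers(
    headers: Optional[Dict[str, str]],
    *,
    version: str,
    msg_type: str,
    sender: str,
) -> Dict[str, str]:
    """Return a copy of `headers` with missing standard headers filled in."""
    out = dict(headers or {})
    # setdefault only writes when the key is absent, so caller values win.
    out.setdefault("traceparent", new_traceparent())
    out.setdefault("CG-Timestamp", datetime.now(timezone.utc).isoformat())  # format assumed
    out.setdefault("CG-Version", version)
    out.setdefault("CG-Msg-Type", msg_type)
    out.setdefault("CG-Sender", sender)
    return out


def reserved_payload_keys(payload: dict) -> List[str]:
    """List payload keys that use the reserved CG- prefix (warn, do not reject)."""
    return sorted(k for k in payload if k.startswith("CG-"))


filled = fill_headers({"CG-Sender": "pmo"}, version="v1", msg_type="cmd", sender="worker")
print(filled["CG-Sender"], reserved_payload_keys({"CG-User-ID": "u1", "text": "hi"}))
```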

- Scope: `cmd.*` / `evt.*` / tool callbacks; `str.*` does not apply.
- Invalid handling: when the call chain explicitly requires depth validation, a missing, non-integer, or negative `CG-Recursion-Depth` triggers `ProtocolViolationError`.
- Increment rule:
  - User entry tasks start at `0` (for example, the first hop triggered by UI/external entry).
  - `enqueue` currently supports only `mode=call`; `transfer` / `notify` return `ProtocolViolationError`.
  - `enqueue` itself does not perform `+1`; callers that need drill-down semantics should precompute and pass depth in the chain via `next_recursion_depth`.
  - `enqueue` attempts to backfill/validate depth by `correlation_id`; a mismatch with the previous record for that `correlation_id` results in an error.
  - `tool_result` / `timeout` report and `join_response` require strict matching between depth and the value associated with `correlation_id`.
- Threshold and rejection:
  - Threshold comes from `[worker].max_recursion_depth` (default 20).
  - If `depth >= threshold`, publishers or consumers must reject dispatch and halt the call chain.
- Failure semantics (recommended to normalize):
  - Recursion exceeded: `evt.agent.{agent_id}.task` with `status=failed` + `error_code=recursion_depth_exceeded`.
  - Protocol violation: `evt.agent.{agent_id}.task` with `status=failed` + `error_code=protocol_violation`.
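
The depth rules can be illustrated with a small validator. The sketch below is an assumption-laden illustration, not the L0 implementation: `parse_depth`, `check_dispatch`, `failure_payload`, and the `RecursionDepthExceeded` class are hypothetical names, and the local `ProtocolViolationError` is a stand-in for the project's own exception.

```python
from typing import Mapping

DEFAULT_MAX_RECURSION_DEPTH = 20  # mirrors the documented default


class ProtocolViolationError(Exception):
    """Stand-in for the project's exception of the same name."""


class RecursionDepthExceeded(Exception):
    """Hypothetical exception for depth >= threshold."""


def parse_depth(headers: Mapping[str, str]) -> int:
    """Strictly parse CG-Recursion-Depth: missing, non-integer, or negative is a violation."""
    raw = headers.get("CG-Recursion-Depth")
    if raw is None:
        raise ProtocolViolationError("CG-Recursion-Depth is missing")
    try:
        depth = int(raw)
    except (TypeError, ValueError):
        raise ProtocolViolationError(f"CG-Recursion-Depth is not an integer: {raw!r}")
    if depth < 0:
        raise ProtocolViolationError(f"CG-Recursion-Depth is negative: {depth}")
    return depth


def check_dispatch(headers: Mapping[str, str],
                   max_depth: int = DEFAULT_MAX_RECURSION_DEPTH) -> int:
    """Reject dispatch once depth reaches the threshold (depth >= threshold)."""
    depth = parse_depth(headers)
    if depth >= max_depth:
        raise RecursionDepthExceeded(f"depth {depth} >= threshold {max_depth}")
    return depth


def failure_payload(agent_turn_id: str, exc: Exception) -> dict:
    """Build the normalized failed-task payload for either failure kind."""
    if isinstance(exc, RecursionDepthExceeded):
        code = "recursion_depth_exceeded"
    else:
        code = "protocol_violation"
    return {"agent_turn_id": agent_turn_id, "status": "failed", "error_code": code}
```

Note that the comparison is `>=`, so with the default threshold of 20 the deepest hop that is still dispatched carries depth 19.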

- Rules below apply to `cmd.*` / `evt.*` / tool callbacks; `str.*` does not apply.
- `traceparent` is the core tracing field; `trace_id` is derived from `traceparent.trace-id` and persisted.
- Publish-side ensures `traceparent` (generated if missing); changes in `traceparent` indicate chain handoff.
- Tool service callbacks should pass through the inbound `traceparent`; ExecutionService (Report primitive) is responsible for generating child spans and writing to Inbox.
- Do not pass the trace id only in the payload (the payload should be used only for compatibility/debugging).
- fanout/join scenarios usually inherit the original `trace-id` to keep traceability across branches.
- If calling external HTTP, inject `traceparent` / `tracestate` when possible.
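
Deriving `trace_id` and creating a child span both come down to handling the four dash-separated fields of a W3C `traceparent`. The following sketch shows one way to do that; `trace_id_of` and `child_traceparent` are hypothetical helper names, not functions from ExecutionService.

```python
import re
import secrets

# W3C Trace Context layout: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def trace_id_of(traceparent: str) -> str:
    """Extract the trace-id field, which is the value to persist as trace_id."""
    match = _TRACEPARENT.match(traceparent)
    if match is None:
        raise ValueError(f"malformed traceparent: {traceparent!r}")
    return match.group(2)


def child_traceparent(parent: str) -> str:
    """Keep the trace-id and flags, but mint a new span id for the child."""
    match = _TRACEPARENT.match(parent)
    if match is None:
        raise ValueError(f"malformed traceparent: {parent!r}")
    version, trace_id, _parent_span, flags = match.groups()
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


inbound = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outbound = child_traceparent(inbound)
print(trace_id_of(inbound) == trace_id_of(outbound))  # True: same trace, new span
```

Because every branch of a fanout keeps the same `trace-id` field, a join can correlate its inputs by comparing `trace_id_of` across the branches.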

- Any state-advancing write must satisfy the `turn_epoch + active_agent_turn_id` gate.
- `UPDATE ... WHERE turn_epoch=? AND active_agent_turn_id=?` with zero rows affected is treated as stale/partitioned and must stop side effects.
- Worker de-duplicates tool callbacks by `(agent_turn_id, tool_call_id)`.
- Execution side permits duplicate delivery of the same `tool_result` to tolerate message loss.
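
The gate is a compare-and-set on the row that owns the turn. The sketch below uses the standard-library `sqlite3` module as a stand-in for PG so that it runs anywhere. The `agent_state` table, its `status` column, and the helper names are hypothetical; only the `WHERE turn_epoch=? AND active_agent_turn_id=?` condition and the `(agent_turn_id, tool_call_id)` key come from the rules above.

```python
import sqlite3
from typing import Set, Tuple


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with one active turn (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE agent_state ("
        "agent_id TEXT PRIMARY KEY, turn_epoch INTEGER, "
        "active_agent_turn_id TEXT, status TEXT)"
    )
    conn.execute("INSERT INTO agent_state VALUES ('a1', 3, 'turn-42', 'running')")
    conn.commit()
    return conn


def advance_state(conn: sqlite3.Connection, agent_id: str, turn_epoch: int,
                  agent_turn_id: str, new_status: str) -> bool:
    """Apply the write only if the caller still owns the turn."""
    cur = conn.execute(
        "UPDATE agent_state SET status = ? "
        "WHERE agent_id = ? AND turn_epoch = ? AND active_agent_turn_id = ?",
        (new_status, agent_id, turn_epoch, agent_turn_id),
    )
    conn.commit()
    # Zero rows affected means the caller is stale and must stop side effects.
    return cur.rowcount == 1


class CallbackDeduper:
    """In-memory de-duplication keyed by (agent_turn_id, tool_call_id)."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()

    def first_time(self, agent_turn_id: str, tool_call_id: str) -> bool:
        key = (agent_turn_id, tool_call_id)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


conn = make_demo_db()
print(advance_state(conn, "a1", 3, "turn-42", "suspended"))  # current owner
print(advance_state(conn, "a1", 2, "turn-41", "done"))       # stale epoch
```

A production de-duplicator would need durable storage rather than a process-local set; the set is used here only to keep the example self-contained.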

Global command-path rule:

- In-boundary `cmd.*` publication is centralized in L0 (`enqueue` / `report` / `join` / `command_intent` / `wakeup`); business services should not directly `publish_event` to `cmd.*`.
- `cmd.agent.*.wakeup` must be emitted through L0 wakeup APIs only (not via `command_intent`).
- This release uses a moderate payload reduction: `cmd.tool.*` / `cmd.sys.pmo.internal.*` may keep routing fields such as `tool_call_id` / `tool_call_card_id` / `tool_name` / `after_execution`.

- Direction: PMO/ExecutionService -> Worker
- Required: `agent_id`
- Optional: `inbox_id` / `reason` / `metadata`
- Description: control-plane bell; no semantic data is carried. Semantic requests and receipts are stored in `state.agent_inbox`.

- Direction: Tool/PMO -> ExecutionService (writes to Inbox + wakeup)
- Required: `tool_call_id` / `agent_turn_id` / `turn_epoch` / `agent_id`
- Required: `after_execution` (`suspend|terminate`)
- Required: `status` (`success|failed|canceled|timeout|partial`)
- Required: `tool_result_card_id` (pointer)
- Prohibited: inline `result`
- Description: callback writes to `state.agent_inbox` (Report) and is processed from Inbox by Worker.
- Description: if `tool.result.content.result.__cg_control.after_execution` exists, it overrides a valid `after_execution`.
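
The field rules for the tool-result callback translate directly into a validator. The sketch below is illustrative only: `validate_tool_result` is a hypothetical function, and the real entry point may use a typed model rather than a plain `dict`.

```python
from typing import List

REQUIRED_FIELDS = (
    "tool_call_id", "agent_turn_id", "turn_epoch", "agent_id",
    "after_execution", "status", "tool_result_card_id",
)
AFTER_EXECUTION_VALUES = {"suspend", "terminate"}
STATUS_VALUES = {"success", "failed", "canceled", "timeout", "partial"}


def validate_tool_result(payload: dict) -> List[str]:
    """Return a list of rule violations; an empty list means the payload passes."""
    errors = []
    for key in REQUIRED_FIELDS:
        if key not in payload:
            errors.append(f"missing required field: {key}")
    if "result" in payload:
        # The result body must live in a card; the payload only carries a pointer.
        errors.append("inline result is prohibited; use tool_result_card_id")
    after_execution = payload.get("after_execution")
    if after_execution is not None and after_execution not in AFTER_EXECUTION_VALUES:
        errors.append(f"invalid after_execution: {after_execution!r}")
    status = payload.get("status")
    if status is not None and status not in STATUS_VALUES:
        errors.append(f"invalid status: {status!r}")
    return errors


ok = {
    "tool_call_id": "call-1", "agent_turn_id": "turn-42", "turn_epoch": 3,
    "agent_id": "a1", "after_execution": "suspend", "status": "success",
    "tool_result_card_id": "card-9",
}
print(validate_tool_result(ok))  # []
```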

- Direction: PMO/BatchManager -> ExecutionService (writes to Inbox + wakeup)
- Required: `agent_id` / `agent_turn_id` / `turn_epoch`
- Optional: `reason`

- Direction: Worker -> Tool Service
- Subject comes from `resource.tools.target_subject`
- Required: `tool_call_id` / `agent_turn_id` / `turn_epoch` / `agent_id`
- Common optional: `tool_name` / `after_execution` / `tool_call_card_id` / `context_box_id` / `dispatch_requested_at`
- Constraint: the Tool entry `ToolCommandPayload` forbids direct `args` by default; it usually carries `tool_call_card_id` and lets the card express the exact parameters.
- Dispatch path: Worker/ToolCaller/API should all go through L0 `command_intent` to produce signals, then publish after the transaction commits.

- Direction: Worker -> PMO
- Semantics aligned with `cmd.tool.*` (still resolved through `resource.tools.target_subject` to `internal.*` tools), executed by PMO built-in handlers.

- Required: `agent_turn_id` / `step_id` / `phase`
- Optional: `metadata` (such as `metadata.timing`, see `05_operations/observability.md`)
- Purpose: observability and audit

- Required: `agent_turn_id` / `status`
- Optional: `deliverable_card_id` / `output_box_id` / `tool_result_card_id` / `error` / `error_code` / `stats` (such as `stats.timing`, see `05_operations/observability.md`)

- Required: `agent_turn_id` / `step_id` / `chunk_type` / `content`
- Optional: `index` / `metadata` (such as `metadata.llm_request_started_at` / `metadata.llm_timing`)
- Description: `str` carries no trace headers and does not perform tracing; it relies only on payload fields for correlation/observability.

- Required: `agent_id` / `agent_turn_id` / `status` / `turn_epoch` / `updated_at`
- Optional: `output_box_id` / `metadata`
- Meaning: status-edge signal; PG/CardBox is still the true source.

- Direction: UI -> UI Worker
- Required: `action_id` / `agent_id` / `tool_name` / `args`
- Optional: `metadata` / `message_id` / `parent_step_id`
- Rules:
  - `action_id` is usually the idempotency key; `message_id` may be used as an alternative idempotency key.
  - If `message_id` is provided (UUIDv7), UI Worker may use it as the idempotency/correlation key.
  - `agent_id` must have `worker_target=ui_worker`; tools must have `ui_allowed=true` (default deny).
  - UI Worker publishes `evt.sys.ui.action_ack` as the acknowledgment event.

- Direction: UI Worker -> UI
- Required: `action_id` / `agent_id` / `status`
- Optional: `message_id` / `tool_result_card_id` / `error_code` / `error`
- Meaning: entry acknowledgment / rejection / busy / terminal completion
  - Entry-stage acknowledgments are typically `accepted|busy|rejected`
  - After tool execution, the terminal acknowledgment is typically `done`; failure cases use `rejected`
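
The entry-stage decision for a UI action can be sketched as a default-deny check followed by an acknowledgment payload. Everything below is illustrative: `entry_ack_status` and `build_ack` are hypothetical names, the config dictionaries are an assumed shape, and treating `busy` as "the agent already has work in flight" is an assumption that the text above does not spell out.

```python
def entry_ack_status(agent_cfg: dict, tool_cfg: dict, agent_busy: bool) -> str:
    """Pick the entry-stage status for a UI action (default deny)."""
    if agent_cfg.get("worker_target") != "ui_worker":
        return "rejected"
    # ui_allowed must be explicitly true; a missing flag counts as deny.
    if tool_cfg.get("ui_allowed") is not True:
        return "rejected"
    if agent_busy:  # assumed meaning of `busy`
        return "busy"
    return "accepted"


def build_ack(action: dict, status: str) -> dict:
    """Build an acknowledgment payload with the required fields."""
    ack = {
        "action_id": action["action_id"],
        "agent_id": action["agent_id"],
        "status": status,
    }
    if "message_id" in action:
        ack["message_id"] = action["message_id"]  # optional passthrough
    return ack


action = {"action_id": "act-1", "agent_id": "ui-agent", "tool_name": "search", "args": {}}
status = entry_ack_status({"worker_target": "ui_worker"}, {"ui_allowed": True}, agent_busy=False)
print(build_ack(action, status))
```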

- `evt.*` is multicast by default: all events are stored in JetStream. Different business systems (such as PMO and Observer) must use different `durable_name` values.
  - NATS maintains an independent cursor per unique `durable_name`. This means adding one observer does not affect event delivery for business logic.
- Horizontal scaling support: multiple instances of the same business system should share the same `durable_name` and `queue_group` (under Pull Consumer, Durable guarantees this).
  - Effect: under one `durable_name`, the same event is delivered to only one instance, enabling load balancing.
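
The two delivery rules (one cursor per `durable_name`, one instance per event within a durable) can be simulated without a NATS server. The toy model below is a simplification: `simulate_delivery` is a hypothetical function, and the round-robin assignment is an assumption made for determinism, since a real pull consumer hands messages to whichever instance fetches next.

```python
from collections import defaultdict
from itertools import cycle
from typing import Dict, List, Tuple


def simulate_delivery(
    events: List[str],
    consumers: Dict[str, List[str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Map (durable_name, instance) to the events that instance receives.

    Every durable sees the full event list (multicast across systems), while
    instances sharing a durable split the events between them (competing).
    """
    received: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for durable_name, instances in consumers.items():
        next_instance = cycle(instances)  # round-robin stand-in for pull fetches
        for event in events:
            received[(durable_name, next(next_instance))].append(event)
    return dict(received)


result = simulate_delivery(
    ["e1", "e2", "e3", "e4"],
    {"pmo": ["pmo-1", "pmo-2"], "observer": ["obs-1"]},
)
for key in sorted(result):
    print(key, result[key])
```

In this model the single `observer` instance receives all four events, while the two `pmo` instances receive two each, so adding the observer changes nothing for PMO.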

| Type | Physical implementation | Semantic pattern | Persisted | Isolation level |
|---|---|---|---|---|
| `cmd.*` | JetStream | Work Queue | Yes | Competitive within group (single execution) |
| `evt.*` | JetStream | Event Stream | Yes | System-level multicast / competitive within group |
| `str.*` | Core NATS | Broadcast | No | Full broadcast (fire-and-forget) |

- `cmd.*`: stable durable + queue group is recommended.
- `evt.*`: replay policy is business-defined; aggregate scenarios may use `deliver_policy=all`.

- `[nats]`: `servers`, `tls_enabled`, `cert_dir` map to NATS client connection config.
- `[nats.pull]`: `batch_size`, `max_inflight`, `fetch_timeout_seconds`, `warmup_timeout_seconds` map to `NATSClient` pull behavior (`core/app_config.py` `NATSConfig` allows additional passthrough fields).
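
One way to picture the `[nats.pull]` mapping with passthrough fields is a small typed container. The sketch below is not the real `NATSConfig`: the `PullConfig` class, the field types, the choice to treat all four keys as required, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping

_KNOWN_KEYS = ("batch_size", "max_inflight", "fetch_timeout_seconds", "warmup_timeout_seconds")


@dataclass
class PullConfig:
    """Hypothetical view of the [nats.pull] table."""
    batch_size: int
    max_inflight: int
    fetch_timeout_seconds: float
    warmup_timeout_seconds: float
    extra: Dict[str, Any] = field(default_factory=dict)  # passthrough fields

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "PullConfig":
        missing = sorted(set(_KNOWN_KEYS) - set(raw))
        if missing:
            raise KeyError(f"missing [nats.pull] keys: {missing}")
        known = {key: raw[key] for key in _KNOWN_KEYS}
        # Anything not recognized is kept rather than rejected.
        extra = {key: value for key, value in raw.items() if key not in _KNOWN_KEYS}
        return cls(**known, extra=extra)


cfg = PullConfig.from_mapping({
    "batch_size": 10,              # sample values, not documented defaults
    "max_inflight": 32,
    "fetch_timeout_seconds": 1.0,
    "warmup_timeout_seconds": 5.0,
    "custom_flag": True,           # lands in `extra`
})
print(cfg.batch_size, cfg.extra)
```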

- Auto-generated subject emit points: `../REFERENCE/nats_subjects.generated.md` (index only, not a substitute for this semantic definition)