Problem
When agents call external tools (MCP servers, APIs, third-party services), there's no runtime mechanism to evaluate whether that tool is behaving normally right now. Static allow-lists don't catch a previously-trusted server that starts returning malformed data, timing out intermittently, or exhibiting anomalous latency patterns. The Agents SDK currently trusts all configured tools equally.
Proposal
Add an optional behavioral trust scoring hook that runs before each tool invocation, using runtime telemetry from the Dominion Observatory — a behavioral trust registry tracking 14,800+ MCP servers with anonymized latency, success rate, and anomaly data.
The hook would:
- Before tool call: query the Observatory for the tool server's current trust score (cached with configurable TTL)
- Policy decision: based on a configurable threshold, either proceed, warn, or block the call
- After tool call: report anonymized telemetry (server_url, success, latency_ms — no prompts, no tool arguments, no user data) back to the Observatory
What this looks like
from agents import Agent, Runner
from dominion_observatory import check_trust, report
# As a guardrail / hook
async def trust_gate(ctx, tool_call):
result = check_trust(tool_call.server_url)
if result.trust_score and result.trust_score < 40:
raise ToolBlockedError(
f"Trust score {result.trust_score} below threshold for {tool_call.server_url}"
)
agent = Agent(
name="my-agent",
tools=[...],
before_tool_call=trust_gate, # proposed hook point
)
Why this matters for agents
- Tool-calling agents are only as reliable as their tools. A coding agent that calls a code-execution MCP server with a trust score of 12 should behave differently than one calling a server scoring 95.
- Multi-agent systems amplify the risk. When Agent A delegates to Agent B which calls Tool C, a behavioral anomaly at Tool C cascades. Runtime trust scoring catches this before the cascade.
- Static security ≠ runtime trust. A tool can pass all static checks (signed, verified publisher, correct schema) and still be operationally degraded. Behavioral trust is the missing signal.
Existing infrastructure
- Observatory: live at
dominion-observatory.sgdata.workers.dev, tracking 14,800+ servers, 87,000+ interactions
- Python SDK:
pip install dominion-observatory — check_trust(server_url) returns score + anomaly flags
- LangChain integration:
ObservatoryTrustCallbackHandler — same pattern, already built
- TypeScript SDK:
@dominion/trust-provider on npm — includes beforeSettle hook for the x402 protocol
- Privacy: no prompts, tool arguments, tool outputs, user IDs, or IPs are sent to the Observatory. Only anonymized telemetry (server_url, success, latency_ms, tool_name, http_status).
Integration points
The cleanest integration would be:
- A
before_tool_call / after_tool_call hook on the Agent or Runner (if not already exposed)
- A
TrustGuardrail that plugs into the existing guardrails system
- Optional: a
trust_policy config on Agent that sets threshold + fail behavior (block / warn / log)
Happy to contribute a PR if there's interest. The Python SDK and the hook pattern are already built — it's a matter of wiring them into the Agents SDK's lifecycle.
References
Problem
When agents call external tools (MCP servers, APIs, third-party services), there's no runtime mechanism to evaluate whether that tool is behaving normally right now. Static allow-lists don't catch a previously-trusted server that starts returning malformed data, timing out intermittently, or exhibiting anomalous latency patterns. The Agents SDK currently trusts all configured tools equally.
Proposal
Add an optional behavioral trust scoring hook that runs before each tool invocation, using runtime telemetry from the Dominion Observatory — a behavioral trust registry tracking 14,800+ MCP servers with anonymized latency, success rate, and anomaly data.
The hook would:
What this looks like
Why this matters for agents
Existing infrastructure
dominion-observatory.sgdata.workers.dev, tracking 14,800+ servers, 87,000+ interactionspip install dominion-observatory—check_trust(server_url)returns score + anomaly flagsObservatoryTrustCallbackHandler— same pattern, already built@dominion/trust-provideron npm — includesbeforeSettlehook for the x402 protocolIntegration points
The cleanest integration would be:
before_tool_call/after_tool_callhook on the Agent or Runner (if not already exposed)TrustGuardrailthat plugs into the existing guardrails systemtrust_policyconfig on Agent that sets threshold + fail behavior (block / warn / log)Happy to contribute a PR if there's interest. The Python SDK and the hook pattern are already built — it's a matter of wiring them into the Agents SDK's lifecycle.
References
beforeSettlehook