Summary
NCP's adoption pitch is "reduce repeated orchestration cost in agentic systems". A targeted benchmark that compares an LLM-only agent loop against an LLM+NCP loop on the same workflow makes this pitch concrete and defensible.
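As a rough illustration of the comparison, a harness could run both loops on the same workflow and keep one raw row per run. This is a minimal sketch, not the project's harness: `llm_only_loop`, `llm_plus_ncp_loop`, `RunRecord`, the step names, and the stubbed `call_llm` are all hypothetical placeholders, and the call counts they produce are not measurements.

```python
import csv
import io
import time
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass
class RunRecord:
    variant: str        # "llm_only" or "llm_plus_ncp" (hypothetical labels)
    run_index: int
    llm_calls: int      # number of model invocations in this run
    wall_seconds: float


def llm_only_loop(call_llm: Callable[[str], str]) -> None:
    # Placeholder: the model is asked to handle every orchestration step.
    for step in ("classify", "score", "route", "summarize"):
        call_llm(step)


def llm_plus_ncp_loop(call_llm: Callable[[str], str]) -> None:
    # Placeholder: deterministic steps are assumed to run in a graph without
    # a model call, so only the open-ended step reaches the model.
    call_llm("summarize")


def measure(variant: str, loop: Callable, runs: int) -> List[RunRecord]:
    records = []
    for i in range(runs):
        calls = 0

        def call_llm(prompt: str) -> str:
            # Stub model: counts invocations instead of calling a real API.
            nonlocal calls
            calls += 1
            return "stub"

        start = time.perf_counter()
        loop(call_llm)
        elapsed = time.perf_counter() - start
        records.append(RunRecord(variant, i, calls, elapsed))
    return records


def to_csv(records: List[RunRecord]) -> str:
    # Keep every run as its own row so the raw data can be published.
    buf = io.StringIO()
    fields = ["variant", "run_index", "llm_calls", "wall_seconds"]
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buf.getvalue()


records = measure("llm_only", llm_only_loop, 3)
records += measure("llm_plus_ncp", llm_plus_ncp_loop, 3)
raw_csv = to_csv(records)
```

A real run would replace the stub with scripted model calls and record token counts and cost alongside call counts. The per-run rows, not just the averages, are what would be checked in.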
Why this matters
Done carefully, this benchmark gives adopters a defensible cost-reduction number to take to their teams.
Out of scope
- Closed-source models that cannot be fully scripted.
- Implying NCP is a drop-in replacement for LLMs. The point is to show where deterministic graphs replace orchestration overhead.
Acceptance criteria
- lead-qualification graph (Add a practical lead-qualification MCP example graph #29) is a natural fit.
- [...] ncp-mcp-server) that runs the deterministic part of the workflow.
- [...] bench/ or docs/. Include raw measurement data, not just summary numbers.
Where to read
- BENCHMARK.md for the existing benchmark methodology to align with
- bench/datasets/ for the existing dataset format