plexara-agents
A Go reference implementation for building MCP-driven AI agents.
Status: Draft specification, v0.1
Repository: github.com/plexara/plexara-agents
License: Apache 2.0
Language: Go 1.26+
1. Purpose
plexara-agents is an open source Go project that does two things at once:
- Ships a small, opinionated library (core) for building local-first AI agents that drive Model Context Protocol (MCP) servers.
- Ships ready-to-run binaries that use the library to demonstrate the Plexara MCP data platform (txn2/mcp-data-platform), specifically against the public ACME Corp demo deployment.
The project is designed to serve as a reference: anyone reading the source should come away with a clear, idiomatic answer to "how do I build a Go agent that talks to one or more MCP servers using a local model." The Plexara demo is the headline use case, but nothing in core assumes Plexara is the MCP being driven.
Why this exists
The MCP ecosystem has matured fast on the server side. Client tooling, especially in Go, has not. Most public agent code is Python and most public Python agent code optimizes for cloud APIs. A Go project that runs entirely on local inference, treats MCP as a first-class primitive, and is small enough to read end-to-end fills a real gap and makes the case for Plexara's architectural model along the way.
2. Goals and Non-Goals
Goals
- Serve as a readable, idiomatic Go reference for MCP-driven agent construction.
- Run entirely on local model inference for the v1 release. No cloud API support, no fallback paths.
- Drive any MCP server. Plexara is the showcase, not a coupling.
- Provide a clean, minimal core library that other agents (current and future) can depend on without inheriting CLI or transport concerns.
- Support multiple binaries built on core: a single-shot CLI (ask), an interactive REPL (repl), and a starter for hosted deployments later.
- Ship with example workflows that exercise the Plexara ACME Corp demo end to end, demonstrating how an agent benefits from Plexara's context enrichment.
Non-Goals (v1)
- Cloud API providers (Anthropic, OpenAI, Bedrock, etc.). Local only.
- Embedded inference. The agent talks HTTP to a local model server (Ollama or MLX), not via CGo or in-process llama.cpp.
- Web UI. The hosted web variant is planned, but core and the v1 binaries do not assume a browser.
- Plugin system. Composition happens at the Go module level, not via dynamic loading.
- A bespoke prompt framework. Prompts are plain text or templated Go strings, not a DSL.
3. Audience
Two readers, with different reading paths:
The agent builder. Someone who needs to build an MCP-driven agent for their own product. They read core/ top to bottom, treat the binaries in cmd/ as worked examples, and import core into their own project.
The Plexara evaluator. Someone evaluating whether Plexara's MCP design holds up under real agent traffic. They run cmd/ask against the ACME demo, look at the example workflows, and learn the platform by interaction.
The spec, the layout, and the code should serve both.
4. Architecture Overview
+---------------------+        +-----------------------+
|       cmd/ask       |        |       cmd/repl        |
|  (single-shot CLI)  |        |   (interactive TUI)   |
+----------+----------+        +-----------+-----------+
           |                               |
           +---------------+---------------+
                           |
                  +--------v--------+
                  |      core       |
                  |                 |
                  |  loop  session  |
                  |  provider  mcp  |
                  |  event  router  |
                  +--------+--------+
                           |
            +--------------+--------------+
            |                             |
   +--------v--------+          +---------v---------+
   |   Local model   |          |    MCP servers    |
   |  (Ollama / MLX) |          |  (Plexara, etc.)  |
   +-----------------+          +-------------------+
The agent loop is a function: given a Provider, one or more MCP clients, and a user message, it produces a stream of events until the model signals completion. Everything else is composition around that function.
5. Module Layout
plexara-agents/ # github.com/plexara/plexara-agents
├── core/ # importable library, the heart of the project
│ ├── event/ # event types (TextDelta, ToolCall, Finish, Error)
│ ├── provider/ # Provider interface and adapters
│ │ ├── provider.go # interface + shared types
│ │ ├── openai_compatible.go # works for Ollama, llama.cpp server, MLX, vLLM
│ │ └── testing.go # in-memory fake for tests
│ ├── mcp/ # MCP client wrapper around go-sdk
│ │ ├── client.go # connection lifecycle, multi-server aggregation
│ │ └── catalog.go # tool catalog with toolkit grouping
│ ├── router/ # tool router (toolkit classification, narrowing)
│ ├── session/ # message history, persistence, replay
│ ├── loop/ # the agent loop itself
│ ├── approval/ # tool-call approval gates
│ └── log/ # slog-based structured logging helpers
├── cmd/
│ ├── ask/ # single-shot CLI: one question, one answer
│ └── repl/ # interactive REPL with history and approval prompts
├── examples/
│ ├── acme-revenue/ # end-to-end Plexara ACME demo: revenue by region
│ ├── acme-lineage/ # lineage walk + downstream impact analysis
│ └── multi-mcp/ # one agent, two MCP servers, demonstrating router
├── docs/
│ ├── architecture.md
│ ├── mcp-integration.md
│ ├── prompts/ # canonical prompts kept under version control
│ └── adrs/ # architecture decision records
├── go.mod
├── go.sum
├── Makefile
├── LICENSE
└── README.md
Rules:
- core has no dependencies on cmd/ or examples/.
- cmd/ binaries are intentionally small (target ~150 lines each). If a binary grows logic worth keeping, it moves into core.
- Examples are runnable. If they break, CI breaks.
6. Core Abstractions
6.1 Events
Everything streamed out of the agent loop is an event. Events are a closed sum type, expressed as an interface with a sealing method.
// core/event/event.go
package event

import "encoding/json"

// Event is sealed by the unexported isEvent method, so only types declared
// in this package can implement it.
type Event interface {
    isEvent()
}

type TextDelta struct {
    Text string
}

func (TextDelta) isEvent() {}

type ToolCallRequest struct {
    ID        string
    Name      string
    Server    string          // which MCP server owns this tool
    Arguments json.RawMessage // never partial; always complete JSON
}

func (ToolCallRequest) isEvent() {}

// ToolContent, FinishReason, and Usage are assumed to be declared elsewhere
// in this package.
type ToolCallResult struct {
    ID      string
    Content []ToolContent // text, image, resource refs
    IsError bool
}

func (ToolCallResult) isEvent() {}

type Finish struct {
    Reason FinishReason
    Usage  Usage
}

func (Finish) isEvent() {}

type Error struct {
    Err error
}

func (Error) isEvent() {}
Consumers (CLI, REPL, future web server) switch on event type and render accordingly.
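A consumer of this sum type might look like the sketch below. It redeclares simplified event types locally so it compiles on its own; the render helper and its output format are illustrative assumptions, not part of core.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is sealed: the unexported method keeps the set of variants closed.
type Event interface{ isEvent() }

// Simplified stand-ins for the types in core/event.
type TextDelta struct{ Text string }
type ToolCallRequest struct{ ID, Name, Server string }
type Finish struct{ Reason string }
type Error struct{ Err error }

func (TextDelta) isEvent()       {}
func (ToolCallRequest) isEvent() {}
func (Finish) isEvent()          {}
func (Error) isEvent()           {}

// render is a hypothetical consumer-side helper that turns one event into
// terminal output by switching on its concrete type.
func render(ev Event) string {
	switch e := ev.(type) {
	case TextDelta:
		return e.Text
	case ToolCallRequest:
		return fmt.Sprintf("[tool %s on %s]", e.Name, e.Server)
	case Finish:
		return fmt.Sprintf("[done: %s]", e.Reason)
	case Error:
		return fmt.Sprintf("[error: %v]", e.Err)
	default:
		return "[unknown event]"
	}
}

func main() {
	events := []Event{
		TextDelta{Text: "Looking that up. "},
		ToolCallRequest{ID: "1", Name: "datahub_search", Server: "plexara"},
		Finish{Reason: "stop"},
	}
	var out strings.Builder
	for _, ev := range events {
		out.WriteString(render(ev))
	}
	fmt.Println(out.String())
}
```

Because the interface is sealed, a linter such as exhaustive can flag a switch that forgets a variant.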
6.2 Provider
Defined from the consumer side, narrow on purpose.
// core/provider/provider.go
package provider

import (
    "context"

    "github.com/plexara/plexara-agents/core/event"
)

type Provider interface {
    // Stream returns a channel of events. The channel closes when the request
    // ends (success, error, or context cancellation). Implementations must
    // never emit a partial ToolCallRequest; they buffer until arguments
    // are complete JSON.
    Stream(ctx context.Context, req Request) (<-chan event.Event, error)

    // Name identifies the provider for logging and diagnostics.
    Name() string
}

type Request struct {
    Model    string
    Messages []Message
    Tools    []Tool

    // Sampling parameters; nil means provider default.
    Temperature *float32
    TopP        *float32
    MaxTokens   *int
}
Why a channel and not an iterator: range-over-func iterators are stable and tempting, but channels compose better with select, cancellation, and the existing context-cancellation idiom. An iterator wrapper can be added later if there's demand.
For v1 there is one Provider implementation: OpenAICompatible. Pointing it at http://localhost:11434/v1 gives Ollama. Pointing it at an MLX server, llama.cpp's server, or a future vLLM deployment is a config change. This is deliberate: a single well-tested adapter beats four half-tested ones.
6.3 Session
// core/session/session.go
package session

import (
    "io"
    "time"

    "github.com/plexara/plexara-agents/core/provider"
)

type Session struct {
    ID       string
    Messages []provider.Message
    Created  time.Time
    Updated  time.Time
}

// Signatures only; bodies are omitted in this spec.
func (s *Session) Append(m provider.Message)
func (s *Session) Truncate(maxTokens int, tokenizer Tokenizer) // sliding window
func (s *Session) Save(w io.Writer) error
func Load(r io.Reader) (*Session, error)
Session is a plain data struct with exported fields, easy to serialize and easy to replay. Persistence is JSON Lines on disk by default. Replay (feeding a saved session back into the loop) is a first-class operation, useful for debugging and for evaluations.
6.4 MCP Client
Thin wrapper over modelcontextprotocol/go-sdk. Responsibilities:
- Manage connections to one or more MCP servers concurrently.
- Aggregate tool catalogs across servers, namespaced by server name.
- Route tool calls to the right server.
- Reconnect on transport failure with bounded backoff.
- Expose resources and prompts published by each server to the loop.
// core/mcp/client.go
package mcp

import "context"

type Client struct { /* unexported */ }

type ServerConfig struct {
    Name      string
    Transport Transport // stdio, sse, http
    Endpoint  string
    Headers   map[string]string
}

// Signatures only; bodies are omitted in this spec.
func New(cfgs []ServerConfig, opts ...Option) (*Client, error)
func (c *Client) Connect(ctx context.Context) error
func (c *Client) Close() error
func (c *Client) Catalog() *Catalog
func (c *Client) Call(ctx context.Context, req ToolCall) (ToolResult, error)
func (c *Client) Resources(ctx context.Context, server string) ([]Resource, error)
func (c *Client) Prompts(ctx context.Context, server string) ([]Prompt, error)
Tool names sent to the model are namespaced as server__tool (double underscore separator) to keep them legal under all provider tool-name regexes while remaining trivially parseable.
6.5 Agent Loop
The single most important file in the project. It is short on purpose.
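// core/loop/loop.go
package loop

import (
    "context"
    "log/slog"

    "github.com/plexara/plexara-agents/core/approval"
    "github.com/plexara/plexara-agents/core/event"
    "github.com/plexara/plexara-agents/core/mcp"
    "github.com/plexara/plexara-agents/core/provider"
    "github.com/plexara/plexara-agents/core/router"
    "github.com/plexara/plexara-agents/core/session"
)

type Config struct {
    Provider provider.Provider
    MCP      *mcp.Client
    Router   router.Router     // nil means "all tools, no narrowing"
    Approver approval.Approver // nil means auto-approve
    Logger   *slog.Logger
    MaxSteps int // safety cap on tool-call iterations
}

func Run(ctx context.Context, cfg Config, sess *session.Session, userMessage string) (<-chan event.Event, error)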
The loop does this and only this:
- Append the user message to the session.
- Call Router.Narrow if configured, to pick a relevant subset of tools for this turn.
- Stream from the Provider with the narrowed tool set.
- For each event:
  - TextDelta: forward.
  - ToolCallRequest: ask the Approver, then dispatch to MCP, append the result to the session, and re-stream from the Provider.
  - Finish: forward and exit.
  - Error: forward and exit.
- Enforce MaxSteps so a misbehaving model cannot tool-loop indefinitely.
Cancellation, error wrapping, and structured logging happen here, not in providers.
6.6 Tool Router
The router is what makes a 30B local model tractable against a 30+ tool MCP surface. Without it, every turn carries the full catalog and small models drift.
// core/router/router.go
package router

import (
    "context"

    "github.com/plexara/plexara-agents/core/mcp"
)

type Router interface {
    Narrow(ctx context.Context, userMessage string, catalog *mcp.Catalog) ([]mcp.Tool, error)
}
Two implementations ship in v1:
- PassThrough: returns the full catalog. Good for small MCP servers and large models.
- ToolkitClassifier: a lightweight first pass against the same Provider that asks "which toolkits are likely needed?" then returns only those toolkits' tools. Plexara's datahub_*, trino_*, s3_*, and memory_* namespaces map cleanly to toolkits.
The classifier prompt and toolkit definitions live in docs/prompts/ so they are reviewable and version-controlled rather than buried in code.
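Once the classifier has chosen toolkits, the narrowing step itself is a prefix filter. The sketch below covers only that second half and assumes toolkits are identified by name prefix; the classification call to the model is not shown, and s3_list is a made-up tool name.

```go
package main

import (
	"fmt"
	"strings"
)

// Tool is a simplified stand-in for mcp.Tool.
type Tool struct{ Name string }

// narrow keeps only tools whose name starts with one of the selected toolkit
// prefixes, such as "datahub_" or "trino_". An empty selection keeps
// everything, which matches PassThrough behaviour.
func narrow(catalog []Tool, toolkits []string) []Tool {
	if len(toolkits) == 0 {
		return catalog
	}
	var out []Tool
	for _, t := range catalog {
		for _, prefix := range toolkits {
			if strings.HasPrefix(t.Name, prefix) {
				out = append(out, t)
				break
			}
		}
	}
	return out
}

func main() {
	catalog := []Tool{
		{Name: "datahub_search"},
		{Name: "datahub_get_schema"},
		{Name: "trino_query"},
		{Name: "s3_list"}, // hypothetical tool name
	}
	for _, t := range narrow(catalog, []string{"datahub_", "trino_"}) {
		fmt.Println(t.Name)
	}
}
```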
6.7 Approval
Mutating tool calls (writes, deletes, expensive queries) require explicit human approval by default for interactive binaries. Single-shot binaries can be configured to auto-approve, deny, or prompt out-of-band.
// core/approval/approval.go
package approval

import (
    "context"

    "github.com/plexara/plexara-agents/core/event"
)

type Decision int

const (
    Allow Decision = iota
    Deny
    AllowAll // for this session
)

type Approver interface {
    Approve(ctx context.Context, call event.ToolCallRequest) (Decision, error)
}
Standard implementations:
- AutoAllow: trust the MCP server's declared permissions.
- Interactive: TTY prompt, used by cmd/repl.
- Policy: rule-based, e.g. allow all datahub_* reads, prompt on trino_execute, deny on datahub_delete.
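A rule-based policy could be as simple as ordered glob rules. This is a sketch under assumptions: first match wins, patterns use path.Match glob syntax, and Prompt is a hypothetical extra decision that is not among the Decision values defined above.

```go
package main

import (
	"fmt"
	"path"
)

type Decision int

const (
	Allow Decision = iota
	Deny
	Prompt // hypothetical: hand the call to a human approver
)

// Rule pairs a glob pattern over tool names with a decision.
type Rule struct {
	Pattern  string
	Decision Decision
}

// decide returns the decision of the first matching rule, or fallback when
// nothing matches. Malformed patterns are skipped.
func decide(rules []Rule, tool string, fallback Decision) Decision {
	for _, r := range rules {
		if ok, err := path.Match(r.Pattern, tool); err == nil && ok {
			return r.Decision
		}
	}
	return fallback
}

func main() {
	// Specific rules go first because the first match wins.
	rules := []Rule{
		{Pattern: "datahub_delete", Decision: Deny},
		{Pattern: "trino_execute", Decision: Prompt},
		{Pattern: "datahub_*", Decision: Allow},
	}
	for _, tool := range []string{"datahub_search", "trino_execute", "datahub_delete"} {
		fmt.Println(tool, decide(rules, tool, Prompt))
	}
}
```

With first-match semantics the deny rule for datahub_delete has to precede the broader datahub_* allow rule, otherwise the delete would be allowed.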
7. Local Model Strategy
7.1 Target Model
Primary: Qwen3-30B-A3B at Q4_K_M quantization.
- Mixture of experts: 30B total, 3B active per token.
- Memory footprint around 17 to 18 GB at Q4, leaving room for context and OS on a 32 GB machine.
- Reliable tool-call emission in OpenAI-compatible format via Ollama or MLX.
- Strong enough at SQL generation to drive Trino through Plexara, especially with schema enrichment in context.
Validated alternatives (docs/models.md documents results):
- Qwen3-32B (dense): better single-pass reasoning, slower decode, similar memory.
- Mistral Small 3.2 (24B): solid generalist, slightly weaker at chained tool use.
- gpt-oss-20b: smaller footprint baseline.
Models below 14B parameters are not recommended for this agent. Their tool-call reliability falls off on chains longer than two or three steps, which is exactly the workload Plexara generates.
7.2 Runtime
Default: Ollama. One install command, OpenAI-compatible at localhost:11434/v1, ships on macOS, Linux, and Windows.
Documented alternative: mlx-lm's OpenAI-compatible server on Apple Silicon for roughly 1.5x to 2x faster decode at equivalent quants.
Both are reached through the same OpenAICompatible provider. Switching is a config change.
7.3 What the agent does NOT do
- Does not embed a tokenizer. Token counting for sliding-window truncation uses a length heuristic in v1 and a model-aware tokenizer (via a small HTTP /tokenize call to the runtime when available) in v1.1.
- Does not manage model downloads. ollama pull qwen3:30b-a3b-q4_K_M is a documented prerequisite, not a runtime concern.
- Does not implement speculative decoding, KV cache management, or other inference-layer concerns. Those belong in the runtime.
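The v1 length heuristic is not specified further here, so the sketch below assumes a common rule of thumb of roughly four characters per token. Keeping the first message, on the assumption that it is the system prompt, is likewise an illustrative choice.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// estimateTokens assumes about four characters per token, rounded up.
// The ratio is a rule of thumb, not a value taken from the spec.
func estimateTokens(s string) int {
	return (utf8.RuneCountInString(s) + 3) / 4
}

// truncate drops the oldest messages after the first one until the estimated
// total fits within maxTokens. The first message is always kept.
func truncate(msgs []string, maxTokens int) []string {
	if len(msgs) == 0 {
		return msgs
	}
	total := 0
	for _, m := range msgs {
		total += estimateTokens(m)
	}
	start := 1
	for total > maxTokens && start < len(msgs) {
		total -= estimateTokens(msgs[start])
		start++
	}
	return append([]string{msgs[0]}, msgs[start:]...)
}

func main() {
	msgs := []string{"sys", "aaaa aaaa", "bbbb", "cccc"}
	fmt.Println(truncate(msgs, 3))
}
```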
8. MCP Integration
8.1 Context Enrichment Is on the Server Side
Plexara MCP performs context enrichment as part of its protocol surface: tool descriptions carry domain context, datahub_get_schema and datahub_get_glossary_term exist as first-class tools, lineage is queryable, and resources expose curated views over the catalog. The agent in this project does not implement enrichment. It consumes whatever the MCP server presents.
This is the right division of responsibility. The MCP server has the catalog, the lineage graph, the access policies, and the domain knowledge. Putting enrichment logic in the agent would duplicate it and couple the agent to one MCP's data model.
What the agent does instead:
- Surfaces the full enriched tool catalog to the model after toolkit narrowing, so descriptions written by Plexara reach the model verbatim.
- Calls discovery tools eagerly when the toolkit classifier suggests they are relevant. For Plexara that means datahub_search and datahub_get_schema are likely first calls when the user asks about data, before the model attempts a trino_query.
- Exposes MCP resources and prompts to the loop so the model can pull curated context (resource reads, prompt templates) when the server offers them.
The Plexara examples make this concrete. examples/acme-revenue/ shows a turn where the model issues datahub_search then datahub_get_schema before writing SQL, because the toolkit classifier surfaced both tools and Plexara's tool descriptions made their purpose obvious. No agent-side enrichment code is required for that to work.
8.2 Tool Routing
See section 6.6. Tool routing is the agent's responsibility because it depends on the user message and the conversation, not on the MCP server's catalog alone. Once narrowed, the agent passes the selected tools (with their server-supplied descriptions intact) to the model.
8.3 SQL Safety Pattern
For Plexara and any MCP exposing SQL execution, the agent ships with an optional SQLValidator middleware that:
- Intercepts *_query and *_execute tool calls.
- Calls the corresponding *_explain tool first.
- If EXPLAIN fails, returns the error to the model as a tool result so the model self-corrects.
- If EXPLAIN succeeds, lets the original call proceed.
This costs one extra round trip and dramatically reduces the rate at which local models produce broken queries. It is opt-in but enabled by default in the Plexara examples.
This is a client-side pattern, not enrichment. The MCP server already provides _explain; the agent just orchestrates the two-step.
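The two-step can be sketched as a wrapper around whatever function dispatches tool calls. The callFunc type, the explainFirst name, and the fake server are hypothetical; the real SQLValidator interface is not defined in this spec.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// callFunc is a stand-in for dispatching a tool call to an MCP server.
type callFunc func(tool, sql string) (string, error)

// explainFirst wraps next so that *_query and *_execute calls are preceded by
// the matching *_explain call. An EXPLAIN failure is returned as tool result
// text, not as a Go error, so the model can read it and retry.
func explainFirst(next callFunc) callFunc {
	return func(tool, sql string) (string, error) {
		prefix, ok := sqlToolPrefix(tool)
		if !ok {
			return next(tool, sql)
		}
		if _, err := next(prefix+"_explain", sql); err != nil {
			return "EXPLAIN failed: " + err.Error(), nil
		}
		return next(tool, sql)
	}
}

// sqlToolPrefix returns "trino" for "trino_query" or "trino_execute".
func sqlToolPrefix(tool string) (string, bool) {
	for _, suffix := range []string{"_query", "_execute"} {
		if p, ok := strings.CutSuffix(tool, suffix); ok {
			return p, true
		}
	}
	return "", false
}

// fakeServer rejects SQL containing the typo "FORM" and records which tools
// were invoked.
func fakeServer(executed *[]string) callFunc {
	return func(tool, sql string) (string, error) {
		*executed = append(*executed, tool)
		if strings.Contains(sql, " FORM ") {
			return "", errors.New("syntax error near FORM")
		}
		return "ok", nil
	}
}

func main() {
	var executed []string
	call := explainFirst(fakeServer(&executed))
	out, err := call("trino_query", "SELECT 1 FORM t")
	fmt.Println(out, err, executed)
}
```

In the failing case only trino_explain reaches the server; the broken query is never executed.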
8.4 Tool-Call Streaming Discipline
The agent never parses partial tool-call JSON. The OpenAICompatible provider buffers tool-call deltas internally and emits a ToolCallRequest event only when the runtime signals the call is complete (finish_reason: tool_calls or equivalent). This is non-negotiable. Half-parsed tool calls are the single most common source of agent flakiness in the wild.
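A sketch of that buffering discipline follows. The delta struct is a simplified assumption about OpenAI-style streamed tool-call fragments (in the real wire format the finish reason sits on the choice, not on the fragment), and assemble is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// delta is one streamed tool-call fragment, reduced to the fields used here.
type delta struct {
	Index        int
	Name         string // usually present only on the first fragment
	ArgsFragment string
	FinishReason string // "tool_calls" on the final chunk
}

type call struct {
	Name string
	Args json.RawMessage
}

type pending struct {
	name string
	args strings.Builder
}

// assemble buffers fragments per index and returns calls only if the stream
// signalled completion and every argument string is valid JSON.
func assemble(deltas []delta) ([]call, error) {
	buf := map[int]*pending{}
	var order []int
	finished := false
	for _, d := range deltas {
		p, ok := buf[d.Index]
		if !ok {
			p = &pending{}
			buf[d.Index] = p
			order = append(order, d.Index)
		}
		if d.Name != "" {
			p.name = d.Name
		}
		p.args.WriteString(d.ArgsFragment)
		if d.FinishReason == "tool_calls" {
			finished = true
		}
	}
	if !finished {
		return nil, errors.New("stream ended before tool calls completed")
	}
	calls := make([]call, 0, len(order))
	for _, i := range order {
		p := buf[i]
		if !json.Valid([]byte(p.args.String())) {
			return nil, fmt.Errorf("tool %q: arguments are not complete JSON", p.name)
		}
		calls = append(calls, call{Name: p.name, Args: json.RawMessage(p.args.String())})
	}
	return calls, nil
}

func main() {
	calls, err := assemble([]delta{
		{Index: 0, Name: "trino_query", ArgsFragment: `{"sql":`},
		{Index: 0, ArgsFragment: `"SELECT 1"}`, FinishReason: "tool_calls"},
	})
	fmt.Println(len(calls), err)
}
```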
9. CLI Surfaces
9.1 cmd/ask
Single-shot. One question in, streamed answer out, exit.
ask --model qwen3:30b-a3b --mcp plexara-acme \
"Top five products by revenue in the West region last quarter"
Flags:
--model: model name passed to the runtime.
--mcp: named MCP config from ~/.config/plexara-agents/config.yaml, or a path.
--no-router: disable toolkit narrowing, send the full catalog every turn.
--auto-approve: bypass approval prompts (for scripted use).
--session FILE: append to or replay a session file.
--json: structured event output for piping into tooling.
9.2 cmd/repl
Interactive. Multi-turn session with approval prompts, history, and slash commands.
Slash commands (planned):
- /tools: list narrowed tools for the next turn
- /catalog: show full catalog from all connected MCPs
- /save FILE, /load FILE: session persistence
- /explain: show the last tool call's parameters and result
- /prompt: print the assembled system prompt for the next turn
Implementation uses bubbletea for the TUI layer. This pulls in a real dependency, but bubbletea is the de facto Go TUI library and the alternative (raw terminal handling) is not where reference-quality code should be spent.
10. Configuration
YAML, located at ~/.config/plexara-agents/config.yaml by default, overridable with --config.
defaults:
model: qwen3:30b-a3b
provider: ollama-local
router: toolkit-classifier
approval: interactive
providers:
ollama-local:
type: openai-compatible
base_url: http://localhost:11434/v1
api_key_env: OLLAMA_API_KEY # optional, ignored if unset
mlx-local:
type: openai-compatible
base_url: http://localhost:8080/v1
mcp_servers:
plexara-acme:
transport: http
endpoint: https://mcp-demo.plexara.io
Configuration values are also overridable via flags and environment variables. Precedence: flag > env > config file > built-in default. No silent overrides.
11. Observability
Structured logging via log/slog, JSON by default, text in TTYs.
Every tool call emits a log line with: server, tool name, argument digest (hashed, not raw), latency, success or error class. Resource fetches and calls that fetch server-side enrichment (section 8.1) do the same.
Optional OpenTelemetry tracing behind a build tag. Off by default to keep the dependency tree light. When on, the agent loop emits one span per turn with child spans for enrichment fetches, provider streaming, and each tool call.
A --debug flag dumps the full assembled system prompt, the narrowed tool list (with the MCP server's descriptions), and the messages sent to the Provider to stderr before each turn. This is the single most useful debugging affordance.
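A per-call log line with a hashed argument digest might look like the sketch below. The field names and the choice of a truncated SHA-256 are assumptions; the spec only says the digest is hashed, not raw.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"log/slog"
	"os"
	"time"
)

// digest returns a short hex prefix of the SHA-256 of the raw arguments, so
// identical calls can be correlated without logging their contents.
func digest(args []byte) string {
	sum := sha256.Sum256(args)
	return hex.EncodeToString(sum[:8])
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stderr, nil))
	args := []byte(`{"sql":"SELECT 1"}`)
	start := time.Now()
	// ... the tool call would happen here ...
	logger.Info("tool_call",
		slog.String("server", "plexara-acme"),
		slog.String("tool", "trino_query"),
		slog.String("args_digest", digest(args)),
		slog.Duration("latency", time.Since(start)),
		slog.String("outcome", "success"),
	)
}
```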
12. Error Handling
Conventions:
- All errors wrap with fmt.Errorf("%w", ...) and never lose the original.
- Sentinel errors live next to the package that owns them (mcp.ErrServerUnavailable, provider.ErrModelNotFound, etc.).
- The agent loop converts internal errors into event.Error for streaming consumers, but also returns them from Run for callers that want to handle errors imperatively.
- Network and tool errors are not fatal by default. The model sees the error as a tool result and is free to recover. Programmer errors (bad config, missing model) are fatal.
13. Testing
Three layers, plus fuzzing.
Unit tests. Each package, table-driven where it makes sense. The Provider interface has a provider/testing.Fake implementation that lets the loop, router, and approval be tested without any network or model. All tests run with -race and -shuffle=on in CI.
Integration tests. Run against a real MCP server and a real local model. Gated behind a build tag (//go:build integration) and an INTEGRATION=1 env var. Run on a self-hosted CI runner with Ollama and Qwen3 pre-pulled, and locally during development. Never block the standard PR pipeline.
Replay tests. Saved sessions in testdata/sessions/ are replayed against a recorded Provider transcript. This catches regressions in the loop, router, and tool dispatch without needing a live model or network. New examples must ship with at least one replay test; CI enforces this.
Fuzz tests. Native Go fuzzing for parsers and serializers: tool-name namespacing, event JSON marshaling, MCP server response handling, session file decoding. Fuzz corpora committed under testdata/fuzz/. CI runs short fuzz cycles per PR; longer scheduled runs catch regressions over time.
Coverage is measured with -covermode=atomic and uploaded to Codecov. Quality gates are defined in section 14.
14. CI, Security, and Repository Standards
This is an OSS reference project. The CI surface, supply-chain posture, and repository hygiene are part of what's being demonstrated. Standards align with what mature projects in the same space ship (kubefwd's pipeline is the reference baseline).
14.1 Repository Hygiene
Files committed at the repo root or under .github/:
LICENSE (Apache 2.0).
NOTICE if any third-party attribution is required.
README.md with status badges (build, coverage, Go Report Card, Scorecard, license, latest release).
CONTRIBUTING.md describing local dev setup, commit conventions, and the PR process.
CODE_OF_CONDUCT.md (Contributor Covenant 2.1).
SECURITY.md with a vulnerability disclosure policy, supported-versions matrix, and a security contact (security@plexara.io or equivalent). GitHub Private Vulnerability Reporting enabled.
CODEOWNERS mapping directories to maintainers.
.github/ISSUE_TEMPLATE/ with bug, feature, and security-redirect templates.
.github/PULL_REQUEST_TEMPLATE.md.
.github/dependabot.yml configured for gomod, github-actions, and docker (if applicable).
Branch protection on main:
- Required PR review (at least one approver, code owner review for owned paths).
- Required status checks: build, test, lint, security, codeql, govulncheck, dependency-review.
- Linear history required (squash or rebase, no merge commits).
- Signed commits required.
- Force-pushes blocked.
- Auto-merge enabled for Dependabot PRs that pass all checks.
14.2 Continuous Integration
Workflows under .github/workflows/:
ci.yml: build, vet, lint, test (race + coverage), go mod verify, go mod tidy -diff. Runs on every PR and every push to main.
security.yml: gosec, govulncheck, Semgrep. Runs on every PR and on a weekly schedule.
codeql.yml: GitHub CodeQL with the security-extended and security-and-quality query suites. Runs on every PR, every push to main, and weekly.
scorecard.yml: OpenSSF Scorecard. Runs weekly and on main pushes; uploads SARIF to the security tab and publishes results.
dependency-review.yml: blocks PRs that introduce dependencies with known vulnerabilities or non-permissive licenses.
release.yml: triggered on v*.*.* tags; runs GoReleaser, generates SBOMs, signs artifacts, attaches SLSA provenance.
fuzz.yml: runs Go fuzz targets for an extended cycle (e.g., 5 minutes per target) on a nightly schedule. Failures open issues automatically.
Jobs run on the latest stable Ubuntu runner image by default. Additional macOS jobs cover the Apple Silicon developer path for the Ollama/MLX integration tests.
14.3 Test and Coverage Gate
Standard test invocation in CI:
go test -race -shuffle=on -count=1 \
-covermode=atomic -coverprofile=coverage.out \
./...
- Coverage uploaded to Codecov on every CI run. Codecov badge in README.
- Coverage gate: more than 80% of statements under core/... must be covered. PRs that drop coverage below the threshold fail CI.
- cmd/... and examples/... are exercised but excluded from the gate; they exist primarily as worked examples and integration scaffolding.
- -race is mandatory in CI and recommended locally.
- -shuffle=on is used to surface order-dependent tests.
- Replay tests live under testdata/ and run as part of the standard suite.
- Integration tests gated behind the integration build tag run only on the self-hosted runner.
14.4 Build
go build runs as its own step before the tests, so compile errors fail fast before the longer test suite starts. Build verification includes:
- go build ./... for every supported platform via a matrix (darwin/arm64, darwin/amd64, linux/amd64, linux/arm64).
- go vet ./....
- go mod verify.
- go mod tidy -diff to confirm go.mod and go.sum are clean (no hidden drift).
- gofmt -l . to fail on unformatted files.
- Build flags for release artifacts: -trimpath and -ldflags "-s -w", on release binaries only (debug builds keep symbols).
- CGO disabled (CGO_ENABLED=0) for portability.
14.5 Linting
golangci-lint with a comprehensive, opinionated configuration in .golangci.yml. Enabled linters (in addition to the default set):
errcheck, errorlint, errname, govet, ineffassign, staticcheck, unused, gosimple, gofmt, goimports, misspell, revive, gocritic, gocyclo, gocognit, gosec, prealloc, unconvert, unparam, copyloopvar, intrange, nilerr, nilnil, contextcheck, durationcheck, exhaustive, gomoddirectives, gomodguard, importas, predeclared, whitespace, godot, dupl, nolintlint.
Specific rules:
gocyclo cyclomatic complexity threshold 15.
gocognit cognitive complexity threshold 20.
dupl set to flag genuine duplication, with a high enough threshold that table-driven test rows don't trip it.
nolintlint enforces that every //nolint: directive includes a reason.
gosec runs in lint mode here; a separate gosec job (section 14.6) runs with stricter settings.
The lint job fails CI on any new finding. Existing legitimate exceptions are listed inline with //nolint:<linter> // <reason>.
14.6 Security Scanning
Multiple complementary scanners. They overlap intentionally; coverage gaps in one are filled by another.
- gosec: dedicated job using the securego/gosec GitHub Action with the full ruleset. Findings are uploaded as SARIF to the security tab.
- govulncheck: the official golang.org/x/vuln/cmd/govulncheck, run against ./... on every CI run. Failures block the merge.
- Semgrep: returntocorp/semgrep-action with rulesets p/security-audit, p/secrets, p/golang, p/owasp-top-ten. Findings posted as PR comments and uploaded as SARIF.
- CodeQL: github/codeql-action with the go language and the security-extended and security-and-quality query suites.
- Trivy: aquasecurity/trivy-action filesystem scan for misconfigurations and secrets. Container scan added when the project starts publishing images.
All scanner outputs go through GitHub's SARIF interface so findings are visible in the security tab and surfaceable in PR review.
14.7 Supply Chain Security
- OpenSSF Scorecard: ossf/scorecard-action weekly. Target score >=8.0. Score badge in README.
- SLSA Level 3 provenance: slsa-framework/slsa-github-generator produces signed provenance for every release artifact.
- SBOM: generated via anchore/sbom-action (Syft) in both CycloneDX and SPDX formats; attached to every GitHub release.
- Cosign keyless signing: every release artifact, every container image, every SBOM signed via Sigstore OIDC. Verification commands documented in SECURITY.md.
- Reproducible builds: -trimpath, fixed module cache, frozen BUILD_ID from the tag's commit. Documented procedure for third-party reproduction.
- License scan: google/go-licenses blocks PRs that introduce dependencies with non-permissive licenses (anything not in an allowlist of MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, MPL-2.0, etc.).
14.8 Releases
Releases are tag-driven. Cutting vX.Y.Z triggers release.yml, which runs GoReleaser.
- Strict semantic versioning. Pre-1.0 tags signal API instability.
- GoReleaser config at .goreleaser.yaml.
- Cross-compiled binaries: darwin/arm64, darwin/amd64, linux/amd64, linux/arm64, windows/amd64.
- Each binary signed with cosign and accompanied by a .sig and a .cert.
- SBOMs (CycloneDX and SPDX) attached.
- SLSA Level 3 provenance attached.
- Conventional Commits enforced for changelog generation. PR titles validated by a CI check using commitlint or equivalent.
- Container images (when introduced) published to ghcr.io/plexara/plexara-agents with cosign signature and SBOM.
- Homebrew tap formula updated automatically by GoReleaser for the ask and repl binaries.
14.9 Dependency Management
- Dependabot for gomod and github-actions, weekly schedule, grouped updates for non-major bumps.
- Renovate considered as an alternative; for v1 stick with Dependabot to keep the toolchain native to GitHub.
- govulncheck provides the safety net for transitive vulnerabilities Dependabot cannot reach.
- No vendoring. Modules are resolved from the proxy and verified via go.sum and go mod verify.
14.10 Action Pinning
Every third-party GitHub Action pinned to a full commit SHA, with the human-readable version as a trailing comment:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
- uses: golangci/golangci-lint-action@aaa42aa0628b4ae2578232a66b541047968fac86 # v6.1.0
This is non-negotiable: SHA pinning is what closes the supply-chain attack surface that tag-pinning leaves open. Dependabot updates the SHAs on its weekly cadence; reviewers verify the new SHA points at the same release the comment claims.
First-party actions (actions/*, github/*) follow the same rule. The Scorecard Pinned-Dependencies check enforces this; missing pins fail Scorecard.
14.11 Pre-commit and Local Tooling
- .pre-commit-config.yaml with hooks: trailing-whitespace, end-of-file-fixer, check-yaml, check-json, check-added-large-files, mixed-line-ending.
- Local make targets that mirror CI: make build, make test, make lint, make sec, make cover, make tidy. Developers run these before pushing.
- A tools.go file under internal/tools/ pins versions of the dev-only tools (golangci-lint, goimports, govulncheck) so that go install produces reproducible local toolchains.
- pre-commit hooks are a developer convenience; CI is the source of truth.
14.12 Frontend Build (where applicable)
For v1, no frontend exists. The core library is consumed by terminal binaries.
When the hosted web variant lands (separate repository, see section 19), the standards inherited and enforced are:
- TypeScript with "strict": true and noUncheckedIndexedAccess.
- Vite or equivalent for build, with type-only imports enforced.
- ESLint with @typescript-eslint/strict-type-checked and security plugins.
- Prettier for formatting.
- tsc --noEmit typecheck as a CI gate.
- Dependency scanning via npm audit, osv-scanner, and Dependabot.
- Same CodeQL, Semgrep, Scorecard, and Cosign posture as the Go side.
This section is documented now so the future repository is set up correctly from day one.
14.13 Badges
Badges in README.md, in this order:
- CI status
- Coverage (Codecov)
- Go Report Card
- OpenSSF Scorecard
- Latest release
- License
- Go reference (pkg.go.dev)
- Slack/Discord/Discussions (if community channels exist)
15. Coding Standards
- Go 1.26+. We use range-over-func iterators where they earn their keep, generics where they remove duplication, and neither for show.
- gofmt, goimports, and golangci-lint clean. Lint config in .golangci.yml, conservative rule set.
- Public APIs documented with full sentences. Private functions documented when they are non-obvious, not as a rule.
- Interfaces defined at the consumer, not the producer. The Provider interface lives in core/provider because the loop, the router, and tests all consume it; defining it inside core/loop would be wrong. When in doubt, define small.
- No package-level mutable state. Constructors return values; lifetimes are explicit.
- context.Context is the first parameter of any function that does I/O. No exceptions.
- Errors are values; panics are bugs. The agent loop recovers from panics in tool handlers and surfaces them as event.Error, but does not recover from panics in the Provider or core. Those crash by design.
- internal/ is used freely. If a package is not meant to be imported by downstream consumers, it goes there.
Idiom we are deliberately avoiding
Functional options for the Provider and MCP client are appealing but cost readability. We use plain config structs with sensible zero values instead. Functional options are reserved for places where a long tail of optional knobs really exists (the agent loop's Config).
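As an illustration of the config-struct style, here is a minimal sketch. The ProviderConfig fields, the 60-second timeout default, and the NewProvider constructor are assumptions made for this example; only the Ollama endpoint URL comes from this spec.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ProviderConfig is a hypothetical config struct. Zero values of the optional
// fields mean "use the default", so callers set only what they care about.
type ProviderConfig struct {
	Model   string        // required; there is no sensible default
	BaseURL string        // empty means the local Ollama endpoint
	Timeout time.Duration // zero means 60 seconds (an assumed default)
}

// withDefaults returns a copy with zero-valued optional fields filled in.
func (c ProviderConfig) withDefaults() ProviderConfig {
	if c.BaseURL == "" {
		c.BaseURL = "http://localhost:11434/v1"
	}
	if c.Timeout == 0 {
		c.Timeout = 60 * time.Second
	}
	return c
}

// Provider is a stand-in for the real adapter; it only stores its config here.
type Provider struct{ cfg ProviderConfig }

// NewProvider validates required fields and applies defaults.
func NewProvider(cfg ProviderConfig) (*Provider, error) {
	if cfg.Model == "" {
		return nil, errors.New("provider: Model is required")
	}
	return &Provider{cfg: cfg.withDefaults()}, nil
}

func main() {
	p, err := NewProvider(ProviderConfig{Model: "qwen3:30b-a3b-q4_K_M"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.cfg.BaseURL, p.cfg.Timeout) // http://localhost:11434/v1 1m0s
}
```

The call site reads as data rather than as a chain of option functions, which is the readability trade this section argues for.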
16. Dependencies
16.1 Runtime
A small list, intentionally:
- github.com/modelcontextprotocol/go-sdk: MCP client (official, jointly maintained by Anthropic and Google).
- github.com/charmbracelet/bubbletea: REPL TUI (only pulled in by cmd/repl, not by core).
- gopkg.in/yaml.v3: config parsing.
- golang.org/x/sync/errgroup: concurrent MCP server connection management.
That's it for core. The OpenTelemetry path adds dependencies behind a build tag.
No HTTP framework. The standard library is fine for the OpenAI-compatible client and any future server.
No agent framework. We are the agent framework.
16.2 Developer and CI Tooling
Pinned via internal/tools/tools.go (the standard //go:build tools pattern) so go install produces reproducible local toolchains:
- github.com/golangci/golangci-lint/cmd/golangci-lint
- golang.org/x/tools/cmd/goimports
- golang.org/x/vuln/cmd/govulncheck
- github.com/securego/gosec/v2/cmd/gosec
- github.com/google/go-licenses
- github.com/anchore/syft/cmd/syft (SBOM, used by GoReleaser)
- github.com/sigstore/cosign/v2/cmd/cosign (signing, used by release workflow)
CI installs these from the same pinned versions. Local make tools produces an identical toolchain.
17. Plexara Demo Workflows
examples/acme-revenue/ is the headline. It demonstrates:
- User question: natural language revenue query.
- Toolkit classifier narrows to datahub_* and trino_*.
- Model sees Plexara's enriched tool descriptions and chooses to call datahub_search and datahub_get_schema to ground itself before writing SQL.
- Model issues trino_explain (via the SQL safety wrapper), then trino_query.
- Result formatted as a small table in the response.
- Saved session replayable as a regression test.
The point of this example is to show that an unmodified, generic agent driving a richly enriched MCP produces good results. The MCP earns its keep; the agent stays small.
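The narrowing step in this workflow can be pictured as a prefix filter over the catalog once the classifier has picked its toolkits. The sketch below assumes a simplified Tool type and a narrowByToolkit helper, both hypothetical; s3_list_objects and memory_recall are made-up tool names standing in for the other namespaces.

```go
package main

import (
	"fmt"
	"strings"
)

// Tool is a minimal stand-in for an MCP tool definition.
type Tool struct {
	Name        string
	Description string
}

// narrowByToolkit keeps the tools whose names start with "<toolkit>_" for any
// selected toolkit. Catalog order and server-supplied descriptions are preserved.
func narrowByToolkit(catalog []Tool, toolkits []string) []Tool {
	var out []Tool
	for _, t := range catalog {
		for _, tk := range toolkits {
			if strings.HasPrefix(t.Name, tk+"_") {
				out = append(out, t)
				break
			}
		}
	}
	return out
}

func main() {
	catalog := []Tool{
		{Name: "datahub_search"},
		{Name: "datahub_get_schema"},
		{Name: "trino_explain"},
		{Name: "trino_query"},
		{Name: "s3_list_objects"}, // hypothetical tool name
		{Name: "memory_recall"},   // hypothetical tool name
	}
	// Pretend the classifier answered "datahub, trino" for a revenue question.
	for _, t := range narrowByToolkit(catalog, []string{"datahub", "trino"}) {
		fmt.Println(t.Name)
	}
}
```

Run as written, this prints the four datahub_* and trino_* tools in catalog order and drops the rest.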
examples/acme-lineage/ demonstrates the lineage walk: a question about downstream impact triggers datahub_get_lineage calls, the model assembles a small graph in its response, and Plexara's glossary tools (datahub_get_glossary_term) come into play when terms need defining. Again, the agent does not coordinate this; the model drives it because Plexara's tool descriptions make the path obvious.
examples/multi-mcp/ connects to two MCP servers at once (Plexara ACME and a second small server, possibly Filesystem from the official examples) and demonstrates that namespacing and routing work cleanly across servers.
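A minimal sketch of the double-underscore namespacing that the multi-server example depends on. The qualify and split helpers and the "acme" server name are hypothetical, and the rule that server names may not contain "__" or end in "_" is an assumption added here so that the split stays unambiguous.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// sep is the separator between server name and tool name.
const sep = "__"

// qualify builds the model-facing name "server__tool".
func qualify(server, tool string) (string, error) {
	if server == "" || tool == "" {
		return "", errors.New("namespace: server and tool must be non-empty")
	}
	// Assumed rule: these server names would make the first "__" ambiguous.
	if strings.Contains(server, sep) || strings.HasSuffix(server, "_") {
		return "", fmt.Errorf("namespace: server name %q would make the split ambiguous", server)
	}
	return server + sep + tool, nil
}

// split reverses qualify by cutting at the first separator, so tool names
// may themselves contain underscores.
func split(qualified string) (string, string, error) {
	server, tool, found := strings.Cut(qualified, sep)
	if !found || server == "" || tool == "" {
		return "", "", fmt.Errorf("namespace: %q is not of the form server%stool", qualified, sep)
	}
	return server, tool, nil
}

func main() {
	q, err := qualify("acme", "datahub_get_schema")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(q) // acme__datahub_get_schema

	server, tool, err := split(q)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(server, tool) // acme datahub_get_schema
}
```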
Each example has a README, a main.go that's small enough to read in one sitting, and a recorded session for replay testing.
18. Roadmap
v0.1 (initial public release)
Code:
- core/event, core/provider/openai_compatible, core/mcp, core/session, core/loop, core/router/{passthrough,toolkit_classifier}, core/approval.
- cmd/ask.
- examples/acme-revenue with replay test.
Documentation:
- README with full badge set.
- Architecture doc.
- One ADR (provider model choice and the decision to ship only openai_compatible in v1).
- CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md, CODEOWNERS, issue and PR templates.
CI and supply chain (the entire section 14 baseline):
- ci.yml, security.yml, codeql.yml, scorecard.yml, dependency-review.yml, release.yml, fuzz.yml.
- All third-party actions SHA-pinned with version comments.
- Coverage gate at >80% for core/..., Codecov upload.
- gosec, govulncheck, Semgrep, CodeQL, Trivy fs scan all green.
- OpenSSF Scorecard >=8.0.
- GoReleaser config with cosign signing, SBOM, SLSA Level 3 provenance.
- Dependabot configured for gomod and github-actions.
- Branch protection on main enforced.
- Pre-commit config and Makefile.
v0.2
- cmd/repl with TUI and slash commands.
- SQL safety middleware.
- examples/acme-lineage, examples/multi-mcp.
- Documented MLX runtime path.
- Session save/load.
v0.3
- Policy approver.
- Token-aware session truncation via runtime /tokenize.
- OpenTelemetry tracing behind a build tag.
v1.0
- Documented stability for the core API.
- Comprehensive ADRs for every significant design decision.
- Benchmarks comparing Qwen3-30B-A3B against alternatives on the ACME workflows.
Post-1.0
A separate repository (plexara-agents-server or similar) builds a multi-user web service on top of core. That is a different project with a different deployment story; this one stays single-user, local-first, terminal-native.
19. Future: Hosted Multi-User
Out of scope for this repository, but worth recording the shape so core does not paint itself into a corner.
When the hosted variant is built, core will be imported unchanged. The differences are at the edges:
- Runtime: vLLM or sglang behind an OpenAI-compatible endpoint, batching concurrent sessions for throughput.
- Hardware: a single L40S (48 GB) or dual RTX 5090s is a reasonable starting point.
- Concurrency: one loop.Run call per HTTP session, with sessions persisted to Postgres or similar.
- Approval: policy-based (no human at the terminal), with sensitive operations rejected outright or escalated to an out-of-band review.
- Auth: per-tenant MCP server selection; each user's mcp.Client is built from their authorized server list.
If the v1 core API supports all of that without modification, we did our job.
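To make the policy-based approval edge concrete, here is a minimal sketch of a rule-based approver with no human in the loop. The Decision, Rule, and Policy types are hypothetical, the glob matching via path.Match is an implementation choice made for this example, and s3_put_object is a made-up tool name. Unmatched tools are denied by default.

```go
package main

import (
	"fmt"
	"path"
)

// Decision is the outcome of evaluating a tool call against the policy.
type Decision int

const (
	Deny     Decision = iota // zero value: anything unmatched is rejected
	Allow                    // run the tool call without review
	Escalate                 // hand off to an out-of-band review
)

func (d Decision) String() string {
	switch d {
	case Allow:
		return "allow"
	case Escalate:
		return "escalate"
	default:
		return "deny"
	}
}

// Rule maps a glob pattern over tool names to a decision.
type Rule struct {
	Pattern  string
	Decision Decision
}

// Policy is an ordered rule list; the first matching rule wins.
type Policy struct{ Rules []Rule }

// Decide returns the decision for a tool name, or Deny when nothing matches.
func (p Policy) Decide(tool string) (Decision, error) {
	for _, r := range p.Rules {
		ok, err := path.Match(r.Pattern, tool)
		if err != nil {
			return Deny, fmt.Errorf("policy: bad pattern %q: %w", r.Pattern, err)
		}
		if ok {
			return r.Decision, nil
		}
	}
	return Deny, nil
}

func main() {
	p := Policy{Rules: []Rule{
		{Pattern: "datahub_delete", Decision: Deny}, // specific rules go first
		{Pattern: "trino_execute", Decision: Escalate},
		{Pattern: "datahub_*", Decision: Allow},
	}}
	for _, tool := range []string{"datahub_search", "datahub_delete", "trino_execute", "s3_put_object"} {
		d, err := p.Decide(tool)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(tool, "->", d)
	}
}
```

Because rules are evaluated in order, the specific datahub_delete rule has to precede the broader datahub_* rule; reversing them would allow the delete.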
20. Open Questions
These are deliberately unresolved and should be settled before implementation begins.
- Toolkit classifier prompt. The first version will be hand-tuned for Plexara's namespaces. Should a generic toolkit-classification prompt ship in core, or should each MCP user supply their own? Leaning toward "shipped generic, override per MCP."
- Tool-name namespacing separator. Double underscore (server__tool) is safe but ugly. Other MCP clients have used colon and dot. Once chosen, the separator is hard to change, so it is worth surveying emerging community conventions before committing.
- Resource handling. MCP resources are first-class but the model has no native way to consume "resource references" mid-stream. v1 will inline small resources into the system prompt and link larger ones; v2 may grow a richer resource handler. Worth an ADR.
- Persistence format. JSON Lines for sessions is simple and tooling-friendly but verbose. Some hybrid (JSONL for messages, sidecar for metadata) may end up cleaner. Decide before v0.2.
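For the persistence question, the plain JSON Lines option can be sketched in a few lines of standard-library Go. The Message type and the writeJSONL, readJSONL, and roundTrip helpers are hypothetical; the real session schema is not defined in this section.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// Message is a simplified stand-in for one session entry.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// writeJSONL writes one JSON object per line.
func writeJSONL(w io.Writer, msgs []Message) error {
	enc := json.NewEncoder(w) // Encode appends a newline after each value
	for _, m := range msgs {
		if err := enc.Encode(m); err != nil {
			return err
		}
	}
	return nil
}

// readJSONL decodes objects until the stream ends.
func readJSONL(r io.Reader) ([]Message, error) {
	var msgs []Message
	dec := json.NewDecoder(r)
	for {
		var m Message
		err := dec.Decode(&m)
		if errors.Is(err, io.EOF) {
			return msgs, nil
		}
		if err != nil {
			return nil, err
		}
		msgs = append(msgs, m)
	}
}

// roundTrip writes msgs to an in-memory buffer and reads them back.
func roundTrip(msgs []Message) ([]Message, error) {
	var buf bytes.Buffer
	if err := writeJSONL(&buf, msgs); err != nil {
		return nil, err
	}
	return readJSONL(&buf)
}

func main() {
	got, err := roundTrip([]Message{
		{Role: "user", Content: "What was Q3 revenue?"},
		{Role: "assistant", Content: "Checking the catalog first."},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(got), got[0].Role, got[1].Role) // 2 user assistant
}
```

The hybrid layout mentioned above would keep this message stream as is and move session-level metadata into a separate sidecar file.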
21. References
- modelcontextprotocol/go-sdk: https://github.com/modelcontextprotocol/go-sdk
- txn2/mcp-data-platform: the Plexara MCP server reference implementation.
- ml-explore/mlx-examples repository.