feat: accept seed_uri via host-resources SDK (phase 2c) #8
Merged
Add an optional `seed_uri` parameter to `start_research`. When set, the server resolves the URI via the `ai.nimblebrain/host-resources` extension (using `nimblebrain-bundle-sdk`'s `host(ctx).read()`) and prepends the file content to the research prompt.

Closes the original production bug: synapse-research can now anchor research on a workspace file (brokers.csv, transcript, doc) the agent points it at, instead of either receiving content inlined in the query (large + ugly) or doing web research from scratch (the bug we hit).

## Behaviour

- `seed_uri="files://fl_abc"`: server reads the file, prepends content under a `## Workspace seed` header in the prompt, records `seed_content_chars` on the `research_run` entity.
- Seeds above 400 KB are truncated with a visible marker so the model and the human reader both see the cut happened.
- Binary resources (no `text` field) are refused with a clear error telling the agent to extract text upstream.
- Hosts that don't advertise the host-resources extension surface a `HostCapabilityMissing` error naming the capability, so the agent knows to retry with content inline via its own file-reading tool (the Level-C fallback pattern from the host-resources design).

## SDK source

Pins `nimblebrain-bundle-sdk>=0.1.0`. Until the SDK lands on PyPI, the `[tool.uv.sources]` block sources from the local path at `../../products/nimblebrain/code/packages/bundle-sdk-py/` (relative to this pyproject.toml; requires the `hq` meta-repo cloned alongside synapse-research, the standard NimbleBrain dev layout). Drop that block once the package is on PyPI.

**Ordering:** this PR depends on `NimbleBrainInc/nimblebrain#268` (the SDK package) merging first. Otherwise the local path doesn't exist and `uv sync` fails.

## Tests

48 passing (43 existing + 5 new). The new file covers:

- happy path: seed content reaches the worker's gpt-researcher prompt
- entity records `seed_content_chars` correctly
- Level-C: `HostCapabilityMissing` surfaces with the capability name
- binary refusal: clear error directing the agent to extract upstream
- backward-compat: omitting `seed_uri` is a no-op
- SDK importability smoke

Tests mock at the `host()` factory boundary because the Python `mcp.client.ClientSession` rejects custom-method server→client requests (its `ServerRequest` union is closed). Wire-shape validation lives in the SDK's own unit tests and the platform's TS-side handler tests, both of which already exist.
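The prepend-and-truncate behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the marker wording, and the `## Research question` header are all assumptions, and the cap is treated as a character count (a later commit in this PR names it `_SEED_MAX_CHARS = 400_000`).

```python
from typing import Optional

# Assumed cap, expressed in characters.
SEED_MAX_CHARS = 400_000


def truncate_seed(text: str, cap: int = SEED_MAX_CHARS) -> str:
    """Cut an oversized seed and append a visible marker (wording is hypothetical)."""
    if len(text) <= cap:
        return text
    marker = (
        f"\n\n[seed truncated: showing first {cap:,} "
        f"of {len(text):,} characters]"
    )
    return text[:cap] + marker


def build_prompt(query: str, seed_text: Optional[str]) -> str:
    """Prepend seed content under a `## Workspace seed` header when a seed is given."""
    if seed_text is None:
        # No seed supplied: the prompt is just the query, unchanged.
        return query
    return (
        "## Workspace seed\n\n"
        f"{truncate_seed(seed_text)}\n\n"
        "## Research question\n\n"  # hypothetical header, for illustration only
        f"{query}"
    )
```

Keeping the marker inside the prompt text, rather than only logging it, is what lets both the model and a human reading the run see that content was cut.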
synapse-research CI clones only its own repo. The local-path `tool.uv.sources` (`../../products/nimblebrain/code/packages/bundle-sdk-py`) resolves on developer machines with the `hq` meta-repo cloned alongside, but breaks in CI where the parent path doesn't exist.

Switch to a git source pinned to a specific commit on the SDK PR branch (`nimblebrain@89cd28f`). Reproducible builds, works in CI without a meta-repo clone, and bumps explicitly when the SDK ships a new pre-1.0 revision.

Drop this block once `nimblebrain-bundle-sdk` is on PyPI; the version pin in `[project.dependencies]` resolves through PyPI from then on. Until then, contributors editing the SDK locally should `uv pip install -e <path>` against their checkout.
The previous version added `seed_uri` (host-resources-extension read) but left a real gap: hosts without the extension had nowhere to put seed content. The error message even told the agent to "pass content inline via `seed_data`", except `seed_data` didn't exist as a parameter. Fixing that.

## Changes

`start_research` now takes both `seed_uri` and `seed_data`, mutually exclusive:

- `seed_uri`: host reads via `ai.nimblebrain/host-resources`. Preferred when available, because the agent doesn't pay context budget on the file bytes.
- `seed_data`: raw text passed inline by the agent. Universal fallback that works on every host. Required for hosts without the extension.

When `seed_uri` is set on a host that doesn't advertise the extension, the tool now returns a `ValueError` whose message names both the missing capability AND the specific retry shape (`seed_data=<file contents>`). The previous error pointed at a non-existent parameter, so it was a Level-C signal that wasn't actionable.

Passing both `seed_uri` and `seed_data` is rejected as ambiguous rather than picking one; the agent probably confused the two paths.

## Tests

8 passing in tests/test_seed_uri.py (+3 new):

- `seed_data` inline reaches the worker prompt
- `seed_uri` + `seed_data` together → mutex error
- Capability-missing error tells the agent to retry with `seed_data` specifically (not just "pass content inline")

All 51 tests (43 existing + 8 in test_seed_uri.py) pass.
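The mutual-exclusion rule and the actionable capability error might look something like the sketch below. The function name, the `host_has_resources` flag, and the exact message wording are assumptions made for illustration; only the parameter names, the capability string, and the use of `ValueError` come from the description above.

```python
from typing import Optional

HOST_RESOURCES = "ai.nimblebrain/host-resources"


def choose_seed_source(
    seed_uri: Optional[str],
    seed_data: Optional[str],
    host_has_resources: bool,
) -> Optional[str]:
    """Return "uri", "data", or None; raise ValueError on ambiguous or unsupported input."""
    if seed_uri is not None and seed_data is not None:
        # Reject rather than silently picking one of the two paths.
        raise ValueError("seed_uri and seed_data are mutually exclusive; pass exactly one.")
    if seed_data is not None:
        return "data"  # inline text works on every host
    if seed_uri is None:
        return None  # backward-compatible no-op
    if not host_has_resources:
        # Name both the missing capability and the concrete retry shape.
        raise ValueError(
            f"Host does not advertise {HOST_RESOURCES}; read the file with "
            "your own tool and retry with seed_data=<file contents>."
        )
    return "uri"
```

Checking the both-set case first means the agent sees the ambiguity error even on hosts that lack the extension, which points at the actual mistake rather than the capability gap.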
The platform's host-manifest schema constrains `briefing.priority` to `["high", "medium", "low"]` (host-manifest.schema.json:153). The existing `"normal"` value was never valid against this enum. It worked previously because the platform didn't enforce the schema strictly. Recent installs fail with:

    Bundle "@nimblebraininc/synapse-research" has an invalid _meta["ai.nimblebrain/host"] block: ai.nimblebrain/host/briefing/priority: must be equal to one of the allowed values. Refusing to install.

Switching to `"medium"`: it's the middle bucket, matching the prior intent (default-ish priority for this app's briefing facets).

Unrelated to the Phase 2c work in this PR; folded in because it blocks local testing of any synapse-research install against current platform main.
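A local pre-flight check for this class of failure could be as small as the sketch below. It is only an illustration of the enum constraint quoted above; the function name and message are hypothetical, and the platform's real validation runs against its JSON schema rather than code like this.

```python
# Allowed values as quoted from the host-manifest schema enum.
ALLOWED_PRIORITIES = ("high", "medium", "low")


def check_briefing_priority(value: str) -> str:
    """Return the value unchanged if it is in the enum, else raise ValueError."""
    if value not in ALLOWED_PRIORITIES:
        raise ValueError(
            f"briefing.priority must be one of {list(ALLOWED_PRIORITIES)}, got {value!r}"
        )
    return value
```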
This was referenced May 22, 2026
PR NimbleBrainInc/nimblebrain#268 merged as 852cbdd. The previous git ref pointed at the SDK PR branch (89cd28f) which was deleted on merge — the ref still resolves via github.com's commit-keyed fetch, but tracking a deleted branch's tip is confusing. Bump to the squash-merge commit on main. Still git-sourced, not PyPI; the actual `bundle-sdk-py/v0.1.0` PyPI publish requires the one-time Trusted Publisher setup. Once that lands, drop the `[tool.uv.sources]` block entirely and let the version pin in `[project.dependencies]` resolve through PyPI.
nimblebrain-bundle-sdk v0.1.0 is now on PyPI: https://pypi.org/project/nimblebrain-bundle-sdk/0.1.0/

Drop the `[tool.uv.sources]` git-source override. The `>=0.1.0` pin in `[project.dependencies]` now resolves through PyPI, which means fresh clones (devs, CI runners) install the SDK without needing the `hq` meta-repo cloned alongside or a specific git commit fetched.

uv.lock updated to reflect the registry source. 51 tests still green.
…d 13 on #8)

Three substantive fixes from QA review:

## Truncation test (Critical #4)

`_SEED_MAX_CHARS = 400_000` was implemented without a test exercising the slicing path or the marker shape. Added `test_seed_truncation_emits_marker` that constructs a seed >cap with sentinel head/tail strings, asserts the head survives the cut, the tail is dropped, and the marker carries both the cap and the actual length in its formatted-number form. Future refactors that touch the cap or the marker text now fail loudly.

## Empty-text resource distinct from binary (Suggestion #1)

`if not text:` matched both `None` (binary) and `""` (empty text). A legitimately empty workspace file was reported as binary with a misleading "extract text upstream" recovery hint. Tightened to `if text is None:` for the binary branch and added a dedicated empty-text branch whose error tells the agent the file is actually empty (verify contents, or omit `seed_uri`). New test pins this distinction.

## Scheme probe before read (Suggestion #3)

`_resolve_seed_uri` now calls `h.supports_scheme("files")` after the availability check. The platform would otherwise return `-32602 Invalid params` for an unsupported scheme, which the agent sees as a generic wire error. Routing it through the same Level-C retry hint (pass `seed_data` inline) gives the agent an actionable recovery path. Not load-bearing today (the platform always supports `files://`), but cheap insurance and exercises the SDK's `supports_scheme()` surface.

## Tests

54 passing (43 existing + 11 in test_seed_uri.py, +3 new):

- test_seed_truncation_emits_marker
- test_seed_uri_empty_text_distinct_from_binary
- test_seed_uri_scheme_not_supported

All existing fixtures (`seeded_host`, `unavailable_host`, `binary_host`) grew a `supports_scheme()` method to match the new SDK probe. Lint + format clean on src/ and tests/.
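The ordering of checks described in this commit (availability, then scheme probe, then binary versus empty text) can be sketched with a stand-in host object. Everything here is an assumption for illustration: `StubHost` is not the SDK's type, the `available` and `text` attributes replace the real availability check and read call, and the error wording is invented. Only `supports_scheme("files")` and the `None` versus `""` distinction come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

RETRY_HINT = "retry with seed_data=<file contents>"


@dataclass
class StubHost:
    """Hypothetical stand-in for the SDK host handle used in tests."""

    available: bool = True
    schemes: tuple = ("files",)
    text: Optional[str] = "seed"  # None models a binary resource

    def supports_scheme(self, scheme: str) -> bool:
        return scheme in self.schemes


def resolve_seed(h: StubHost, uri: str) -> str:
    """Resolve a seed URI to text, raising ValueError with a recovery hint."""
    if not h.available:
        raise ValueError(f"host-resources capability missing; {RETRY_HINT}")
    if not h.supports_scheme("files"):
        # Probe first so the agent gets a hint instead of a generic wire error.
        raise ValueError(f"host cannot read files:// URIs; {RETRY_HINT}")
    text = h.text  # the real code would read `uri` through the SDK here
    if text is None:
        raise ValueError("resource is binary (no text field); extract text upstream")
    if text == "":
        raise ValueError("resource is empty; verify the file contents or omit seed_uri")
    return text
```

Using `text is None` rather than `not text` is the whole fix for the second item: a falsy check cannot tell a missing field from an empty string, so the two cases need separate branches.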
QA reviewer's worktree at `.claude/worktrees/feat-host-resources-sdk-adoption` got staged as an embedded repo in the previous commit because `.claude/` wasn't in `.gitignore`. Untrack the pointer + add the directory to gitignore so future QA worktrees don't reintroduce it.
Commit 9a99def corrupted .gitignore by appending `.claude/` via `echo >>` to a file whose last line had no trailing newline. The concatenation produced:

    .tasks/.claude/

which broke BOTH intended behaviours:

- `.tasks/foo` was no longer ignored (`/implement` scratch leaked)
- `.claude/worktrees/foo` was never ignored (QA worktree submodule pointers can still be re-staged, which is the bug 9a99def was meant to fix)

The only path that was newly ignored, `.tasks/.claude/`, doesn't exist in this repo.

Split into two distinct lines with proper newline terminators. Verified with `git check-ignore -v`:

    .gitignore:28:.tasks/ .tasks/foo
    .gitignore:31:.claude/ .claude/worktrees/foo
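One way to avoid this class of bug when scripting ignore-file edits is to repair a missing trailing newline before appending. The helper below is a hypothetical sketch (the name and approach are not from this repo), shown only to illustrate the failure mode described above.

```python
from pathlib import Path


def append_ignore_entry(path: Path, entry: str) -> None:
    """Append one pattern on its own line, adding a newline first if the file lacks one."""
    existing = path.read_text() if path.exists() else ""
    # A bare append would fuse `entry` onto an unterminated last line.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with path.open("a") as fh:
        fh.write(f"{prefix}{entry}\n")
```

With a file containing `.tasks/` and no trailing newline, calling `append_ignore_entry(path, ".claude/")` leaves two separate patterns instead of the fused `.tasks/.claude/`.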
mgoldsborough added a commit that referenced this pull request on May 23, 2026
Minor bump for the seed_uri + seed_data feature add merged in #8 (Phase 2c host-resources SDK adoption). Cuts the first synapse-research release that can anchor research on a workspace file. See the GitHub release notes on v0.3.0 for user-visible changes; this commit only touches manifest.json / pyproject.toml / __init__.py via `make bump`. Co-authored-by: Mathew Goldsborough <1759329+mgoldsborough@users.noreply.github.com>
Phase 2c of the host-resources roadmap. Phase 1 (nimblebrain#262) advertised the capability; Phase 2a (nimblebrain#263) wired the platform-side handlers; Phase 2b (nimblebrain#268, now `nimblebrain-bundle-sdk` v0.1.0 on PyPI) shipped the Python SDK. This PR is the first real adoption — closes the original production bug where synapse-research couldn't anchor on a workspace file the agent pointed it at.
## Behaviour change
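`start_research` grows two new optional parameters, mutually exclusive; together they cover every host/agent combination:

- `seed_uri`: the host reads the file via `ai.nimblebrain/host-resources`. Preferred when available, because the agent doesn't pay context budget on the file bytes.
- `seed_data`: raw text passed inline by the agent. Universal fallback that works on every host.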
Agents decide between the two by checking the host's advertised capabilities. When `seed_uri` is passed to a host that doesn't support host-resources (or doesn't support the `files://` scheme), the tool returns a structured error naming both the missing capability and the specific retry shape (`seed_data=`) — actionable for the agent without trial-and-error.
Passing both `seed_uri` and `seed_data` together is rejected as ambiguous rather than silently picking one.
Other behaviours:
## SDK source
Pins `nimblebrain-bundle-sdk>=0.1.0` and resolves it from PyPI (no `tool.uv.sources` override anymore). Earlier revisions of this PR sourced via git ref while #268 was unmerged; that's gone — clean PyPI resolution as of `f7ca866`.
No merge-ordering dependency remains — the SDK is canonical on PyPI, so this PR is independently mergeable.
## Tests
`uv run pytest tests/` → 54 passing (43 existing + 11 in `tests/test_seed_uri.py`).
Notable coverage in the new file:
Tests mock at the `host()` factory boundary because the Python `mcp.client.ClientSession` rejects custom-method server→client requests (its `ServerRequest` union is closed). Wire-shape validation lives in the SDK's own unit tests + the platform's TS-side handler tests, both of which already exist.
## Drive-by fix
Commit `b77f483` is a separate, narrowly-scoped fix for the `briefing.priority` manifest value (`"normal"` was never valid per the host schema enum `["high","medium","low"]`; blocked local install on current platform main). Kept as its own commit for review isolation — happy to split into a separate PR if reviewer prefers.
## Test plan