Update MCP instructions: auto-generated tool tables #4596

Open
TerrenceMcGuinness-NOAA wants to merge 1 commit into NOAA-EMC:develop from TerrenceMcGuinness-NOAA:update_eib-mcp-tool_instuctionsfile

@TerrenceMcGuinness-NOAA

Update MCP Tool Instructions for EIB Gateway

Summary

Updates .github/instructions/mcp.instructions.md to reflect the current EIB MCP server (v3.6.2, 48 tools) and align with the Docker MCP Gateway architecture. Removes the SDD Workflow tools (internal to the MCP-RAG platform, not relevant for global-workflow consumers), bringing the exposed tool count to 39 tools across 8 modules. Adds detailed Required/Optional argument columns to facilitate GraphRAG tool usage and removes inappropriate use cases (SDD session management).

Changes

| Change | Before | After |
|--------|--------|-------|
| `applyWhen` | `hasActiveMCPServer("eib-mcp-rag-full")` | `hasActiveMCPServer("eib-mcp-gateway")` |
| `excludeAgent: "code-review"` | Present | Removed: redundant with the `applyWhen` gate |
| Tool reference | Hand-written 2-column tables (48 tools) | Auto-generated 4-column tables from `generate-tool-docs.js` (39 tools) |
| SDD Workflows (9 tools) | Included | Removed: platform-internal, not for GW consumers |
| Tool format | `Tool \| Use For` | `Tool \| Required \| Optional \| Description` |
| Argument details | Missing: users had to guess required vs. optional params | Added: explicit Required/Optional columns for all 39 tools, sourced from runtime schemas |
| GraphRAG tools | Listed with only a "Use For" blurb | Enhanced: full argument signatures showing `symbol`, `query`, `code_or_symbol`, `from_symbol`, `file_path`, etc., with optional params like `depth`, `token_budget`, `include_community` |
| Regeneration | Manual | Comment: `cd mcp_server_node && node scripts/generate-tool-docs.js` |

Why Argument Details Matter

The previous format only listed tool names and a brief description:

| `get_code_context` | GGSR neighborhood + community summary |

An agent seeing this has to guess what arguments to pass. The new format makes it explicit:

| `get_code_context` | `symbol` | `depth`, `include_community`, `token_budget` | Get comprehensive context for a code symbol including graph neighborhood and community summaries |

This is especially important for the GraphRAG module (9 tools) where tools like trace_data_flow, get_change_impact, and find_similar_code have distinct required parameters (from_symbol vs. symbol vs. code_or_symbol) that an agent would otherwise conflate.
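
As an illustration of how a 4-column row can be derived from a runtime schema, here is a minimal sketch. It is not the actual `generate-tool-docs.js`; it assumes each tool is described by an MCP-style object with `name`, `description`, and a JSON Schema `inputSchema`. The `tool` object, the property types, and the helpers `formatParams` and `toTableRow` are hypothetical.

```javascript
// Hypothetical tool definition shaped like an MCP tool listing entry.
// Parameter names come from the row above; the types are assumptions.
const tool = {
  name: "get_code_context",
  description:
    "Get comprehensive context for a code symbol including graph neighborhood and community summaries",
  inputSchema: {
    type: "object",
    properties: {
      symbol: { type: "string" },
      depth: { type: "number" },
      include_community: { type: "boolean" },
      token_budget: { type: "number" },
    },
    required: ["symbol"],
  },
};

// Wrap each parameter name in backticks; show a dash when there are none.
function formatParams(names) {
  return names.length > 0 ? names.map((n) => `\`${n}\``).join(", ") : "-";
}

// Split schema properties into required and optional, then emit one table row.
function toTableRow(t) {
  const schema = t.inputSchema || {};
  const all = Object.keys(schema.properties || {});
  const required = schema.required || [];
  const optional = all.filter((n) => !required.includes(n));
  return `| \`${t.name}\` | ${formatParams(required)} | ${formatParams(optional)} | ${t.description} |`;
}

const row = toTableRow(tool);
console.log(row);
```

With this input the sketch reproduces the new-format row shown above. Reading `required` from the schema rather than from hand-written notes is what keeps the Required column from drifting out of sync with the server.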

Removed: SDD Workflow Tools

The 9 SDD tools (list_sdd_workflows, start_sdd_session, record_sdd_step, etc.) manage the Spec-Driven Development session lifecycle on the MCP-RAG platform itself. They are not appropriate for agents working on global-workflow code — an agent reviewing a forecast script should not be starting SDD sessions. These tools remain available to agents working directly on the MCP-RAG server repository via its own instructions file.

Context Window Performance Analysis

The following analysis estimates how much of the LLM context window these instruction files consume in various GitHub Copilot scenarios:

File Sizes

| File | Lines | Chars | Est. Tokens |
|------|-------|-------|-------------|
| `copilot-instructions.md` | 430 | 15,875 | ~3,000 |
| `instructions/mcp.instructions.md` | 118 | 8,463 | ~1,800 |
| Combined | 548 | 24,338 | ~4,800 |

Impact Across Copilot Scenarios

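| Scenario | Loads `copilot-instructions.md` | Loads `mcp.instructions.md` | Total Instruction Tokens | % of 128K Window |
|----------|---------------------------------|-----------------------------|--------------------------|------------------|
| PR Code Review (GitHub web) | Yes (~3K) | No (no MCP server) | ~3,000 | 2.3% |
| Coding Agent (assigned issues) | Yes (~3K) | Yes, if EIB gateway configured (~1.8K) | ~4,800 | 3.7% |
| VS Code Chat (local dev) | Yes (~3K) | Yes, when gateway connected (~1.8K) | ~4,800 | 3.7% |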

At 2.3–3.7% of context window, both files are well within the comfortable zone (general guidance: keep repo instructions under ~6%).
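
The percentages can be reproduced with a few lines of arithmetic. This is a sketch only: the token counts are the estimates from the File Sizes table, and reading "128K" as 128 × 1024 = 131,072 tokens is an assumption. The names `CONTEXT_WINDOW`, `percentOfWindow`, and `scenarios` are hypothetical.

```javascript
// Assumption: "128K" means 128 * 1024 tokens.
const CONTEXT_WINDOW = 128 * 1024;

// Estimated token counts from the File Sizes table.
const tokens = { copilotInstructions: 3000, mcpInstructions: 1800 };

// Share of the context window, as a percentage.
function percentOfWindow(tokenCount, windowSize = CONTEXT_WINDOW) {
  return (tokenCount / windowSize) * 100;
}

const scenarios = [
  { name: "PR Code Review (GitHub web)", total: tokens.copilotInstructions },
  {
    name: "Coding Agent / VS Code Chat with gateway",
    total: tokens.copilotInstructions + tokens.mcpInstructions,
  },
];

for (const s of scenarios) {
  console.log(`${s.name}: ~${s.total} tokens, ${percentOfWindow(s.total).toFixed(1)}%`);
}
```

Under that assumption the two scenarios come out at 2.3% and 3.7%, matching the table.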

Known Redundancy with MCP Tool Schema Auto-Injection

When an MCP server connects, the MCP protocol itself injects JSON schemas for every registered tool into the model's context (~130 tokens per tool × 39 tools ≈ ~5,000 tokens). Our instruction file's parameter tables partially overlap with these auto-injected schemas (~1,100 tokens of the 1,800 total).

Unique value provided by the instruction file (~700 tokens):

  • Module grouping (schemas arrive flat; our categories guide tool selection)
  • MCP-first policy (when to use MCP tools vs. read_file/grep)
  • EE2 set -eu false-positive warning (domain-specific knowledge)
  • RAG Knowledge Base Tiers (what's in the vector store)
  • Backend annotations (Neo4j, ChromaDB, Filesystem)

This redundancy is acceptable and intentional for now. The detailed parameter tables serve a dual purpose: human readability for developers reviewing the file, and reinforcement for the model. The argument details are particularly valuable for the GraphRAG tools where parameter naming is non-obvious (symbol vs. from_symbol vs. code_or_symbol). When we move to the .github/agents/ custom agent architecture with a persistent MCP gateway (replacing the current dev tunnel proof-of-concept), we can optimize by stripping the parameter tables down to a compact module-only listing, saving ~1,100 tokens.
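
For reference, the token budget behind this trade-off works out as follows. This is a back-of-the-envelope sketch using only the rough estimates quoted in this section; the constant names are hypothetical.

```javascript
// Rough estimates quoted in this section.
const TOOL_COUNT = 39;
const TOKENS_PER_SCHEMA = 130; // approximate auto-injected schema size per tool
const INSTRUCTION_TOKENS = 1800; // mcp.instructions.md
const OVERLAP_TOKENS = 1100; // parameter tables that duplicate the schemas

// Tokens the MCP protocol injects for tool schemas on connect.
const schemaTokens = TOOL_COUNT * TOKENS_PER_SCHEMA;

// Tokens of guidance that only the instruction file provides.
const uniqueTokens = INSTRUCTION_TOKENS - OVERLAP_TOKENS;

// Fraction of the instruction file that overlaps with the schemas.
const overlapShare = OVERLAP_TOKENS / INSTRUCTION_TOKENS;

console.log(`Auto-injected schemas: ~${schemaTokens} tokens`);
console.log(`Unique instruction content: ~${uniqueTokens} tokens`);
console.log(`Overlap share of instruction file: ${Math.round(overlapShare * 100)}%`);
```

Roughly three fifths of the instruction file overlaps with the schemas, which is the portion a compact module-only listing would reclaim.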

Future: Custom Agent Architecture

The planned next step is to create .github/agents/eib-rag.agent.md — a custom agent profile that:

  • Bundles the MCP server configuration directly in the agent's YAML frontmatter
  • Provides a focused prompt with only the unique guidance (~700 tokens)
  • Whitelists specific read-only EIB tools for autonomous use
  • Appears as a selectable agent alongside the default coding agent

This will require a stable, always-on MCP gateway URL (replacing the current dev tunnel), which is planned for the AWS Bedrock migration. The current instruction file format works well for both VS Code local development and as groundwork for that future agent configuration. The context window redundancy optimization is not germane at this time — it becomes relevant only when we configure the persistent gateway and the .github/agents/ custom agent, at which point the auto-injected schemas will make the parameter tables unnecessary.

Testing

  • Verified file loads correctly in VS Code when eib-mcp-gateway MCP server is active
  • Confirmed applyWhen gate prevents loading when no EIB MCP server is connected
  • Tool counts validated against node scripts/generate-tool-docs.js output (39 tools / 8 modules)
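
A count check like the last item could be scripted roughly as below. This sketch assumes a layout that the PR does not confirm: each module is introduced by a `### ` heading, and each tool is a table row whose first cell is a backticked tool name. The `sample` text, the module named "Example Module", the tool `example_tool`, the placeholder cells, and the function `countToolsAndModules` are all hypothetical.

```javascript
// Illustrative stand-in for the instructions file; rows are placeholders.
const sample = [
  "### GraphRAG",
  "",
  "| Tool | Required | Optional | Description |",
  "|------|----------|----------|-------------|",
  "| `trace_data_flow` | `from_symbol` | - | (description) |",
  "| `find_similar_code` | `code_or_symbol` | - | (description) |",
  "",
  "### Example Module",
  "",
  "| Tool | Required | Optional | Description |",
  "|------|----------|----------|-------------|",
  "| `example_tool` | - | - | (description) |",
].join("\n");

// Count module headings and tool rows in a markdown document.
function countToolsAndModules(markdown) {
  let modules = 0;
  let tools = 0;
  for (const line of markdown.split("\n")) {
    if (line.startsWith("### ")) {
      modules += 1;
    } else if (/^\|\s*`[^`]+`\s*\|/.test(line)) {
      // A tool row starts with a pipe followed by a backticked tool name,
      // so header and separator rows are skipped.
      tools += 1;
    }
  }
  return { tools, modules };
}

const counts = countToolsAndModules(sample);
console.log(counts);
```

For the real file, the same function could be applied to the file contents read from disk and the result compared against the expected 39 tools and 8 modules.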

…n, remove SDD tools

- Replace manual tool reference with auto-generated 39-tool tables from generate-tool-docs.js
- Change applyWhen from eib-mcp-rag-full to eib-mcp-gateway
- Remove excludeAgent qualifier
- Remove SDD Workflow tools (9 tools) — not relevant for global-workflow consumers
- Add Required/Optional/Description columns for all tool parameters
- Add regeneration comment for maintainability

Copilot AI left a comment

Pull request overview

This PR updates the MCP (Model Context Protocol) instructions file to reflect the current EIB MCP Gateway architecture and tools. The changes align the documentation with the v3.6.2 server implementation, improve tool discoverability by adding explicit parameter specifications, and remove platform-internal SDD Workflow tools that aren't relevant for global-workflow developers.

Changes:

  • Updated MCP server reference from eib-mcp-rag-full to eib-mcp-gateway in YAML frontmatter and removed redundant excludeAgent line
  • Enhanced tool documentation from 2-column format to 4-column format with explicit Required/Optional parameter columns for all 39 tools
  • Removed 9 SDD Workflow tools (platform-internal session management) and reduced from 48 tools/9 modules to 39 tools/8 modules
