diff --git a/.github/instructions/mcp.instructions.md b/.github/instructions/mcp.instructions.md
index 66a241a418..064fe59fde 100644
--- a/.github/instructions/mcp.instructions.md
+++ b/.github/instructions/mcp.instructions.md
@@ -1,104 +1,98 @@
 ---
-applyWhen: hasActiveMCPServer("eib-mcp-rag-full")
-excludeAgent: "code-review"
+applyWhen: hasActiveMCPServer("eib-mcp-gateway")
 ---
-_Note:_ The YAML front matter in this file uses the GitHub Copilot instructions schema; `excludeAgent: "code-review"` ensures these instructions are not applied to the Copilot Code Review agent. See the official schema documentation at https://aka.ms/github-copilot-instructions-schema.
+
 # EIB MCP/RAG Server — Tool Guide for Global Workflow (v7.21.0)
-This file loads **only** when the EIB MCP-RAG server is connected. It provides tool selection guidance for AI agents working on global-workflow with MCP + RAG capabilities (48 tools across 9 modules backed by Neo4j graph DB and ChromaDB vector store).
+This file loads **only** when the EIB MCP-RAG gateway is connected. It provides tool selection guidance for AI agents working on global-workflow with MCP + RAG capabilities (39 tools across 8 modules backed by Neo4j graph DB and ChromaDB vector store).
 
 ## MCP-First Policy
 **Prefer MCP tools over shell commands** for code analysis, documentation search, and compliance checking. Use `read_file`/`grep_search` only for exact line-level reads or literal string searches.
 
-## Tool Modules (48 tools / 9 modules)
+## Tool Modules (39 tools / 8 modules)
 
-### 1. Workflow Info (3 tools — Filesystem only)
-| Tool | Use For |
-|------|---------|
-| `get_workflow_structure` | System architecture overview |
-| `get_system_configs` | HPC platform configurations |
-| `describe_component` | Component documentation |
+### 1. Workflow Info (3 tools — Filesystem)
+
+| Tool | Required | Optional | Description |
+|------|----------|----------|-------------|
+| `get_workflow_structure` | — | `component`, `structure_data` | Get the structure and overview of the global workflow system |
+| `get_system_configs` | — | `platform`, `config_type`, `content` | Get system configuration information for different HPC platforms |
+| `describe_component` | `component` | `show_content`, `content`, `file_type` | Get basic description of a workflow component (file system only) |
 
 ### 2. Code Analysis (6 tools — Neo4j)
-| Tool | Use For |
-|------|---------|
-| `analyze_code_structure` | AST-level file/function analysis |
-| `find_dependencies` | Import graph (upstream + downstream) |
-| `trace_execution_path` | Call chain tracing |
-| `find_callers_callees` | Fan-in/fan-out with complexity scoring |
-| `trace_full_execution_chain` | End-to-end execution chain across files |
-| `find_env_dependencies` | Environment variable lineage |
+
+| Tool | Required | Optional | Description |
+|------|----------|----------|-------------|
+| `analyze_code_structure` | `file_path` | `include_dependencies`, `depth`, `token_budget` | Analyze code structure, relationships, and dependencies for a specific file |
+| `find_dependencies` | `target` | `direction`, `max_depth`, `token_budget` | Find all dependencies (imports) and dependents (importers) for a file or module |
+| `trace_execution_path` | `function_name` | `file_path`, `max_depth`, `include_callers`, `include_weights`, `token_budget` | Trace the execution path from a starting function through call chains |
+| `find_callers_callees` | `function_name` | `file_path`, `include_source`, `token_budget`, `cross_language` | Find all functions that call a target function and functions it calls |
+| `trace_full_execution_chain` | `start` | `direction`, `max_depth`, `languages` | Trace complete execution chain across Shell, Python, and Fortran language boundaries |
+| `find_env_dependencies` | `variable_name` | `show_exports`, `limit`, `token_budget` | Find all scripts that depend on or export a specific environment variable |
 
 ### 3. Semantic Search (6 tools — ChromaDB + Neo4j)
-| Tool | Use For |
-|------|---------|
-| `search_documentation` | Semantic search across ingested docs |
-| `find_related_files` | Vector similarity for related code/docs |
-| `explain_with_context` | RAG-powered explanations with citations |
-| `get_knowledge_base_status` | DB health and collection stats |
-| `list_ingested_urls` | Documentation sources ingested into RAG |
-| `get_ingested_urls_array` | Structured URL array for programmatic access |
+
+| Tool | Required | Optional | Description |
+|------|----------|----------|-------------|
+| `search_documentation` | `query` | `collection`, `max_results`, `include_graph`, `similarity_threshold` | Hybrid semantic + graph search across workflow documentation and code |
+| `find_related_files` | `file_path` | `max_results`, `include_documentation` | Find files with similar dependencies and import relationships |
+| `explain_with_context` | `topic` | `context_type`, `detail_level` | Provide comprehensive explanations using hybrid search |
+| `get_knowledge_base_status` | — | `include_graph`, `include_vector` | Get comprehensive knowledge base statistics |
+| `list_ingested_urls` | — | `format`, `source_filter` | List all URLs that have been ingested into the RAG knowledge base |
+| `get_ingested_urls_array` | — | `include_failed` | Get a structured array of all ingested URLs for programmatic access |
 
 ### 4. EE2 Compliance (5 tools — ChromaDB)
-| Tool | Use For |
-|------|---------|
-| `search_ee2_standards` | Search EE2 standards documentation |
-| `analyze_ee2_compliance` | Check file against NCO standards |
-| `generate_compliance_report` | Formatted compliance report |
-| `scan_repository_compliance` | Bulk repo scan (Phase 2 SME-corrected) |
-| `extract_code_for_analysis` | Extract code snippets for LLM analysis |
+
+| Tool | Required | Optional | Description |
+|------|----------|----------|-------------|
+| `search_ee2_standards` | `query` | `category`, `max_results`, `include_examples` | Search EE2 compliance standards and documentation |
+| `analyze_ee2_compliance` | `content` | `analysis_type`, `include_recommendations` | Analyze code or documentation for EE2 compliance |
+| `generate_compliance_report` | — | `scope`, `categories`, `format` | Generate comprehensive EE2 compliance report |
+| `scan_repository_compliance` | `name`, `content` | `files`, `path`, `repository_path`, `file_patterns`, `sample_size`, `categories` | Scan repository for EE2 compliance (Phase 2 SME-corrected) |
+| `extract_code_for_analysis` | `name`, `content` | `files`, `path`, `content_type`, `categories`, `file_pattern`, `max_files` | Extract code snippets from files for EE2 compliance analysis |
 
 **Note**: `set -eu` is NOT required (80% false positive). Uses `err_chk`/`err_exit` utilities.
 
 ### 5. Operational (4 tools — ChromaDB)
-| Tool | Use For |
-|------|---------|
-| `get_operational_guidance` | HPC platform-specific procedures |
-| `explain_workflow_component` | Graph-enriched component explanations |
-| `list_job_scripts` | Categorized job script inventory |
-| `get_job_details` | Detailed job script analysis |
-
-### 6. GraphRAG + Session State (9 tools — ChromaDB + Neo4j)
-| Tool | Required Param | Use For |
-|------|----------------|--------|
-| `get_code_context` | `symbol` | GGSR neighborhood + community summary |
-| `search_architecture` | `query` | Semantic search over community summaries |
-| `find_similar_code` | `code_or_symbol` | Vector similarity + graph enrichment |
-| `get_change_impact` | `symbol` | Blast radius with risk scoring |
-| `trace_data_flow` | `from_symbol` | Data flow across codebase |
-| `mark_as_modified` | `file_path` | Track file modifications in active session |
-| `get_session_context` | *(none)* | Aggregated view of session work |
-| `checkpoint_state` | `name` | Snapshot session state for recovery |
-| `restore_checkpoint` | `checkpoint_id` | Roll back to a named checkpoint |
+
+| Tool | Required | Optional | Description |
+|------|----------|----------|-------------|
+| `get_operational_guidance` | `operation` | `platform`, `urgency` | Get operational guidance and best practices for HPC operations |
+| `explain_workflow_component` | `component` | `detail_level` | Get detailed explanation of a workflow component with graph context |
+| `list_job_scripts` | — | `category`, `search`, `format`, `job_list`, `files`, `name`, `content` | List and categorize job scripts in the workflow |
+| `get_job_details` | `job_name` | `include_content`, `include_config`, `include_chromadb` | Get comprehensive details about a J-Job including inputs, outputs, dependencies |
+
+### 6. GraphRAG (9 tools — ChromaDB + Neo4j)
+
+| Tool | Required | Optional | Description |
+|------|----------|----------|-------------|
+| `get_code_context` | `symbol` | `depth`, `include_community`, `token_budget` | Get comprehensive context for a code symbol including graph neighborhood and community summaries |
+| `search_architecture` | `query` | `max_results` | Search the codebase architecture for high-level understanding via community summaries |
+| `find_similar_code` | `code_or_symbol` | `similarity_threshold`, `max_results` | Find code patterns semantically similar to a given symbol or description |
+| `get_change_impact` | `symbol` | `change_type`, `include_indirect` | Analyze the blast radius of changing a code symbol with risk scoring |
+| `trace_data_flow` | `from_symbol` | `to_symbol`, `max_depth` | Trace execution flow from a source symbol through the codebase |
+| `mark_as_modified` | `file_path` | `change_type`, `description` | Record a file modification in the active session |
+| `get_session_context` | — | `include_dirty` | Get aggregated view of the active session: examined symbols, file modifications |
+| `checkpoint_state` | `name` | `description` | Snapshot current session state to a checkpoint for recovery |
+| `restore_checkpoint` | `checkpoint_id` | — | Roll back session state to a previously created checkpoint |
 
 ### 7. GitHub Integration (4 tools — GitHub API)
-| Tool | Use For |
-|------|---------|
-| `search_issues` | Search NOAA-EMC GitHub issues |
-| `get_pull_requests` | PR information with diff context |
-| `analyze_workflow_dependencies` | Cross-repo dependency analysis |
-| `analyze_repository_structure` | Multi-repo structure comparison |
-
-### 8. SDD Workflows (9 tools — Filesystem)
-| Tool | Use For |
-|------|---------|
-| `list_sdd_workflows` | List all workflow phase specs |
-| `get_sdd_workflow` | Get specific phase details |
-| `start_sdd_session` | Start a tracked session |
-| `record_sdd_step` | Record step completion |
-| `get_sdd_session` | Resume active session |
-| `complete_sdd_session` | Complete and archive session |
-| `get_sdd_execution_history` | View execution history |
-| `validate_sdd_compliance` | Validate against SDD framework |
-| `get_sdd_framework_status` | Framework status and metrics |
-
-### 9. Utility (2 tools — Built-in)
-| Tool | Use For |
-|------|---------|
-| `get_server_info` | Server version, tool counts |
-| `mcp_health_check` | Full health validation |
+
+| Tool | Required | Optional | Description |
+|------|----------|----------|-------------|
+| `search_issues` | `query` | `repository`, `state`, `labels` | Search GitHub issues across workflow repositories |
+| `get_pull_requests` | — | `repository`, `state`, `limit` | Get pull request information and changes |
+| `analyze_workflow_dependencies` | `component` | `analysis_type`, `include_external` | Analyze dependencies and relationships between workflow components |
+| `analyze_repository_structure` | — | `repositories`, `analysis_depth` | Analyze structure and components across multiple repositories |
+
+### 8. Utility (2 tools — Built-in)
+
+| Tool | Required | Optional | Description |
+|------|----------|----------|-------------|
+| `get_server_info` | — | `include_capabilities` | Get information about the MCP server and available tools |
+| `mcp_health_check` | — | `detailed`, `deep`, `functional` | Check the health status of all MCP server components |
 
 ## When to Use MCP vs Direct Access