Skip to content

Conversation

@tgrunnagle
Copy link
Collaborator

@tgrunnagle tgrunnagle commented Dec 10, 2025

Add a new GitHub Actions workflow that uses mcp-tef CLI (mtef) to run query alignment tests against mcp-optimizer.

Changes

  • Add .github/workflows/mcp-tef-integration-tests.yml workflow that:
    • Triggers on PR creation and updates
    • Installs mcp-tef CLI via uv tool install from the mcp-tef git repository
    • Deploys time and fetch MCP servers via ToolHive
    • Deploys mcp-optimizer configured to use the backend servers
    • Deploys mcp-tef server for LLM-based tool alignment testing
    • Creates a test case with prompt "Tell me the current time in Tokyo" expecting find_tool to be called
    • Executes the test run and polls for completion
    • Uploads test results as workflow artifacts (30 day retention)
    • Reports test status with warnings (non-blocking)

Configuration

  • Uses continue-on-error: true so test failures warn but don't block PR merges
  • Requires OPENROUTER_API_KEY secret for LLM access
  • Uses anthropic/claude-sonnet-4.5 model via OpenRouter

@claude
Copy link

claude bot commented Dec 10, 2025

Review Summary

Overall this is a well-structured integration test workflow. The code is clear and follows good GitHub Actions practices. Below are specific items to address:

Issues to Fix

1. Security: Hardcoded Docker host IP (172.17.0.1)

  • Line 65: Uses 172.17.0.1 which is Docker's default bridge but not guaranteed
  • Better: Use host.docker.internal or Docker networks for cross-container communication
  • Inconsistency: Line 170 uses host.docker.internal, line 65 should too

2. Error handling: set +e too early

  • Line 165: Disables error exit for entire test execution block
  • Problem: Silent failures between commands won't be caught
  • Better: Use set +e only around specific commands that should not fail the step

3. Redundant operation

  • Lines 147-150: Retrieves mcp-optimizer URL twice (also done in step at line 146)
  • Remove duplication by using steps.get-url.outputs.MCP_OPTIMIZER_URL

4. Hardcoded timeout values lack context

  • Line 72: 180 second timeout for readiness
  • Line 97: 60 second timeout for ingestion
  • Line 160: 60 second health timeout
  • Line 238: 120 second test timeout
  • Consider making these configurable via env vars at workflow level

Minor Improvements

5. Python script should use project dependencies

  • Lines 102-126: Inline Python uses MCP SDK but doesn't declare it
  • Could fail if uv run doesn't have access to those packages
  • Consider: Pre-install dependencies or use a separate script file

6. Shell verbosity

  • Many echo statements for debugging (good for CI logs)
  • Could reduce some repetitive logging once stable

7. Magic number in log tails

  • Line 312, 316, 320, 323: Different tail values (100, 50)
  • Consider consistent value or make configurable

Best Practices Followed ✓

  • Pinned action versions with commit SHAs
  • Non-blocking tests with continue-on-error: true
  • Artifact uploads for debugging
  • Proper secrets handling
  • Clear step names and structure
  • Good timeout handling overall

Performance Note

Workflow appears appropriate for its purpose. The polling intervals (2s) and timeouts are reasonable for integration tests.

Verdict

Request changes - Address items 1-3 before merge. Items 4-7 are optional improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants