Skip to content

Conversation

@quanru
Copy link
Collaborator

@quanru quanru commented Nov 25, 2025

Summary

This PR refactors the MCP packages to use a simplified, consistent tool architecture across all platforms (Web, Android, iOS).

Key Changes

🏗️ Shared Infrastructure (@midscene/shared/mcp)

  • Created BaseMCPServer and BaseMidsceneTools base classes for code reuse
  • Implemented dynamic tool generation from agent.getActionSpace()
  • Simplified common tools to only take_screenshot and wait_for (removed assert)
  • Added unified tool generation pattern in tool-generator.ts

📦 Platform-Specific Packages

  • @midscene/mcp (Web): Simplified to single web_connect tool
  • @midscene/android-mcp (New): Android-specific MCP server with android_connect tool
  • @midscene/ios-mcp (New): iOS-specific MCP server with ios_connect tool

🎯 Simplified Tool Architecture

Each platform now provides exactly 3 types of tools:

  1. ActionSpace tools (dynamic): Generated from agent.getActionSpace()
    • Platform-specific interaction capabilities (tap, input, scroll, etc.)
  2. Common tools (2 tools):
    • take_screenshot: Capture current page/screen
    • wait_for: Wait until condition becomes true
  3. Platform connection (1 tool):
    • web_connect: Connect to web page
    • android_connect: Connect to Android device (optional deviceId and uri)
    • ios_connect: Connect to iOS device

🔧 Improvements

  • Naming Consistency: Unified {platform}_connect pattern across all platforms
  • Reduced Tool Bloat: Removed platform-specific helpers (list_devices, check_environment, etc.)
  • Visual Feedback: All connection tools return screenshots
  • Better Architecture: Shared base classes reduce code duplication

✅ Testing

  • Removed obsolete tool snapshot tests
  • Fixed tests to work with new dynamic tool architecture
  • All MCP package tests passing
  • Build and lint checks passing

Migration Notes

Tools have been consolidated:

  • open_newweb_connect
  • android_list_devices → Removed (use android_connect without deviceId)
  • android_launch → Merged into android_connect (use uri parameter)
  • ios_check_environment → Removed (just use ios_connect)
  • assert → Removed from common tools

Related Issues

Part of MCP architecture refactoring to support multiple platforms with consistent patterns.

🤖 Generated with Claude Code

@netlify
Copy link

netlify bot commented Nov 25, 2025

Deploy Preview for midscene ready!

Name Link
🔨 Latest commit e18c94b
🔍 Latest deploy log https://app.netlify.com/projects/midscene/deploys/6929819d61742f0008456457
😎 Deploy Preview https://deploy-preview-1507--midscene.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@quanru quanru requested a review from Copilot November 25, 2025 12:36
Copilot finished reviewing on behalf of quanru November 25, 2025 12:40
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the MCP (Model Context Protocol) server architecture to use a unified, dynamic tool generation pattern across all platforms (Web, Android, iOS), significantly reducing code duplication and improving maintainability.

Key Changes

  • Introduced shared base classes (BaseMCPServer and BaseMidsceneTools) in @midscene/shared/mcp to provide common infrastructure for all platform MCP servers
  • Implemented dynamic tool generation from agent.getActionSpace() instead of hardcoded tool definitions, making the architecture more flexible
  • Created platform-specific MCP packages: @midscene/android-mcp for Android automation and @midscene/ios-mcp for iOS automation, plus @midscene/web-mcp as an alias for the existing @midscene/mcp

Reviewed changes

Copilot reviewed 38 out of 39 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
packages/shared/src/mcp/base-server.ts New base MCP server class providing common launch() API and lifecycle management
packages/shared/src/mcp/base-tools.ts New base tools manager with dynamic tool initialization from action space
packages/shared/src/mcp/tool-generator.ts Core tool generation logic converting action space to MCP tool definitions
packages/shared/src/mcp/types.ts Type definitions for tool architecture
packages/shared/src/mcp/index.ts Export aggregation for shared MCP infrastructure
packages/shared/package.json Added MCP module export path and SDK dependency
packages/mcp/src/server.ts Refactored Web MCP server to extend BaseMCPServer
packages/mcp/src/web-tools.ts New Web-specific tools manager extending BaseMidsceneTools
packages/mcp/src/index.ts Simplified CLI entry to use new server architecture
packages/mcp/src/tools.ts Removed (replaced by dynamic tool generation)
packages/mcp/src/midscene.ts Removed (logic moved to base classes and web-tools)
packages/mcp/package.json Added exports field for server module
packages/mcp/tests/index.test.ts Removed obsolete tool snapshot tests
packages/android-mcp/* New Android MCP server package with android_connect tool
packages/ios-mcp/* New iOS MCP server package with ios_connect tool
packages/web-mcp/* New alias package re-exporting @midscene/mcp for naming consistency
pnpm-lock.yaml Updated dependencies for new packages
package.json Added new packages to test script
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (4)

packages/android-mcp/rslib.config.ts:1

  • Unused import path.
import path from 'node:path';

packages/ios-mcp/rslib.config.ts:1

  • Unused import path.
import path from 'node:path';

packages/ios-mcp/src/ios-tools.ts:5

  • Unused import z.
import { z } from 'zod';

packages/shared/src/mcp/tool-generator.ts:29

  • Unused variable result.
      const result = await agent.action(`Use the action "${action.name}"`, {

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@quanru quanru force-pushed the feat/mcp-simplify-tools branch from 9535f2f to 4613d05 Compare November 25, 2025 12:53
@quanru quanru requested a review from Copilot November 27, 2025 02:35
Copilot finished reviewing on behalf of quanru November 27, 2025 02:38
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 39 out of 40 changed files in this pull request and generated 7 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

quanru and others added 5 commits November 27, 2025 10:53
Refactored MCP packages to use a simplified, consistent tool
architecture across all platforms (Web, Android, iOS).

- Created base classes for MCP servers and tools
- Implemented dynamic tool generation from agent actionSpace
- Simplified common tools to only take_screenshot and wait_for
- Added unified tool generation pattern

- Web (@midscene/mcp): Simplified to single web_connect tool
- Android: Consolidated into android_connect with optional params
- iOS: Simplified to single ios_connect tool

Each platform now provides exactly 3 types of tools:
1. ActionSpace tools (dynamic): From agent.getActionSpace()
2. Common tools (2): take_screenshot, wait_for
3. Platform connection (1): {platform}_connect

- Unified naming pattern: {platform}_connect across all platforms
- Removed platform helpers (list_devices, check_environment, etc)
- All connection tools return screenshots for visual feedback

- Removed obsolete tool snapshot tests
- Fixed tests to work with new dynamic tool architecture
- All MCP package tests passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
refactor(ios): remove unused path import in rslib.config
refactor(shared): change agent.action to agent.aiAction in tool-generator
@quanru quanru force-pushed the feat/mcp-simplify-tools branch from ff57aac to 9cfd87e Compare November 27, 2025 02:53
@quanru quanru requested a review from Copilot November 28, 2025 07:37
Copilot finished reviewing on behalf of quanru November 28, 2025 07:37
… and Web tools; streamline temporary device creation
@quanru quanru force-pushed the feat/mcp-simplify-tools branch from bec2957 to 8f47ac4 Compare November 28, 2025 08:14
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@quanru quanru requested a review from Copilot November 28, 2025 11:04
Copilot finished reviewing on behalf of quanru November 28, 2025 11:08
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 41 out of 45 changed files in this pull request and generated 6 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)

packages/mcp/tests/index.test.ts:7

  • The test file imports deepMerge, getChromePathFromEnv, and getSystemChromePath from ../src/utils, but this file no longer exists in the packages/mcp package. These utilities have been moved to packages/web-mcp/src/utils.ts as part of the refactoring.

Since @midscene/mcp is now just a wrapper for @midscene/web-mcp, these tests should either be:

  1. Moved to the packages/web-mcp package where the actual implementation now lives, or
  2. Updated to import from @midscene/web-mcp if testing the wrapper functionality

The current import path will cause test failures.

import {
  deepMerge,
  getChromePathFromEnv,
  getSystemChromePath,
} from '../src/utils';

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +65 to +70
if (!this.puppeteerMode) {
if (!openNewTabWithUrl) {
throw new Error(
'Bridge mode requires a URL. Use web_connect tool to connect to a page first.',
);
}
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In bridge mode (when puppeteerMode is false), calling ensureAgent() without a URL parameter will throw an error: "Bridge mode requires a URL. Use web_connect tool to connect to a page first."

However, in packages/shared/src/mcp/base-tools.ts line 58, the initTools() method calls await this.ensureAgent() without any parameters during initialization. This means that when the server starts in bridge mode, initialization will fail.

The comment on line 64 mentions "agent creation will be deferred until first tool use", but the error is thrown immediately. Consider either:

  1. Returning null or a placeholder instead of throwing, to allow deferred initialization
  2. Updating the initTools() fallback logic to handle this case gracefully
  3. Always using the temporary device fallback for bridge mode during initialization

Copilot uses AI. Check for mistakes.
Comment on lines +2 to +3
import { version } from '../package.json';
import { IOSMidsceneTools } from './ios-tools.js';
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent version handling: This file imports version from package.json while web-mcp/src/server.ts uses __VERSION__ (defined via rslib config).

For consistency with web-mcp and to avoid potential TypeScript issues with JSON imports, consider using the __VERSION__ constant which is already defined in the rslib config (rslib.config.ts line 7).

Change:

import { version } from '../package.json';

to:

declare const __VERSION__: string;

And use __VERSION__ instead of version on line 13.

Copilot uses AI. Check for mistakes.
cb();
},
'@silvia-odwyer/photon',
'@silvia-odwyer/photon-node',
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing externals configuration compared to web-mcp. The ios-mcp rslib config doesn't externalize:

  1. @modelcontextprotocol/sdk
  2. /^@midscene\/.*/ (workspace dependencies)

These are externalized in web-mcp and should be consistent across all MCP packages. Not externalizing workspace dependencies can lead to:

  • Larger bundle sizes
  • Duplicate code in the output
  • Potential version conflicts

Add these to the externals array:

externals: [
  // ... existing externals ...
  /^@midscene\/.*/,
  '@modelcontextprotocol/sdk',
],
Suggested change
'@silvia-odwyer/photon-node',
'@silvia-odwyer/photon-node',
/^@midscene\/.*/,
'@modelcontextprotocol/sdk',

Copilot uses AI. Check for mistakes.
Comment on lines +28 to +29
"@midscene/android": "workspace:*",
"@midscene/core": "workspace:*",
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @midscene/android package is listed as a devDependency but is not imported or used anywhere in the web-mcp source code. This appears to be leftover from when the packages were combined.

Since web-mcp is specifically for web automation and doesn't need Android functionality, this dependency should be removed to keep the package lean.

Copilot uses AI. Check for mistakes.
Comment on lines +2 to +3
import { version } from '../package.json';
import { AndroidMidsceneTools } from './android-tools.js';
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent version handling across MCP packages:

  • web-mcp/src/server.ts uses __VERSION__ (global defined via rslib)
  • android-mcp/src/server.ts and ios-mcp/src/server.ts import version from package.json

All three rslib configs define __VERSION__, so they should all use it consistently. Using the __VERSION__ constant (like web-mcp does) is preferable because:

  1. It's consistently defined in all rslib configs
  2. It avoids TypeScript issues with JSON imports
  3. It follows the same pattern as web-mcp

Consider updating android-mcp and ios-mcp to use __VERSION__ like web-mcp does.

Copilot uses AI. Check for mistakes.
cb();
},
'@silvia-odwyer/photon',
'@silvia-odwyer/photon-node',
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing externals configuration compared to web-mcp. The android-mcp rslib config doesn't externalize:

  1. @modelcontextprotocol/sdk
  2. /^@midscene\/.*/ (workspace dependencies)

These are externalized in web-mcp (lines 28-29) and should be consistent across all MCP packages. Not externalizing workspace dependencies can lead to:

  • Larger bundle sizes
  • Duplicate code in the output
  • Potential version conflicts

Add these to the externals array:

externals: [
  // ... existing externals ...
  /^@midscene\/.*/,
  '@modelcontextprotocol/sdk',
],
Suggested change
'@silvia-odwyer/photon-node',
'@silvia-odwyer/photon-node',
/^@midscene\/.*/,
'@modelcontextprotocol/sdk',

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants