Provide high-level interface to Server #27

mattt opened this issue Mar 27, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

@mattt
Contributor

mattt commented Mar 27, 2025

The current implementation of Server focuses on a low-level interface. API consumers create a server instance with defined capabilities, call withMethodHandler and pass a closure that's called in response to requests of the specified type. This is in line with the TypeScript SDK's low-level interface.

It'd be nice to provide a simpler, high-level API for use cases that don't require this level of control. Looking again at the TS SDK, we could provide an alternative interface that lets API consumers specify the prompts, resources, and/or tools to serve, and lets the implementation take care of the rest (including automatically determining capabilities, #12).
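
As a rough illustration of "specify what to serve and let the implementation work out the capabilities", here is a minimal sketch. Every type in it (`ToolSpec`, `PromptSpec`, `ResourceSpec`, `Capabilities`, `HighLevelServer`) is a hypothetical stand-in invented for this example, not part of the SDK.

```swift
// Hypothetical stand-ins; none of these types come from the MCP Swift SDK.
struct ToolSpec {
    let name: String
    let description: String
}

struct PromptSpec {
    let name: String
}

struct ResourceSpec {
    let uri: String
}

// What the server would advertise during initialization.
struct Capabilities: Equatable {
    var tools = false
    var prompts = false
    var resources = false
}

struct HighLevelServer {
    let name: String
    let version: String
    var tools: [ToolSpec] = []
    var prompts: [PromptSpec] = []
    var resources: [ResourceSpec] = []

    // Capabilities are derived from what was registered, not declared by hand.
    var capabilities: Capabilities {
        Capabilities(
            tools: !tools.isEmpty,
            prompts: !prompts.isEmpty,
            resources: !resources.isEmpty
        )
    }
}

let sketchServer = HighLevelServer(
    name: "MyMCPServer",
    version: "1.0.0",
    tools: [ToolSpec(name: "cosine", description: "Computes a cosine")]
)
```

With only a tool registered, `sketchServer.capabilities` reports tools and nothing else, so the caller never has to keep a capabilities declaration in sync with what is actually served.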

@mattt
Contributor Author

mattt commented Mar 27, 2025

Thinking out loud: How should the high- and low-level APIs interact?

It would be surprising (bad) if the API consumer could inadvertently wipe out tools configured with the high-level API by calling withMethodHandler.

Also, setting tools with the high-level API impacts the method handlers for both tools/list and tools/call (and notifications/tools/list_changed for that matter), so overriding just the one would create a contradiction.

What if we totally isolated the high-level API from the low-level API? This would solve the split-brain problem, but I'm wary of APIs that don't provide an escape hatch. But is that YAGNI? Can we imagine a situation where someone might want to, for example, define a fixed set of tools with dynamic responses to resources? That sounds plausible. But then again, maybe that's the price of admission.

There might be a clever way to cut the knot, and solve the contradiction another way. For example:

  • An initializer that takes a Server instance and lets you eject from the high-level API to the low-level API
  • Exposing high-level API objects so that you can DIY method handlers
  • Layering high on low in a way that somehow makes it more difficult to hold wrong
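
To make the third option a bit more concrete, here is one possible shape for "layering high on low". It is a heavily simplified sketch: `LowLevelServer`, `setTools`, `HandlerError`, and the string-based request format are all hypothetical and stand in for the real `Server` and its typed methods. The idea is that the high-level call installs `tools/list` and `tools/call` together and marks them as reserved, so a later `withMethodHandler` call can't silently replace just one of them.

```swift
// Hypothetical sketch; LowLevelServer is a stand-in, not the SDK's Server type.
enum HandlerError: Error, Equatable {
    case methodOwnedByHighLevelAPI(String)
}

final class LowLevelServer {
    private var handlers: [String: (String) -> String] = [:]
    private var reserved: Set<String> = []

    // Low-level registration refuses to replace handlers owned by the high-level layer.
    func withMethodHandler(_ method: String, _ handler: @escaping (String) -> String) throws {
        if reserved.contains(method) {
            throw HandlerError.methodOwnedByHighLevelAPI(method)
        }
        handlers[method] = handler
    }

    // High-level registration installs tools/list and tools/call as a pair,
    // so the two can never describe different tool sets.
    func setTools(_ tools: [String: (String) -> String]) {
        let names = tools.keys.sorted().joined(separator: ",")
        handlers["tools/list"] = { _ in names }
        handlers["tools/call"] = { request in
            // Request format for this sketch only: "toolName:argument".
            let parts = request.split(separator: ":", maxSplits: 1).map(String.init)
            guard parts.count == 2, let tool = tools[parts[0]] else {
                return "error: unknown tool"
            }
            return tool(parts[1])
        }
        reserved.formUnion(["tools/list", "tools/call"])
    }

    func handle(_ method: String, _ request: String) -> String {
        handlers[method]?(request) ?? "error: method not found"
    }
}

let layered = LowLevelServer()
layered.setTools(["echo": { $0 }, "shout": { $0.uppercased() }])
```

Throwing on a reserved method is only one policy; an explicit "eject" call that releases the reservation would be a way to keep the escape hatch.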

@mattt mattt added the enhancement New feature or request label Mar 27, 2025
@sebsto

sebsto commented Apr 1, 2025

One of the challenges of doing so is providing a Swift-compliant set of types to expose the tools.
I did a very quick exercise this afternoon to wrap your swift-sdk in higher level constructs.

The wrapper is here https://gist.github.com/sebsto/9cdc1bfec3ab905c8cb037b167373f5f

The client code is below

import MCP

#if canImport(FoundationEssentials)
  import FoundationEssentials
#else
  import Foundation
#endif

let myCosineToolSchema =
  """
  {
      "type": "object",
      "properties": {
        "angle": {
          "description": "The angle to compute the cosine, expressed in degrees",
          "type": "number"
        }
      },
      "required": [
        "angle"
      ]
    }
  """

let myToolDescription =
  "This tool computes a cosine. This is a trigonometric function that relates the adjacent side of a right triangle to its hypotenuse. The angle value is expected in degrees."

let myCosineTool = ClosureMCPTool(
  name: "cosine",
  description: myToolDescription,
  inputSchema: myCosineToolSchema
) { (input: Double) async throws -> Double in

  // return the cosine of the angle received in degrees (converted to radians)
  return cos(input * .pi / 180.0)
}

try await withMCPServer(name: "MyMCPServer", version: "1.0.0", tool: myCosineTool) { params in

  // check if we received an "angle" parameter
  guard let value = params.arguments?["angle"] else {
    throw MCPError.missingParam("angle")
  }

  // extract the double value from the "angle" parameter
  // the MCP explorer sends a "number" which is interpreted as an int or a double depending on the presence of a decimal part
  var param: Double = 0.0
  switch value {
  case .double(let d):
    param = d
  case .int(let i):
    param = Double(i)
  default:
    throw MCPError.invalidParam("angle", "\(value)")
  }

  return param

}

What I like is that it allows exposing tools as closures.
But there are many aspects I don't like in this solution.

  • the representation of the schema is not type safe (I know we can use your Value struct, but I wanted to stay close to what developers use, especially if they already have an OpenAPI definition of their tool)
  • the developer must still convert the parameter arguments to Swift types. Ideally, the MCP low-level package should not
    be exposed to developers.
  • we need to be able to provide multiple tools in one server
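
For the second and third points, one direction is to let `Decodable` do the argument conversion and to key tools by name. The following is only a sketch under assumptions: `CosineInput` and `TypedTool` are hypothetical names, arguments are passed as raw JSON `Data` rather than the SDK's `Value`, and the schema problem from the first point is left untouched.

```swift
import Foundation

// Hypothetical typed input for the cosine tool.
struct CosineInput: Decodable {
    let angle: Double
}

// Hypothetical wrapper: the tool body only ever sees a typed Input value.
struct TypedTool {
    let name: String
    let call: (Data) throws -> String

    init<Input: Decodable>(name: String, _ body: @escaping (Input) throws -> String) {
        self.name = name
        self.call = { arguments in
            // JSONDecoder accepts both 60 and 60.0 for a Double property,
            // so no manual int/double switch is needed.
            try body(JSONDecoder().decode(Input.self, from: arguments))
        }
    }
}

let cosineTool = TypedTool(name: "cosine") { (input: CosineInput) in
    String(cos(input.angle * .pi / 180.0))
}

let sineTool = TypedTool(name: "sine") { (input: CosineInput) in
    String(sin(input.angle * .pi / 180.0))
}

// Several tools in one place, looked up by name.
let toolsByName = Dictionary(
    uniqueKeysWithValues: [cosineTool, sineTool].map { ($0.name, $0) }
)
```

A missing or mistyped `angle` surfaces as a `DecodingError` thrown from `call`, which a server layer could map to an invalid-parameters error.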

@mattt
Contributor Author

mattt commented Apr 15, 2025

@sebsto Thanks for taking the time to go through that exercise and share your experience. I agree that a big challenge is figuring out how to translate type-safe Swift function signatures to the JSON Schema used by MCP.

The way I'm thinking about this is that there are three distinct jobs to be done:

  1. Express an MCP tool's description / interface
  2. Implement the Swift function for that tool
  3. Manage the execution of the tool

I've seen a few different approaches to this in Swift, both for MCP and for tool-calling for Ollama and other inference providers. And the biggest problem I've seen with them is that they conflate 1 and 2 and don't account for 3.

For example, I can think of a couple of libraries that have a @Tool macro that lets you do something like this [1]:

/// Add two numbers together
/// - Parameter x: The first number
/// - Parameter y: The second number
@Tool
func add(x: Int, y: Int) -> Int {
    return x + y
}

Setting aside the friction introduced by Swift macros, this approach promises to eliminate all boilerplate. You write Swift code and it magically goes through MCP.

This works great for pure functions / toy examples like add. But this approach gets messy as soon as you introduce dependencies (API clients, databases, file I/O, etc.).

I'm still chewing on this part of the design problem, but I take inspiration from @pointfreeco's approach of using structures with closure properties instead of protocols for dependencies.
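
For readers unfamiliar with that pattern, here is a small sketch of a dependency modeled as a structure with closure properties and handed to a tool implementation. `WeatherClient`, its `stub`, and `makeForecastTool` are made-up names used only for illustration.

```swift
// Hypothetical dependency modeled as a struct of closures rather than a protocol.
struct WeatherClient {
    var temperature: (_ city: String) async throws -> Double
}

extension WeatherClient {
    // A stub that needs no network access, handy for tests and previews.
    static var stub: WeatherClient {
        WeatherClient(temperature: { _ in 21.5 })
    }
}

// The tool implementation receives its dependency explicitly,
// so it can be exercised without a live API client.
func makeForecastTool(client: WeatherClient) -> (String) async throws -> String {
    return { city in
        let degrees = try await client.temperature(city)
        return "\(city): \(degrees) degrees"
    }
}

let forecastTool = makeForecastTool(client: .stub)
```

Swapping `.stub` for a client backed by a real API changes nothing in the tool body, which is the property a bare `@Tool` function with hidden dependencies doesn't give you.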

If nothing else, I think that pattern would work well for tool execution. "Should I run this tool?" is a policy decision that can be informed by tool annotations (#47) and controlled by "human-in-the-loop". This would also be a natural place to put logic for how to route named tools to their respective implementations.

// Structure as dependency
struct Runner {
  // Parameters of a function type can't carry argument labels, hence `_ input`.
  var run: (_ tool: Tool, _ input: CallTool.Parameters) async -> CallTool.Result
}
}

// Example helper method
func createRunner(
  for tools: [Tool: (CallTool.Parameters) async -> CallTool.Result],
  approval: @escaping (CallTool.Parameters) async -> Bool
) -> Runner { /* ... */ }
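
One way the elided body might look, as a self-contained sketch. `Tool`, `ToolParameters`, and `ToolResult` below are simplified stand-ins for the SDK's `Tool`, `CallTool.Parameters`, and `CallTool.Result`, and the approval policy is an arbitrary example.

```swift
// Simplified stand-ins for the SDK types; assumptions for this sketch only.
struct Tool: Hashable {
    let name: String
}

struct ToolParameters {
    let name: String
    let arguments: [String: Double]
}

struct ToolResult: Equatable {
    let text: String
    let isError: Bool
}

// Structure as dependency
struct Runner {
    var run: (_ tool: Tool, _ input: ToolParameters) async -> ToolResult
}

func createRunner(
    for tools: [Tool: (ToolParameters) async -> ToolResult],
    approval: @escaping (ToolParameters) async -> Bool
) -> Runner {
    Runner { tool, input in
        // Route the named tool to its implementation.
        guard let implementation = tools[tool] else {
            return ToolResult(text: "Unknown tool: \(tool.name)", isError: true)
        }
        // Policy decision, for example a human-in-the-loop prompt.
        guard await approval(input) else {
            return ToolResult(text: "Call to \(tool.name) was not approved", isError: true)
        }
        return await implementation(input)
    }
}

let addTool = Tool(name: "add")
let implementations: [Tool: (ToolParameters) async -> ToolResult] = [
    addTool: { input in
        let sum = input.arguments.values.reduce(0, +)
        return ToolResult(text: "\(sum)", isError: false)
    }
]

let permissiveRunner = createRunner(for: implementations, approval: { _ in true })
let strictRunner = createRunner(for: implementations, approval: { _ in false })
```

Because the approval closure is captured by the returned `Runner`, it has to be `@escaping`; the tool closures stored in the dictionary are escaping already.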

Again, still figuring all this out.

Footnotes

  1. https://github.com/loopwork-ai/ollama-swift/pull/3
