docs: Add Code Mode design spec #794

Open
MQ37 wants to merge 2 commits into master from docs/code-mode-spec

Contributor

MQ37 commented May 5, 2026

closes #788

Context

Design spec for a new code MCP tool category: caller LLMs submit a TypeScript program that orchestrates Apify resources through typed bindings, executed in a workerd V8 isolate inside an Apify Standby Actor.

This addresses the round-trip cost of the current MCP tool surface — for tasks that need many sequential Actor / dataset operations, every intermediate result currently flows through the caller's LLM context.

Solution

  • New MCP tools: execute-code, get-code-mode-recipe.
  • New runtime: apify/code-runtime (Apify Standby Actor; workerd with the Worker Loader API spawning a fresh V8 isolate per request).
  • Binding surface: 23 methods, a simplified subset of apify-client that calls the Apify REST API directly via fetch() — keeps the runtime image small and presents an LLM-ergonomic API.
  • Existing MCP tools and discovery flow are unchanged.
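
To make the binding idea concrete, here is a minimal sketch of how a `fetch()`-based binding could prepare calls to the Apify REST API. The helper names (`prepareRunActor`, `prepareListItems`, `send`) and the `PreparedRequest` type are hypothetical and are not taken from the spec. The sketch assumes only the public REST endpoints `POST /v2/acts/{actorId}/runs` and `GET /v2/datasets/{datasetId}/items` with Bearer-token authentication.

```typescript
// Hypothetical sketch of a fetch()-based binding layer; none of these names come from the spec.
interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body?: string };
}

const API_BASE = "https://api.apify.com/v2";

// Build the request that starts an Actor run.
function prepareRunActor(actorId: string, input: unknown, token: string): PreparedRequest {
  // The REST API addresses "username/actor-name" as "username~actor-name".
  const id = encodeURIComponent(actorId.replace("/", "~"));
  return {
    url: `${API_BASE}/acts/${id}/runs`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${token}`, "Content-Type": "application/json" },
      body: JSON.stringify(input),
    },
  };
}

// Build the request that reads items from a dataset.
function prepareListItems(datasetId: string, limit: number, token: string): PreparedRequest {
  const url = new URL(`${API_BASE}/datasets/${encodeURIComponent(datasetId)}/items`);
  url.searchParams.set("limit", String(limit));
  return {
    url: url.toString(),
    init: { method: "GET", headers: { Authorization: `Bearer ${token}` } },
  };
}

// Thin transport: the only place that touches the network.
async function send<T>(req: PreparedRequest): Promise<T> {
  const res = await fetch(req.url, req.init);
  if (!res.ok) throw new Error(`Apify API ${res.status}: ${await res.text()}`);
  return (await res.json()) as T;
}
```

Keeping request construction separate from the transport is a design choice of this sketch, not of the spec: it lets the builders be checked without network access.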

Worth your attention

  • Design doc only. No code in this PR.
  • Architecture is PoC-validated in a separate repo (apify/actor-sub-agent-v8-standby): Worker Loader API works in self-hosted workerd with --experimental; ~700 ms warm latency; 208 MB image (debian-slim + workerd binary; no Node runtime, no Apify SDK).
  • Bindings deliberately drop apify-client — the npm dep is heavy and the API has rough edges that complicate LLM-generated code.
  • desiredRequestsPerActorRun: 1000 / maxRequestsPerActorRun: 10000 — single Standby instance handles all concurrent traffic; the platform should not spawn additional containers.

Follow-up

  • Implementation lands in separate PRs.
  • apify/code-runtime will be a separate repo.

github-actions bot added the t-ai label (Issues owned by the AI team.) on May 5, 2026
MQ37 requested a review from jirispilka on May 5, 2026, 16:48
@jirispilka
Collaborator

There is an issue with most MCP clients: they have a fixed timeout.
It is typically about 30 seconds, but it can also be 10 seconds. It depends.
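
One possible way to stay under a fixed client timeout, sketched here purely as an illustration: race the script execution against a time budget and hand back a run ID when the budget runs out, so the caller can fetch the outcome later. The `withDeadline` helper, the `Outcome` type, and the idea of returning a `runId` for later polling are assumptions of this sketch; the spec excerpts in this thread do not define such a mechanism.

```typescript
// Hypothetical sketch: respond within a time budget, or hand off a run ID.
type Outcome<T> =
  | { status: "done"; value: T }
  | { status: "pending"; runId: string };

function withDeadline<T>(work: Promise<T>, runId: string, budgetMs: number): Promise<Outcome<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Resolves with a "pending" marker once the budget is exhausted.
  const deadline = new Promise<Outcome<T>>((resolve) => {
    timer = setTimeout(() => resolve({ status: "pending", runId }), budgetMs);
  });

  // Wraps the real result so both branches share one return type.
  const finished = work.then((value): Outcome<T> => ({ status: "done", value }));

  // Whichever settles first wins; the timer is cleared in either case.
  return Promise.race([finished, deadline]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

In this sketch the underlying work keeps running after a `pending` response; only the reply to the client is bounded by the budget.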

@jirispilka
Collaborator

Let's discuss a new category, could be experimental, code-mode

Collaborator

jirispilka left a comment

I like it. I left a couple of comments.

Let's define naming with @vystrcild and @jancurn because they will have an opinion too :)

  • tool category: code (code is pretty vague, I would prefer experimental, or sandbox)
  • execute-code tool (run-script sounds better, also aligns with run-actor)
  • get-code-mode-recipe (looks like an anti-pattern)

I would add a get-script-result tool.

@MQ37 let's wait for other comments.
I like that we have this in a PR, in the md file. It is easier to comment on. But once we agree, I would turn it into a GitHub issue.

Comment thread res/code-mode.md

## MCP server changes

A new `code` tool category, opt-in only — not in `toolCategoriesEnabledByDefault`.

Collaborator

Let's decide on the tool category later, I guess Honza would love to chime in :)

Contributor Author

I like the code category, but let's wait for others.

Comment thread res/code-mode.md
| readOnlyHint | `false` |
| taskSupport | `optional` |

### Tool: `get-code-mode-recipe`

Collaborator

What do you mean by this tool? Code snippets?

Isn't it just an example?

I would not really create a new tool. It sounds like an anti-pattern. We were there with the help tool.

I would rather provide examples in the execute-code tool or expose examples as resources. That's the correct way.

Contributor Author

MQ37, May 15, 2026

Yes, the tool provides docs, examples, and code snippets so the agent knows how to write the code for specific use cases.

I am for including the examples in the resources, but I also strongly support keeping them in the tools as a backup, as not every client supports resources.

Also, I just noticed that when connecting to an MCP server using mcpc, it lists tools but not resources. I think we should also list resources, similarly to tools, when connecting/restarting. @jancurn


Tools (8):
* `search-actors (keywords?:str, limit?:int, offset?:int)` [read-only, idempotent]
* `fetch-actor-details (actor:str, output?:obj)` [read-only, idempotent]
* `call-actor (actor:str, input:obj, async?:bool, …)` [destructive, open-world, task:optional]
* `get-actor-run (runId:str)` [read-only, idempotent]
* `get-actor-output (datasetId:str, fields?:str, offset?:num, …)` [read-only, idempotent]
* `search-apify-docs (query:str, docSource?:enum, limit?:num, …)` [read-only, idempotent]
* `fetch-apify-docs (url:str)` [read-only, idempotent]
* `apify--rag-web-browser (query:str, maxResults?:int, outputFormats?:[enum])` [destructive, open-world, task:optional]

For full tool details and schema, run `mcpc @apify tools-list --full` or `mcpc @apify tools-get <name>`

# I would also add resources here
Resources (10):
- resource1
- resource2
- ...

Comment thread res/code-mode.md
| Field | Value |
|---|---|
| Input | `{ code: string }` |
| Output | `content[0].text` = stdout + stderr (capped at `TOOL_MAX_OUTPUT_CHARS = 50 000`); `structuredContent: { runId, exitCode, durationMs, stdoutBytes, stderrBytes }` |

Collaborator

I'm just thinking out loud here: what about splitting the output into

  • result
  • stdout
  • stderr

Would this help to debug code better?

Also, we should return runId and datasetId or keyValueStoreId if the code run has created some data. If the data will be big you might need to off-load the data somewhere.

Contributor Author

What would be provided in the result? UNIX programs usually have stdin, stdout, and stderr. Would result be some kind of custom construct? What for?

> Also, we should return runId and datasetId or keyValueStoreId if the code run has created some data. If the data will be big you might need to off-load the data somewhere.

Why would we do that? I would leave this part to the agent: the agent has to decide whether it wants or needs the runId or datasetId. The agent is free to write almost any code and print (return) whatever it wants, including the runId, datasetId, or keyValueStoreId.
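
For reference, the output shape quoted at the top of this thread can be sketched as below. The field names in `structuredContent` and the `TOOL_MAX_OUTPUT_CHARS` cap come from the quoted spec table; the `ExecutionRecord` input type, the `toToolResult` helper, the head-only truncation, and counting bytes before truncation are assumptions made for illustration.

```typescript
// Cap taken from the quoted spec table.
const TOOL_MAX_OUTPUT_CHARS = 50000;

// Hypothetical record of a finished script execution (not defined in the spec).
interface ExecutionRecord {
  runId: string;
  exitCode: number;
  durationMs: number;
  stdout: string;
  stderr: string;
}

interface ToolResult {
  content: [{ type: "text"; text: string }];
  structuredContent: {
    runId: string;
    exitCode: number;
    durationMs: number;
    stdoutBytes: number;
    stderrBytes: number;
  };
}

function toToolResult(rec: ExecutionRecord): ToolResult {
  const encoder = new TextEncoder();
  // Assumption: stdout and stderr are concatenated, then the head is kept up to the cap.
  const combined = rec.stdout + rec.stderr;
  return {
    content: [{ type: "text", text: combined.slice(0, TOOL_MAX_OUTPUT_CHARS) }],
    structuredContent: {
      runId: rec.runId,
      exitCode: rec.exitCode,
      durationMs: rec.durationMs,
      // Assumption: byte counts describe the full, untruncated streams (UTF-8).
      stdoutBytes: encoder.encode(rec.stdout).length,
      stderrBytes: encoder.encode(rec.stderr).length,
    },
  };
}
```

Under this shape, anything the script prints (including a runId or datasetId it chose to log) reaches the caller through the text content, while the structured fields only describe the execution itself.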

Comment thread res/code-mode.md

Topics: `running-actors`, `datasets`, `key-value-stores`.

## Runtime Actor (`apify/code-runtime`, new repo)

Collaborator

Let's decide on a name for this Actor, JC rule #1 :)

Contributor Author

yes, I have no strong opinion and I am not planning to burn calories on naming 😁

Comment thread res/code-mode.md
- Two-stage Docker build:
  - Stage 1 (`node:24-bookworm-slim`): install `[email protected]` via npm to extract the binary.
  - Stage 2 (`debian:bookworm-slim`): minimal runtime; only the workerd binary + `ca-certificates`.
- workerd is statically linked except for libc + libm (verified via `ldd`); no Node.js, no Apify SDK, no apify-client.

Collaborator

I would consider adding apify-cli, as you would get access to the whole API.

Why not use apify-client?

Contributor Author

MQ37, May 15, 2026

apify-client is sometimes too complicated and too large. Let's keep it simple and maintain our own more efficient, agent-friendly version. We can then learn from this and improve apify-client.

Comment thread res/code-mode.md
- ENTRYPOINT: `workerd serve --experimental /app/worker/config.capnp`.
- Image: 208 MB (validated; 410 MB with the Node base — the Node runtime is ~150 MB and unnecessary).

### Standby configuration

Collaborator

The Actor must run with limited permissions!

Contributor Author

yes

Comment thread res/code-mode.md

## Bindings exposed to the script

The `apify` global is a simplified subset of `apify-client` that calls the Apify REST API directly via `fetch()`. This avoids bundling the apify-client npm dependency (which adds image weight and a transitive tree) and presents an LLM-ergonomic surface.

Collaborator

There must be a really strong reason not to use apify-client.

Let's not re-implement apify-client all over again.

It is expensive over time. apify-client is small; the real cost of skipping it is the maintenance treadmill of keeping up with the REST API.

Contributor Author

> Let's not re-implement apify-client all over again.

apify-client is not written for AI agents. I think we should experiment more and try to build a simpler interface for agents. We will not be re-implementing apify-client: it is too complex and contains a lot of stuff we do not actually need. We would implement only a tiny, simplified subset.

> It is expensive over time

I think the flexibility is worth it. I don't know what the "maintenance treadmill" is, but the REST API is usually kept backwards compatible, and this way we have greater control and more room to experiment and do things more efficiently (sometimes apify-client does not even expose all the fields of an endpoint, and you need some hacky, weird assertions).

There is also a technical issue that I hit when I tried running apify-client in workerd (keep in mind that workerd is not equivalent to the Node.js runtime). Here is the report from my clanker. I would not burn time trying to solve this if we can just keep it simple and use the REST API:

> apify-client's transitive deps (`ow`, `@apify/utilities`, `@apify/log`) are CJS modules that do bare `require("fs")` / `require("crypto")` / `require("stream")`, and when esbuild bundles them as ESM it rewrites those calls through its `__require` shim, which throws `Dynamic require of "fs" is not supported` at module init because workerd ESM modules don't expose `require` as a global. `nodejs_compat` doesn't save this: it polyfills the `node:fs` specifier, not the bare `fs` one the bundled CJS shim asks for, so making apify-client run would require a vendored, patched fork of it plus its transitive deps, i.e. exactly the "re-implementing apify-client" the reviewer wanted to avoid.

Comment thread res/code-mode.md

## Summary

A new `code` tool category in mcp.apify.com lets the caller LLM submit a TypeScript program that orchestrates Apify resources through typed bindings. The program runs in a `workerd` V8 isolate hosted in a new Apify Actor (`apify/code-runtime`) running in Standby mode with the Worker Loader API. Two new MCP tools: `execute-code` and `get-code-mode-recipe`. Existing tools unchanged. Discovery (`search-actors`, `fetch-actor-details`, `search-apify-docs`) stays on existing MCP tools.

Collaborator

Let's decide on the code category

And also on get-code-mode-recipe

@vystrcild
Collaborator

I agree with Jirka: don't introduce a new category. We already have Experimental features, so use that.

About execute-code: I prefer code over script. It's commonly used in this niche (Claude Code, code generation, Code Mode), and it also implies shell/bash. Maybe run-code? On the other hand, I agree that it's kinda poetic to call it a script, since that has meaning for Actors and acting.

get-code-mode-recipe: this is for fetching the right code snippet for the proper action, right? Just curious: isn't it possible to store it as resources? Or is that nonsense?

@MQ37
Contributor Author

MQ37 commented May 15, 2026

@jirispilka could you elaborate more on the get-script-result? How would that work?

Agree, let's make the final version an issue.

@MQ37
Contributor Author

MQ37 commented May 15, 2026

@vystrcild I also like the poetic meaning of script and how it connects with Actors, great point!

> get-code-mode-recipe: this is for fetching the right code snippet for the proper action, right? Just curious: isn't it possible to store it as resources? Or is that nonsense?

We can also store it in resources, but I would keep it in a tool as well, since some clients may not support resources well.

@jancurn
Member

jancurn commented May 15, 2026

Hey guys, is there some design doc/spec for this?

@jirispilka
Collaborator

> Hey guys, is there some design doc/spec for this?

@jancurn the design is in this PR; see the changes: https://github.com/apify/apify-mcp-server/pull/794/changes

Development

Successfully merging this pull request may close these issues.

Sub-agent specification

5 participants