act-receipts/: scraper-mcp.act_receipt.v1 schema + Python/JS reference impls #58

Merged
g-shevchenko merged 1 commit into main from feature/act-receipts-v1 on May 25, 2026

Conversation

@g-shevchenko

Owner

Summary

Companion code for the part-2 article on cache-friendly action receipts:
👉 https://gregshevchenko.com/research/mcp-stack-token-economy-part-2/

A cache-friendly action receipt is what a browser MCP returns after performing a user-shaped action (click, type, submit, scroll), instead of the full DOM. The receipt's byte representation stays stable across calls when the page region the agent acted on didn't change — so Anthropic's 5-minute prompt cache can hit on the next agent turn.
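
The stability property described above can be sketched as follows. This is a minimal illustration, not the code in python/act_receipts.py: the receipt fields (`action`, `target`, `region_text`) and the function names are hypothetical, and the canonicalization rule shown here (sorted keys, compact separators, UTF-8) is an assumption rather than the rule the v1 schema specifies.

```python
import hashlib
import json


def canonical_bytes(receipt: dict) -> bytes:
    """Serialize deterministically: sorted keys, no extra whitespace, UTF-8."""
    text = json.dumps(
        receipt,
        sort_keys=True,          # key order must not depend on insertion order
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,      # emit non-ASCII as UTF-8 instead of \u escapes
        allow_nan=False,         # NaN/Infinity are not valid JSON
    )
    return text.encode("utf-8")


def receipt_hash(receipt: dict) -> str:
    """SHA-256 hex digest of the canonical bytes."""
    return hashlib.sha256(canonical_bytes(receipt)).hexdigest()


# Hypothetical receipts: same content in a different key order, then a changed region.
first = {"action": "click", "target": "#submit", "region_text": "Saved"}
second = {"region_text": "Saved", "target": "#submit", "action": "click"}
changed = {"action": "click", "target": "#submit", "region_text": "Error"}

assert receipt_hash(first) == receipt_hash(second)   # unchanged region, identical bytes
assert receipt_hash(first) != receipt_hash(changed)  # changed region, different bytes
```

If the bytes are identical from one turn to the next, the prompt prefix that contains the receipt is identical too, which is the precondition for a cache hit.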

What ships in act-receipts/

| Path | Purpose |
| --- | --- |
| schema/act_receipt_v1.json | JSON Schema (Draft-07) |
| schema/act_receipt_v1.md | Human-readable spec |
| python/act_receipts.py | Reference impl: validate / canonicalize / hash / score |
| python/artifact_guard.py | detect_selector_miss_artifact(): useful for any CSS-selector LLM eval |
| js/canonical_bytes.mjs | JS port, byte-equivalent with the Python impl |
| python/tests/ + js/canonical_bytes.test.mjs | Cross-runtime equivalence (Python generated the goldens; JS asserts the same SHA-256) |
| scenarios/ | Three A/B scenarios from the article (iana / /test/jitter / HN) |
| examples/jitter_endpoint.py | Minimal /test/jitter to reproduce AB2 locally |
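
For readers who want a feel for what a controlled-noise endpoint looks like before opening examples/jitter_endpoint.py, here is a rough standard-library sketch. It is not the shipped file: the page markup, the names `render_page`, `JitterHandler`, and `serve`, and the port are all assumptions. The idea is a page whose acted-on region is constant while unrelated markup changes on every request.

```python
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stable region: the part of the page an agent would act on.
STABLE_REGION = (
    '<main id="content"><h1>Order form</h1>'
    '<button id="submit">Submit</button></main>'
)


def render_page(rng: random.Random) -> str:
    """Return a page whose <main> is constant while a banner varies per call."""
    nonce = rng.getrandbits(32)
    banner = f'<div class="ad-{nonce:08x}" data-ts="{rng.random():.6f}">promo</div>'
    return f"<!doctype html><html><body>{banner}{STABLE_REGION}</body></html>"


class JitterHandler(BaseHTTPRequestHandler):
    rng = random.Random()

    def do_GET(self) -> None:
        if self.path != "/test/jitter":
            self.send_error(404)
            return
        body = render_page(self.rng).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Blocking helper; call serve() manually to run the endpoint locally."""
    HTTPServer(("127.0.0.1", port), JitterHandler).serve_forever()
```

Calling `serve()` and fetching `http://127.0.0.1:8000/test/jitter` twice should return two different full-page bodies that share the same `<main>` block, which is the situation a region-scoped receipt is meant to handle.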

Test status

| Suite | Result |
| --- | --- |
| python -m unittest discover -s tests | 23/23 PASS (5 test classes) |
| node --test canonical_bytes.test.mjs | 7/7 PASS (Node 20+) |
| Cross-runtime byte equivalence | VERIFIED: same canonical bytes + same SHA-256 in both runtimes on the shared fixture file |

Article TL;DR

Three honest measurements on a real browser-MCP:

  • AB1 (iana.org, static): 0pp delta. Generic strip is already perfect on static targets — invariant proven across JS + Python + Camoufox runtimes.
  • AB2 (/test/jitter, controlled noise): +80pp delta. Mechanism works on documented DOM jitter.
  • AB3 (Hacker News, real prod, N=20): mixed — unique_hashes 2 vs 3 (+5pp by count), modal-fraction -25pp (metric artifact at bimodal N=20), and wall_ms +3017ms mean (+37%) / +7.7s p95 — real overhead.
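
The AB3 bullet reports two stability metrics that can move in opposite directions. The sketch below shows how that can happen, under assumed definitions: `unique_hashes` as the number of distinct receipt hashes over N runs, and modal fraction as the share of runs that produced the most common hash. The article's exact definitions may differ, and the hash lists are synthetic, not the measured data.

```python
from collections import Counter


def stability_metrics(hashes: list[str]) -> dict:
    """Summarize how stable a sequence of per-run hashes is."""
    if not hashes:
        raise ValueError("need at least one run")
    counts = Counter(hashes)
    modal_count = counts.most_common(1)[0][1]
    return {
        "n": len(hashes),
        "unique_hashes": len(counts),               # fewer is more stable
        "modal_fraction": modal_count / len(hashes),  # higher is more stable
    }


# Synthetic runs (N=10 each), for illustration only.
baseline = ["a"] * 8 + ["b"] + ["c"]   # three distinct hashes, one dominant
receipts = ["x"] * 5 + ["y"] * 5       # two distinct hashes, evenly split

print(stability_metrics(baseline))
print(stability_metrics(receipts))
```

In this synthetic case the second arm has fewer unique hashes (2 vs 3) yet a lower modal fraction (0.5 vs 0.8), because an even two-way split penalizes the modal metric. That is the kind of bimodal small-N effect the bullet calls a metric artifact.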

Receipts stay OFF by default in our own production stack. The receipt is an opt-in tool for sites with documented DOM noise.

Bonus: an artifact-postmortem where a first reading of +77.8pp turned out to be a CSS selector miss caught by deterministic Python repro. There's now a permanent detect_selector_miss_artifact() guard — useful to anyone running LLM evals against CSS-selectored regions.
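
One plausible shape for such a guard is sketched below. It is not the shipped detect_selector_miss_artifact(), whose signature this PR text does not show; the names `looks_like_selector_miss` and `check_arms` are hypothetical. The assumption is that a selector miss shows up as an arm whose extracted regions are all empty, which makes its hashes trivially identical and inflates any stability delta.

```python
def looks_like_selector_miss(region_texts: list[str], min_chars: int = 1) -> bool:
    """True when every extracted region is effectively empty.

    An arm with no runs at all is also flagged, since nothing was measured.
    """
    if not region_texts:
        return True
    return all(len(text.strip()) < min_chars for text in region_texts)


def check_arms(arms: dict[str, list[str]]) -> list[str]:
    """Return the names of A/B arms whose regions look like selector misses."""
    return [name for name, texts in arms.items() if looks_like_selector_miss(texts)]


# Synthetic example: the treatment arm's selector matched nothing.
arms = {
    "control": ["Top story: 120 points", "Top story: 121 points"],
    "treatment": ["", "   "],
}
suspect = check_arms(arms)
if suspect:
    print(f"refusing to report a delta; suspect arms: {suspect}")
```

Running a check like this before computing any delta turns a silent measurement error into an explicit failure.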

Credits

  • LakshmanTurlapati/FSB (BSL 1.1) — architectural inspiration. Adopted the shape, not the code; this is an independent clean implementation.
  • u/pquattro on r/ClaudeAI — feedback on part-1 cache-friendliness framing that pushed us to measure the browser-MCP layer too.

License

MIT (matches repo root).

Why here, not a separate repo

Continuation of the two-axis framework from part 1, which lives in this repo's README + benchmark/. One canonical destination is better than two for discovery.

Status

Reference implementation, not a published package. Copy what you need; the article documents the boundary of applicability.

… impls

Companion code for the part-2 article on cache-friendly action receipts:
https://gregshevchenko.com/research/mcp-stack-token-economy-part-2/

A cache-friendly action receipt is what a browser MCP returns after performing
a user-shaped action (click, type, submit, scroll), instead of the full DOM.
The receipt's byte representation stays stable across calls when the page
region the agent acted on didn't change — so Anthropic's 5-minute prompt cache
can hit on the next agent turn.

Contents:
- schema/act_receipt_v1.json — JSON Schema (Draft-07)
- schema/act_receipt_v1.md — human-readable spec
- python/act_receipts.py — Python reference: validate / canonicalize / hash / score
- python/artifact_guard.py — detect_selector_miss_artifact() guard pattern
- js/canonical_bytes.mjs — JS port (byte-equivalent with Python)
- python/tests/ + js/canonical_bytes.test.mjs — cross-runtime equivalence proven
  on 4 fixtures (Python generated the goldens; JS asserts the same SHA-256)
- scenarios/ — the 3 A/B scenarios from the article (iana / jitter / HN)
- examples/jitter_endpoint.py — minimal /test/jitter to reproduce AB2 locally

Test status:
- python: 23/23 PASS (5 test suites)
- js: 7/7 PASS (Node 20+ --test runner)
- Cross-runtime byte equivalence VERIFIED — same canonical bytes + same SHA-256
  in both runtimes on the shared fixture file

License: MIT (matches repo root)

Credits:
- LakshmanTurlapati/FSB (BSL 1.1) — architectural inspiration; adopted the
  shape, not the code, so this is an independent clean implementation
- u/pquattro on r/ClaudeAI — feedback on part-1 cache-friendliness framing
  that pushed us to measure the browser-MCP layer too

Status: reference implementation, NOT a published npm/PyPI package. Copy what
you need; the article documents the boundary of applicability. Default-on
stays OFF in our own production stack — wall-time cost on real-prod targets
is too steep for a global default.
@g-shevchenko merged commit a42e509 into main on May 25, 2026
6 checks passed
@g-shevchenko deleted the feature/act-receipts-v1 branch on May 25, 2026 at 20:46