feat: add deterministic Markdown endpoint for tabs #3278

Closed
linouxis9 wants to merge 5 commits into jo-inc:master from linouxis9:feat/markdown-endpoint

@linouxis9 linouxis9 commented May 16, 2026

Summary

Adds GET /tabs/:tabId/markdown for deterministic Markdown rendering of the current tab, backed by Playwright accessibility snapshots.

  • Adds view=document (default): readable Markdown without refs/control noise.
  • Adds view=agent: agent-friendly Markdown preserving actionable refs, controls, states, and values.
  • Keeps the existing GET /tabs/:tabId/snapshot behavior intact for backward compatibility.
  • Adds Markdown-specific offset pagination and per-view cache entries.
  • Uses Playwright AI aria snapshots where available for ref-capable snapshots.
  • Raises playwright-core to ^1.59.1 for page.ariaSnapshot({ mode: 'ai' }) support.
  • Updates OpenAPI, README docs, and test scripts so unit/non-e2e and e2e suites run through the correct Jest configs.

Why

Raw accessibility snapshots are useful for automation but noisy for reading/extraction. This endpoint gives callers a stable Markdown surface without forcing consumers to post-process snapshot YAML themselves, while view=agent preserves actionability for future agent workflows.

API

GET /tabs/:tabId/markdown?userId=<id>&view=document&offset=0
GET /tabs/:tabId/markdown?userId=<id>&view=agent&offset=0
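
A caller could walk every page of either view by following the hasMore and nextOffset fields of each response. The helper below is a minimal sketch, not part of this PR: fetchAllMarkdown, its option names, and the injectable fetchImpl are hypothetical, and it concatenates pages as-is, so any truncation markers the server adds would remain in the result.

```javascript
// Hypothetical client helper (not part of the PR): collects all Markdown pages for a tab.
// `fetchImpl` is injectable so the loop can be exercised without a running server.
async function fetchAllMarkdown({ baseUrl, tabId, userId, view = 'document', fetchImpl = fetch }) {
  let offset = 0;
  let markdown = '';
  for (;;) {
    const query = new URLSearchParams({ userId, view, offset: String(offset) });
    const url = `${baseUrl}/tabs/${encodeURIComponent(tabId)}/markdown?${query}`;
    const res = await fetchImpl(url);
    if (!res.ok) throw new Error(`markdown request failed: ${res.status}`);
    const page = await res.json();
    markdown += page.markdown;
    // Stop once the server reports that no further window exists.
    if (!page.hasMore || page.nextOffset == null) return markdown;
    // Guard against a non-advancing offset so the loop cannot spin forever.
    if (page.nextOffset <= offset) throw new Error('nextOffset did not advance');
    offset = page.nextOffset;
  }
}
```

Usage would look like `await fetchAllMarkdown({ baseUrl, tabId, userId, view: 'agent' })`, where baseUrl is wherever the server happens to listen.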

Response shape mirrors snapshot pagination fields:

{
  "url": "https://example.com",
  "view": "document",
  "markdown": "# Example\n\n...",
  "refsCount": 12,
  "truncated": false,
  "totalChars": 1234,
  "offset": 0,
  "hasMore": false,
  "nextOffset": null
}
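
To show how the pagination fields relate to each other, here is a minimal windowing sketch. It is an illustration under assumptions rather than the code in lib/markdown.js: windowMarkdown and the maxChars page size are hypothetical, truncated is assumed to mean "the full render does not fit in one window", and the Markdown-specific truncation markers are left out.

```javascript
// Hypothetical windowing helper; `maxChars` is an assumed page size, not a value from the PR.
function windowMarkdown(fullMarkdown, offset = 0, maxChars = 80000) {
  const totalChars = fullMarkdown.length;
  // Clamp the requested offset into the valid range [0, totalChars].
  const start = Math.min(Math.max(0, offset), totalChars);
  const end = Math.min(start + maxChars, totalChars);
  const hasMore = end < totalChars;
  return {
    markdown: fullMarkdown.slice(start, end),
    // Assumption: "truncated" means the whole document exceeds a single window.
    truncated: totalChars > maxChars,
    totalChars,
    offset: start,
    hasMore,
    nextOffset: hasMore ? end : null,
  };
}
```

With this shape, a caller that keeps requesting nextOffset until hasMore is false sees every character exactly once.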

view=document: clean, no refs

# Example Domain

This domain is for use in documentation examples without needing permission.

[Learn more](https://iana.org/domains/example)

# Markdown - Wikipedia

[Jump to content](#bodyContent)

## Header
- [Wikipedia The Free Encyclopedia](/wiki/Main_Page), [Donate](https://donate.wikimedia.org/...), [Create account](...), [Log in](...)

## Contents
- [(Top)](#)
- [History](#History)
- [Rise and divergence](#Rise_and_divergence)
- [CommonMark](#CommonMark)
- [Variants](#Variants)
  - [GitHub Flavored Markdown](#GitHub_Flavored_Markdown)
  - [Markdown Extra](#Markdown_Extra)
...

# Markdown
From Wikipedia, the free encyclopedia

Markdown is a lightweight markup language for creating formatted text using a plain-text editor.
John Gruber created Markdown in 2004 as an easy-to-read markup language.

| Markdown |  |
| --- | --- |
| Filename extensions | `.md`, `.markdown` |
| Internet media type | `text/markdown` |
| Developed by | John Gruber |
...

view=agent: refs and controls preserved

# Example Domain

This domain is for use in documentation examples without needing permission.

[Learn more](https://iana.org/domains/example)[e1]

# Markdown - Wikipedia

[Jump to content](#bodyContent)[e1]

## Header
- <button "Main menu" [e2] cursor=pointer>, [Wikipedia The Free Encyclopedia](/wiki/Main_Page)[e3],
  <searchbox "Search Wikipedia" [e4]>, <button "Search" [e5]>, [Donate](...)[e6], ...

## Contents
<button "hide" [e9] cursor=pointer>
- [(Top)](#)[e10]
- [History](#History)[e11]
- [Variants](#Variants)[e14] <button "Toggle Variants subsection" [e15] expanded>
  - [GitHub Flavored Markdown](#GitHub_Flavored_Markdown)[e16]
  - [Markdown Extra](#Markdown_Extra)[e17]
...

# Markdown
<button "Go to an article in another language. Available in 33 languages" [e24] cursor=pointer>

## Navigation
- [Article](/wiki/Markdown)[e25], [Talk](/wiki/Talk:Markdown)[e26]
- [Read](/wiki/Markdown)[e27], [Edit](/w/index.php?title=Markdown&action=edit)[e28],
  [View history](/w/index.php?title=Markdown&action=history)[e29]
...

## Header
- <button "Main menu" [e2] cursor=pointer>, [Wikipedia The Free Encyclopedia](/wiki/Main_Page)[e3],
  <searchbox "Search Wikipedia" [e4]>, <button "Search" [e5]>

/snapshot: unchanged (raw aria YAML)

- heading "Example Domain" [level=1]
- paragraph: This domain is for use in documentation examples...
- paragraph:
  - link "Learn more" [e1]:
    - /url: https://iana.org/domains/example

Implementation

Renderer (lib/markdown.js, ~1,400 lines):

  1. Parses Playwright AI aria snapshot YAML into a role tree.
  2. Annotates list nesting, table headers, and code languages.
  3. Renders per role — inline vs block, ref/control emission controlled by view.
  4. Postprocesses: normalizes whitespace, deduplicates adjacent lines, strips skip-to-content cruft.
  5. Windows for offset pagination with Markdown-specific truncation markers.
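
The first and third steps above can be illustrated with a deliberately tiny sketch. This is not the renderer from lib/markdown.js: parseSnapshot, renderInline, and renderMarkdown are hypothetical names, the parser assumes two-space indentation and unescaped quoted names, and only headings, paragraphs, and links are handled. It accepts refs written either as [e1] or as [ref=e1].

```javascript
// Step 1 (sketch): parse "- role "name" [attrs]: text" lines into a role tree.
function parseSnapshot(yaml) {
  const root = { role: 'root', depth: -1, children: [] };
  const stack = [root];
  for (const raw of yaml.split('\n')) {
    const item = /^(\s*)- (.*)$/.exec(raw);
    if (!item) continue;
    const depth = item[1].length / 2; // assumes two-space indentation
    while (stack[stack.length - 1].depth >= depth) stack.pop();
    const parent = stack[stack.length - 1];
    // "/url:" lines are properties of the enclosing node, not children.
    const url = /^\/url:\s*(.+)$/.exec(item[2]);
    if (url) { parent.url = url[1]; continue; }
    const line = /^(\w+)(?: "([^"]*)")?((?: \[[^\]]+\])*)\s*:?\s*(.*)$/.exec(item[2]);
    if (!line) continue; // unsupported line shapes are skipped in this sketch
    const level = /\[level=(\d+)\]/.exec(line[3]);
    const ref = /\[(?:ref=)?(e\d+)\]/.exec(line[3]);
    const node = {
      role: line[1], name: line[2] || '', text: line[4], depth, children: [],
      level: level ? Number(level[1]) : null,
      ref: ref ? ref[1] : null,
    };
    parent.children.push(node);
    stack.push(node);
  }
  return root;
}

// Step 3 (sketch): refs are emitted only in the agent view.
function renderInline(node, view) {
  if (node.role === 'link') {
    const ref = view === 'agent' && node.ref ? `[${node.ref}]` : '';
    return `[${node.name}](${node.url || ''})${ref}`;
  }
  return node.text || node.name;
}

function renderMarkdown(yaml, view = 'document') {
  const blocks = parseSnapshot(yaml).children.map((node) => {
    if (node.role === 'heading') return `${'#'.repeat(node.level || 1)} ${node.name}`;
    if (node.role === 'paragraph') {
      return node.text || node.children.map((c) => renderInline(c, view)).join(' ');
    }
    return renderInline(node, view);
  });
  return blocks.filter(Boolean).join('\n\n');
}
```

Fed the Example Domain snapshot shown under /snapshot, this sketch yields the heading, the paragraph, and the link, with [e1] appended to the link only when view is agent.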

Design choices:

  • view is an enum, not a toggle. agent mode renders inline controls with states/values, keeps interactive chrome visible, and marks un-annotated elements as "no ref".
  • Refs are never invented. Reuses the existing snapshot ref-annotation pipeline (buildRefs, annotateAriaSnapshotWithRefs).
  • Separate caches. tabState.lastMarkdown is a per-view Map; snapshot and markdown pagination never cross-contaminate.
  • Compact chrome. Regions like nav, header, and footer fold into ## Sidebar: [link1], [link2] lines.
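
The "separate caches" choice can be pictured roughly as follows. Only tabState.lastMarkdown, tabState.lastSnapshot, and clearRenderedContentCaches are names from this PR; createTabState, getMarkdownForView, and the render callback are hypothetical stand-ins, and the real cache entries presumably hold more than a string.

```javascript
// Hypothetical tab state: one snapshot slot plus a per-view Markdown Map.
function createTabState() {
  return { lastSnapshot: null, lastMarkdown: new Map() };
}

// Returns the cached render for `view`, rendering and storing it on a miss.
// `render` is a stand-in for whatever produces Markdown for that view.
function getMarkdownForView(tabState, view, render) {
  if (!tabState.lastMarkdown.has(view)) {
    tabState.lastMarkdown.set(view, render(view));
  }
  return tabState.lastMarkdown.get(view);
}

// Single invalidation point, called after any action that may change the page.
function clearRenderedContentCaches(tabState) {
  tabState.lastSnapshot = null;
  tabState.lastMarkdown.clear();
}
```

Keying the Map by view means a document render can never be served to an agent request, and a single clear helper keeps snapshot and Markdown invalidation in step.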

Route (server.js):

  • Refactored shared ref-annotation helpers (parseSnapshotRoleLine, annotateAriaSnapshotWithRefs, buildAnnotatedAriaSnapshot) used by both /snapshot and /markdown.
  • getAriaSnapshot accepts { mode }; mode='ai' tries page.ariaSnapshot({ mode: 'ai' }) with body-locator fallback.
  • clearRenderedContentCaches(tabState) replaces scattered lastSnapshot = null assignments across navigate, click, type, scroll, press, back, forward, refresh, and viewport.
  • Google SERP handling: reuses extractGoogleSerp directly.
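
The mode-aware snapshot getter might look roughly like the sketch below. The error handling is an assumption: here any failure of page.ariaSnapshot({ mode: 'ai' }), including the method being absent on an older Playwright, falls through to a plain body-locator snapshot, which may differ from what server.js actually does.

```javascript
// Sketch of a mode-aware getter; `page` is expected to be a Playwright Page.
async function getAriaSnapshot(page, { mode } = {}) {
  if (mode === 'ai') {
    try {
      // Per this PR, supported as of playwright-core ^1.59.1.
      return await page.ariaSnapshot({ mode: 'ai' });
    } catch (err) {
      // Assumed policy: swallow the error and fall back to the body locator.
    }
  }
  // Fallback: aria snapshot of the <body> element via the Locator API.
  return page.locator('body').ariaSnapshot();
}
```

One consequence of this fallback is that callers needing refs still have to check whether the returned snapshot actually contains them.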

Testing

node --check server.js
node --check lib/markdown.js
NODE_OPTIONS='--experimental-vm-modules' npx jest tests/unit/markdown.test.js tests/unit/openapi.test.js --runInBand --forceExit
NODE_OPTIONS='--experimental-vm-modules' npx jest --config jest.config.e2e.cjs tests/e2e/markdown.test.js tests/e2e/playwrightAiSnapshotCapabilities.test.js --runInBand --forceExit
npm test
npm run test:e2e

Unit tests (tests/unit/markdown.test.js: 22 tests): empty input, common shapes (headings, paragraphs, links, images, lists, tables, code), document ref/control stripping, agent ref/control preservation, YAML-quoted keys, legacy @e refs, safe code fences, inferred table headers, layout table flattening, GFM task lists, private-use glyph filtering, duplicate line collapsing.

E2E tests (tests/e2e/markdown.test.js: 7 tests): document default, agent view, invalid view → 400, /snapshot backcompat, offset pagination with Markdown markers, per-view cache isolation, cache invalidation on navigation.

AI snapshot capability (tests/e2e/playwrightAiSnapshotCapabilities.test.js): verifies page.ariaSnapshot({ mode: 'ai' }) is available and produces [ref=eN] annotations.

Backwards compatibility

  • /tabs/:tabId/snapshot: unchanged. Response shape, query params, and pagination behavior preserved.
  • Internal ref-annotation helper extraction is behavior-preserving.
  • No existing routes modified in breaking ways.

@skyfallsin
Contributor

Hey @linouxis9 — thanks for this, it's really well done. The dual-view approach (document vs agent), the offset pagination, the caching — you clearly thought through how LLMs actually consume page content. The rendering quality is impressive too.

After reviewing, our concern is maintenance surface area. A 1400-line Markdown renderer becomes something we'd need to keep in sync with Playwright's evolving aria snapshot format, handle edge cases across sites, and field "this page renders wrong" issues indefinitely. For a project this size, that's a meaningful commitment for core.

Would you be open to restructuring this as a standalone plugin in its own repo? Our plugin system (plugins/ directory, browser:launching / session:creating hooks, or you could register Express routes via the plugin API) would let you ship this independently. Benefits:

  • You maintain it at your own pace and release cycle
  • Users install it like any other plugin (plugins/markdown/)
  • We'd happily link to it from our README under a "Community Plugins" section

The plugin API gives you access to everything you need — app (Express instance), ctx.sessions, tab state, and the snapshot data. Take a look at lib/plugins.js for the full hook/event reference (29 events across 7 categories), and the existing plugins/persistence/ and plugins/youtube/ as examples of how to register routes and hook into lifecycle events.

Really appreciate the contribution. This is the kind of thing that makes camofox better for everyone — we just think it lives best as a plugin rather than core.

@skyfallsin skyfallsin closed this May 24, 2026
@linouxis9
Author

Hey @skyfallsin, thanks for the detailed review and the kind words. I appreciate the concern about maintenance surface area; it's a fair point to raise.

A few things I'd like to push back on:

  • On the format stability concern: The Playwright aria snapshot format is specifically designed for snapshot testing: web developers commit these YAML snapshots to source control and rely on them being stable across Playwright versions.
    If the format changed, every existing snapshot test across the web ecosystem would break. This isn't an internal implementation detail that Playwright might refactor casually; it's a public testing contract. The renderer parses a well-defined YAML grammar (role, name, attributes) that maps directly to ARIA semantics, which are themselves standardized by W3C. The format drift risk is low.

  • On why this belongs in core rather than as a plugin: Projects like Hermes Agent that build on camofox's browser capabilities integrate directly with the core API surface (/snapshot, /click, /navigate, etc.). If the Markdown endpoint lives behind an optional plugin, downstream projects have a strong incentive not to depend on it — they can't assume their users will install it, which means they'll keep consuming raw aria snapshots and building their own post-processing instead.
    The whole novel value proposition (cleaner, more token-efficient page content for LLM agents) only works if it's available by default. Markdown snapshots are significantly more readable and token-efficient than raw aria YAML for agent workflows, and that's a core use case for camofox users.

  • I'm happy to help maintain this. If the concern is "who fields the issues when something breaks," I'm volunteering. I use this code daily and have a strong incentive to keep it working. I'm open to being added as a maintainer for this area, or we can set up a CODEOWNERS rule so I get pinged on any issues touching the Markdown renderer.

  • Would you be open to reopening this for discussion? I think the feature serves camofox's core audience well, and I'd rather find a way to make it work in core than fragment the ecosystem with optional plugins that downstream projects can't reliably depend on.

Thanks and cheers,
Valentin
