feat: add deterministic Markdown endpoint for tabs #3278
Hey @linouxis9, thanks for this, it's really well done. The dual-view approach (document vs agent), the offset pagination, the caching: you clearly thought through how LLMs actually consume page content. The rendering quality is impressive too.

After reviewing, our concern is maintenance surface area. A 1400-line Markdown renderer becomes something we'd need to keep in sync with Playwright's evolving aria snapshot format, handle edge cases across sites, and field "this page renders wrong" issues indefinitely. For a project this size, that's a meaningful commitment for core.

Would you be open to restructuring this as a standalone plugin in its own repo? Our plugin system (

The plugin API gives you access to everything you need.

Really appreciate the contribution. This is the kind of thing that makes camofox better for everyone; we just think it lives best as a plugin rather than core.
Hey @skyfallsin, thanks for the detailed review and the kind words. I appreciate the concern about maintenance surface area, it's a fair point to raise. A few things I'd like to push back on:
Thanks and cheers,
Summary
- Adds `GET /tabs/:tabId/markdown` for deterministic Markdown rendering of the current tab, backed by Playwright accessibility snapshots.
- `view=document` (default): readable Markdown without refs/control noise.
- `view=agent`: agent-friendly Markdown preserving actionable refs, controls, states, and values.
- Keeps `GET /tabs/:tabId/snapshot` behavior intact for backward compatibility.
- Updates `playwright-core` to `^1.59.1`, for `page.ariaSnapshot({ mode: 'ai' })` support.

Why
Raw accessibility snapshots are useful for automation but noisy for reading/extraction. This endpoint gives callers a stable Markdown surface without forcing consumers to post-process snapshot YAML themselves, while `view=agent` preserves actionability for future agent workflows.

API
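As a rough illustration of how a caller might address the endpoint, the sketch below builds the request URL and fetches the JSON body. The helper names (`buildMarkdownUrl`, `fetchMarkdown`), the base URL, and the `offset` query parameter name are assumptions made for this example and are not taken from the PR; any other query parameters the server may require are not shown.

```javascript
// Hypothetical helper: builds the URL for GET /tabs/:tabId/markdown.
// ASSUMPTION: `offset` is the query parameter name used for pagination.
function buildMarkdownUrl(baseUrl, tabId, { view = 'document', offset } = {}) {
  const url = new URL(`/tabs/${encodeURIComponent(tabId)}/markdown`, baseUrl);
  url.searchParams.set('view', view);
  if (offset !== undefined) {
    url.searchParams.set('offset', String(offset));
  }
  return url.toString();
}

// Hypothetical client call. `fetchImpl` is injectable so the function can be
// exercised without a running server; it defaults to the global `fetch`.
async function fetchMarkdown(baseUrl, tabId, options = {}, fetchImpl = fetch) {
  const res = await fetchImpl(buildMarkdownUrl(baseUrl, tabId, options));
  if (!res.ok) {
    // The PR's E2E tests mention that an invalid view yields a 400.
    throw new Error(`markdown request failed with status ${res.status}`);
  }
  return res.json();
}
```

Passing `{ view: 'agent' }` would request the ref-preserving rendering, while omitting `view` falls back to `document` in this sketch, matching the default described in the summary.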
Response shape mirrors snapshot pagination fields:
    {
      "url": "https://example.com",
      "view": "document",
      "markdown": "# Example\n\n...",
      "refsCount": 12,
      "truncated": false,
      "totalChars": 1234,
      "offset": 0,
      "hasMore": false,
      "nextOffset": null
    }

- `view=document`: clean, no refs
- `view=agent`: refs and controls preserved
- `/snapshot`: unchanged (raw aria YAML)

Implementation
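The per-view caching and centralized invalidation described in this section could look roughly like the sketch below. The names `tabState.lastMarkdown`, `lastSnapshot`, and `clearRenderedContentCaches` come from the PR text; `createTabState`, `cacheMarkdown`, the `VIEWS` set, and the shape of the cached values are assumptions for illustration and may differ from the actual `server.js`.

```javascript
// The two views named in the PR, treated as a closed enum.
const VIEWS = new Set(['document', 'agent']);

// Hypothetical tab state factory. Field names follow the PR description;
// what is stored in each field is an assumption.
function createTabState() {
  return { lastSnapshot: null, lastMarkdown: new Map() };
}

// Hypothetical helper: stores rendered Markdown under its own view key,
// so paginating one view never reads another view's cached text.
function cacheMarkdown(tabState, view, markdown) {
  if (!VIEWS.has(view)) {
    throw new RangeError(`invalid view: ${view}`);
  }
  tabState.lastMarkdown.set(view, markdown);
}

// One invalidation point, called after any action that can change the page,
// in place of scattered `lastSnapshot = null` assignments.
function clearRenderedContentCaches(tabState) {
  tabState.lastSnapshot = null;
  tabState.lastMarkdown.clear();
}
```

Keying the cache by view means a `document` request followed by an `agent` request keeps two independent entries, and a single call to `clearRenderedContentCaches` drops both along with the snapshot cache.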
Renderer (`lib/markdown.js`, ~1,400 lines):
view.

Design choices:
- `view` is an enum, not a toggle. `agent` mode renders inline controls with states/values, keeps interactive chrome visible, and marks un-annotated elements `no ref`.
- (`buildRefs` → `annotateAriaSnapshotWithRefs`).
- `tabState.lastMarkdown` is a per-view `Map`; snapshot and markdown pagination never cross-contaminate.
- `## Sidebar: [link1], [link2]` lines.

Route (`server.js`):
- (`parseSnapshotRoleLine`, `annotateAriaSnapshotWithRefs`, `buildAnnotatedAriaSnapshot`) used by both `/snapshot` and `/markdown`.
- `getAriaSnapshot` accepts `{ mode }`; `mode='ai'` tries `page.ariaSnapshot({ mode: 'ai' })` with body-locator fallback.
- `clearRenderedContentCaches(tabState)` replaces scattered `lastSnapshot = null` assignments across navigate, click, type, scroll, press, back, forward, refresh, and viewport.
- `extractGoogleSerp` directly.

Testing
Unit tests (`tests/unit/markdown.test.js`: 22 tests): empty input, common shapes (headings, paragraphs, links, images, lists, tables, code), document ref/control stripping, agent ref/control preservation, YAML-quoted keys, legacy `@e` refs, safe code fences, inferred table headers, layout table flattening, GFM task lists, private-use glyph filtering, duplicate line collapsing.

E2E tests (`tests/e2e/markdown.test.js`: 7 tests): document default, agent view, invalid view → 400, `/snapshot` backcompat, offset pagination with Markdown markers, per-view cache isolation, cache invalidation on navigation.

AI snapshot capability (`tests/e2e/playwrightAiSnapshotCapabilities.test.js`): verifies `page.ariaSnapshot({ mode: 'ai' })` is available and produces `[ref=eN]` annotations.

Backwards compatibility

`/tabs/:tabId/snapshot`: unchanged. Response shape, query params, and pagination behavior preserved.
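Since the `/markdown` response mirrors the snapshot pagination fields (`hasMore`, `nextOffset`), a client could in principle page through either endpoint with one loop. The sketch below is a minimal illustration of that idea: `collectPages`, the caller-supplied `fetchPage(offset)` function, and the `contentField` and `maxPages` parameters are all hypothetical names, and the loop does not strip any pagination markers the server may embed in each page.

```javascript
// Collects every page of a paginated response into one string.
// ASSUMPTION: `fetchPage(offset)` resolves to an object carrying `hasMore`,
// `nextOffset`, and a text field named by `contentField`.
async function collectPages(fetchPage, contentField = 'markdown', maxPages = 100) {
  const parts = [];
  let offset = 0;
  // `maxPages` is a safety cap so a misbehaving server cannot loop forever.
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(offset);
    parts.push(page[contentField]);
    if (!page.hasMore || page.nextOffset == null) {
      break;
    }
    offset = page.nextOffset;
  }
  return parts.join('');
}
```

The same loop would serve another endpoint by passing a different `contentField`, provided that endpoint really exposes the same `hasMore` and `nextOffset` fields.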