@Genteki (Contributor) commented Oct 13, 2025

Original #156

A hud remote-browser-based environment for the Online-Mind2Web dataset.


Note

Introduces a new Online-Mind2Web environment with a remote-browser MCP server, multi-provider support, setup/eval tools, recording, and Docker packaging.

  • Environment: environments/online_mind2web
    • Adds pyproject, Dockerfile, README, .gitignore, and test_task.json.
  • MCP Server (src/hud_controller/server.py)
    • Boots a remote-browser MCP server, attaches a persistent context, initializes provider, Playwright, and computer tools; exposes telemetry and graceful shutdown.
  • Persistent Context (context.py)
    • New context server to persist browser/provider state, telemetry, and Playwright handle across hot-reloads.
  • Providers (providers/)
    • Implements AnchorBrowserProvider, BrowserBaseProvider, HyperBrowserProvider, SteelProvider with launch/close/status APIs and live view URLs; registry and proxy helper.
  • Tools (tools/)
    • BrowserExecutor: maps computer actions to Playwright.
    • AnthropicComputerToolWithRecord and OpenAIComputerToolWithRecord: add screenshot/history recording to /screenshot and /action_history.
  • Setup Hub (setup/)
    • Tools for navigation, cookies, and basic interactions (navigate_to_url, set_cookies/clear_cookies, click_element/fill_input/select_option).
  • Evaluation Hub (evaluate/)
    • webjudge (multi-screenshot + keypoint analysis via GPT-4o), autonomous_eval (single-screenshot VLM check), and overall_judge (aggregate) returning EvaluationResult.
  • Problems (problems/)
    • Registry/decorator plus sample problems: navigate/verify, form interaction, button click, Google search.
  • Packaging & Run
    • Docker image to start context + MCP server; script entry hud-om2w; instructions to eval local JSON or HuggingFace dataset.
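The provider registry and the launch/close/status surface mentioned in the summary could look roughly like the sketch below. It is illustrative only and not taken from the PR: the names `BrowserProvider`, `register_provider`, `get_provider`, `get_status`, `get_live_view_url`, and `FakeProvider`, as well as the `example.invalid` URLs, are assumptions. The real AnchorBrowserProvider, BrowserBaseProvider, HyperBrowserProvider, and SteelProvider call their vendors' APIs and may use different signatures.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Optional, Type

# Hypothetical registry: maps a provider name to its class.
PROVIDER_REGISTRY: Dict[str, Type["BrowserProvider"]] = {}


def register_provider(
    name: str,
) -> Callable[[Type["BrowserProvider"]], Type["BrowserProvider"]]:
    """Class decorator that records a provider class under `name`."""

    def decorator(cls: Type["BrowserProvider"]) -> Type["BrowserProvider"]:
        PROVIDER_REGISTRY[name] = cls
        return cls

    return decorator


class BrowserProvider(ABC):
    """Assumed common interface: launch, close, status, live view URL."""

    @abstractmethod
    async def launch(self, **options: Any) -> str:
        """Start a remote browser session and return its CDP URL."""

    @abstractmethod
    async def close(self) -> None:
        """Terminate the remote session."""

    @abstractmethod
    def get_status(self) -> Dict[str, Any]:
        """Report whether a session is currently running."""

    @abstractmethod
    def get_live_view_url(self) -> Optional[str]:
        """Return a URL for watching the session, if one exists."""


@register_provider("fake")
class FakeProvider(BrowserProvider):
    """In-memory stand-in used here so the sketch runs without network access."""

    def __init__(self) -> None:
        self._session_id: Optional[str] = None

    async def launch(self, **options: Any) -> str:
        self._session_id = "session-1"
        return f"wss://example.invalid/cdp/{self._session_id}"

    async def close(self) -> None:
        self._session_id = None

    def get_status(self) -> Dict[str, Any]:
        return {"running": self._session_id is not None}

    def get_live_view_url(self) -> Optional[str]:
        if self._session_id is None:
            return None
        return f"https://example.invalid/live/{self._session_id}"


def get_provider(name: str) -> BrowserProvider:
    """Instantiate a registered provider, failing loudly on unknown names."""
    try:
        return PROVIDER_REGISTRY[name]()
    except KeyError:
        known = sorted(PROVIDER_REGISTRY)
        raise ValueError(f"Unknown provider {name!r}; known: {known}") from None


async def demo() -> Dict[str, Any]:
    """Walk one provider through launch, status, live view, and close."""
    provider = get_provider("fake")
    cdp_url = await provider.launch()
    running_after_launch = provider.get_status()["running"]
    live_view = provider.get_live_view_url()
    await provider.close()
    return {
        "cdp_url": cdp_url,
        "live_view": live_view,
        "running_after_launch": running_after_launch,
        "running_after_close": provider.get_status()["running"],
    }


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a registry like this, the server only needs a provider name (for example from configuration) to obtain an object it can launch and later shut down, which is one plausible way to keep the multi-provider support uniform.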

Written by Cursor Bugbot for commit f70d067. This will update automatically on new commits.

cursor[bot]

This comment was marked as outdated.

@promptless bot (Contributor) commented Oct 13, 2025

📝 Documentation updates detected!

New suggestion: Add comprehensive Online-Mind2Web environment documentation for PR #168
Updated existing suggestion: Add comprehensive Mind2Web evaluation documentation (updated for PR #156)

@Genteki changed the title from "Online-Mind2Web Folder" to "New Env: Online-Mind2Web" on Oct 23, 2025