
chore: add runtime and desktop regression coverage #941

Open: lefarcen wants to merge 10 commits into main from test/fill-runtime-desktop-gaps

@lefarcen (Collaborator) commented Apr 8, 2026

What

Add a focused batch of regression coverage for the highest-risk runtime and desktop flows that changed after v0.1.10, and optimize CI so local/default test runs stay fast while PR/CI still execute the right coverage for the files that changed.

Why

Recent post-release fixes touched restart prevention, BYOK model preservation, launchd disabled overrides, credit-guard error replacement, mac desktop layout, local dev stale-port recovery, and skill watcher reconciliation. Those fixes are valuable precisely because they protect against subtle regressions, so they need direct tests rather than relying on indirect coverage. At the same time, the test suite has grown enough that we should stop running the slowest desktop/e2e paths on every PR when the changed files do not justify them.

How

  • Add runtime stability regression tests for deterministic plugins.allow ordering and BYOK model preservation when a configured provider has an empty model allowlist.
  • Add controller runtime-plugin tests for nexu-credit-guard channel isolation and stale-cache eviction.
  • Add desktop launchd tests to verify disabled launchd overrides are cleared even when plist content is unchanged.
  • Add web desktop-platform tests for the mac sidebar/traffic-light layout split.
  • Add scripts/dev service tests covering stale web/controller port cleanup before startup.
  • Add desktop link tests covering host-bridge external-link/local-folder fallback behavior.
  • Harden SkillDirWatcher with a low-frequency reconciliation fallback so missed fs events do not leave the ledger stale.
  • Split Vitest into core and extended layers, add a PR-aware test-selection script, and gate Desktop E2E so expensive suites/coverage only run for relevant changes while push/main still keeps full coverage.
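The PR-aware selection described in the last bullet can be pictured as a pure mapping from changed file paths to test layers. The sketch below is only an illustration under assumptions: the suite names, the path prefixes, and the `selectSuites` function are hypothetical and are not taken from the actual selector script in this PR.

```typescript
// Hypothetical layer names; the real script may use different ones.
type Suite = "core" | "extended" | "desktop-e2e";

interface SelectionRule {
  prefix: string; // path prefix that triggers the suite
  suite: Suite;
}

// Assumed path-to-suite mapping, for illustration only.
const RULES: SelectionRule[] = [
  { prefix: "apps/desktop/", suite: "desktop-e2e" },
  { prefix: "apps/desktop/", suite: "extended" },
  { prefix: "apps/controller/", suite: "extended" },
  { prefix: "scripts/dev/", suite: "extended" },
];

function selectSuites(changedFiles: string[], isPullRequest: boolean): Suite[] {
  const order: Suite[] = ["core", "extended", "desktop-e2e"];

  // Push/main builds keep full coverage regardless of the diff.
  if (!isPullRequest) return order;

  // The core layer always runs; other layers are added per matching rule.
  const selected = new Set<Suite>(["core"]);
  for (const file of changedFiles) {
    for (const rule of RULES) {
      if (file.startsWith(rule.prefix)) selected.add(rule.suite);
    }
  }

  // Emit in a fixed order so CI consumers see stable output.
  return order.filter((suite) => selected.has(suite));
}
```

Keeping the selection a pure function of the file list makes it easy to unit test without touching git; the git lookup that produces the file list can live in a thin wrapper.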

Affected areas

  • Desktop app (Electron shell)
  • Controller (backend / API)
  • Web dashboard (React UI)
  • OpenClaw runtime
  • Skills
  • Shared schemas / packages
  • Build / CI / Tooling

Checklist

  • pnpm typecheck passes
  • pnpm lint passes
  • pnpm test passes
  • pnpm generate-types run (if API routes/schemas changed)
  • No credentials or tokens in code or logs
  • No any types introduced (use unknown with narrowing)

Add regression tests for runtime stability fixes, launchd disabled-override recovery, credit-guard channel isolation, mac desktop sidebar platform layout, and scripts/dev stale port cleanup so these post-0.1.10 failures stay covered in one place.

sentry Bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 93.75000% with 1 line in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
...troller/src/services/skillhub/skill-dir-watcher.ts | 93.75% | 1 Missing ⚠️


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9bf9d0a6c6


Comment thread tests/controller/nexu-credit-guard.test.ts Outdated
Comment thread tests/controller/runtime-stability-regressions.test.ts
lefarcen added 4 commits April 8, 2026 20:53
Cover desktop-only external link behavior so browser and local-folder launches prefer the host bridge when available and safely fall back when the controller is unavailable.
Make the credit-guard test prove cross-channel isolation before cache consumption, and add a competing runtime model to the BYOK empty-allowlist fixture so the fallback-prevention path is actually exercised.
Keep the skill ledger in sync even when fs.watch misses new skill directory events by adding a low-frequency reconciliation fallback and covering it in the desktop watcher regression test.
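A reconciliation fallback of this kind can be sketched as a periodic diff between the directories on disk and the names recorded in the ledger. This is a minimal sketch, not the `SkillDirWatcher` code from this PR: `diffLedger`, `startReconciler`, the 60-second interval, and the choice to also drop entries that vanished from disk are all assumptions made for illustration.

```typescript
interface ReconcileResult {
  added: string[];   // present on disk but missing from the ledger
  removed: string[]; // recorded in the ledger but no longer on disk
}

// Pure diff step: compares the ledger with a fresh directory listing.
function diffLedger(
  ledger: ReadonlySet<string>,
  onDisk: readonly string[],
): ReconcileResult {
  const disk = new Set(onDisk);
  const added = Array.from(disk).filter((name) => !ledger.has(name)).sort();
  const removed = Array.from(ledger).filter((name) => !disk.has(name)).sort();
  return { added, removed };
}

// Hypothetical wiring: run the diff on a slow timer next to the fs.watch
// handler, so a missed event is repaired on the next tick.
function startReconciler(
  listSkillDirs: () => string[],
  ledger: Set<string>,
  intervalMs: number = 60000, // assumed value; the PR does not state one
): () => void {
  const timer = setInterval(() => {
    const { added, removed } = diffLedger(ledger, listSkillDirs());
    for (const name of added) ledger.add(name);
    for (const name of removed) ledger.delete(name);
  }, intervalMs);
  // The returned function stops the timer, e.g. when the watcher is disposed.
  return () => clearInterval(timer);
}
```

Separating the diff from the timer means a regression test can call `diffLedger` directly with a fixture listing instead of waiting for an interval to elapse.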
lefarcen added 5 commits April 9, 2026 10:50
Run core tests by default, keep extended suites available for full CI, and use a dynamic PR test selector plus desktop E2E gating so expensive jobs only run when the changed files justify them.
Make the PR test selector fetch the base branch ref on demand before computing merge-base so shallow GitHub Actions checkouts do not fail on missing origin/main in test and desktop CI jobs.
Use full checkout history in PR jobs that rely on merge-base so the dynamic test selector can compare against the base branch reliably in GitHub Actions instead of failing on shallow clones.
Fetch full git history in the CI test job so merge-base driven PR test selection can compare against the base branch reliably, matching the desktop dev workflow behavior.
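The three commits above converge on one requirement: `git merge-base` needs both the base branch ref and enough history to find a common ancestor. The sketch below lists the git invocations such a selector might run; the helper names `baseRefCommands` and `diffCommand` are hypothetical, and the actual script in this PR may differ. Fetching the base ref alone may still fail when the PR head itself is shallow, which is consistent with the later commits switching to full history (with `actions/checkout`, that is typically `fetch-depth: 0`).

```typescript
// Builds git argument lists; a caller would run them in order,
// for example with child_process.execFileSync("git", args).
function baseRefCommands(baseBranch: string): string[][] {
  const remoteRef = `origin/${baseBranch}`;
  return [
    // Ensure the base ref exists locally, even if the checkout omitted it.
    [
      "fetch",
      "--no-tags",
      "origin",
      `+refs/heads/${baseBranch}:refs/remotes/${remoteRef}`,
    ],
    // Find the common ancestor of the base branch and the PR head.
    ["merge-base", remoteRef, "HEAD"],
  ];
}

// Lists the files changed since the merge base.
function diffCommand(mergeBase: string): string[] {
  return ["diff", "--name-only", `${mergeBase}..HEAD`];
}
```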
Update the new regression tests to assert the stable behavior that mainline now produces: plugin ordering remains deterministic even with Feishu/Weixin defaults present, and the mac desktop layout assertion targets the expanded header offset instead of a collapsed-only toggle.
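One way to make an allowlist deterministic regardless of where its entries come from is to merge, de-duplicate, and sort it. The sketch below shows that idea only; `normalizeAllowList` and the lowercase plugin ids are hypothetical, and the runtime may instead preserve a fixed insertion order rather than sorting.

```typescript
// Merges default and configured plugin ids into one stable list.
function normalizeAllowList(
  defaults: readonly string[],
  configured: readonly string[],
): string[] {
  const unique = new Set<string>();
  defaults.forEach((id) => unique.add(id));
  configured.forEach((id) => unique.add(id));
  // Default sort compares UTF-16 code units, so the result does not
  // depend on the machine's locale or on input order.
  return Array.from(unique).sort();
}
```

A regression test can then feed the same ids in two different orders and assert that both calls return the same array.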