chore: add runtime and desktop regression coverage #941
Open
Add regression tests for runtime stability fixes, launchd disabled-override recovery, credit-guard channel isolation, mac desktop sidebar platform layout, and scripts/dev stale port cleanup so these post-0.1.10 failures stay covered in one place.
💡 Codex Review
Reviewed commit: 9bf9d0a6c6
Cover desktop-only external link behavior so browser and local-folder launches prefer the host bridge when available and safely fall back when the controller is unavailable.
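A minimal sketch of the "prefer the host bridge, fall back safely" behavior that this commit covers. The `HostBridge` interface, `openExternalLink`, and the synchronous call shape are all hypothetical and are not taken from the PR; the real bridge is likely asynchronous.

```typescript
// Hypothetical shape of a desktop host bridge; not the project's real API.
interface HostBridge {
  openExternal(target: string): void; // may throw if the controller is down
}

type Opener = "bridge" | "fallback";

// Prefer the host bridge when one is present; if it is missing or throws,
// hand the target to the fallback opener instead of failing the click.
function openExternalLink(
  target: string,
  bridge: HostBridge | undefined,
  fallback: (target: string) => void,
): Opener {
  if (bridge !== undefined) {
    try {
      bridge.openExternal(target);
      return "bridge";
    } catch {
      // Bridge unavailable: fall through to the fallback opener.
    }
  }
  fallback(target);
  return "fallback";
}
```

Returning which opener handled the request keeps the function easy to assert on in a regression test, since the test can inject a throwing bridge and check that the fallback ran.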
Make the credit-guard test prove cross-channel isolation before cache consumption, and add a competing runtime model to the BYOK empty-allowlist fixture so the fallback-prevention path is actually exercised.
Keep the skill ledger in sync even when fs.watch misses new skill directory events by adding a low-frequency reconciliation fallback and covering it in the desktop watcher regression test.
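The reconciliation fallback described in this commit can be sketched as a periodic diff between the ledger and a fresh directory listing. This is an illustration under stated assumptions, not the PR's `SkillDirWatcher` code: `SkillLedger`, `reconcile`, `startReconciliation`, and the injected `listSkillDirs` callback are hypothetical names.

```typescript
// Hypothetical names throughout; the directory listing is injected so the
// sketch stays self-contained (real code would read the skills directory).
type ListSkillDirs = () => string[];

interface ReconcileResult {
  added: string[];
  removed: string[];
}

class SkillLedger {
  private readonly known = new Set<string>();

  // Compare the ledger with what is on disk and apply the difference.
  reconcile(onDisk: string[]): ReconcileResult {
    const disk = new Set(onDisk);
    const added = Array.from(disk).filter((name) => !this.known.has(name));
    const removed = Array.from(this.known).filter((name) => !disk.has(name));
    for (const name of added) this.known.add(name);
    for (const name of removed) this.known.delete(name);
    return { added: added.sort(), removed: removed.sort() };
  }

  list(): string[] {
    return Array.from(this.known).sort();
  }
}

// Low-frequency safety net that runs alongside the event-based watcher.
// Returns a function that stops the timer.
function startReconciliation(
  ledger: SkillLedger,
  listSkillDirs: ListSkillDirs,
  intervalMs: number,
  onChange: (result: ReconcileResult) => void,
): () => void {
  const timer = setInterval(() => {
    const result = ledger.reconcile(listSkillDirs());
    if (result.added.length > 0 || result.removed.length > 0) {
      onChange(result);
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Because `reconcile` is idempotent, it is safe for the timer and the `fs.watch` handler to both trigger it: a directory that the watcher already recorded simply produces an empty diff on the next tick.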
mrcfps approved these changes on Apr 8, 2026
Run core tests by default, keep extended suites available for full CI, and use a dynamic PR test selector plus desktop E2E gating so expensive jobs only run when the changed files justify them.
Make the PR test selector fetch the base branch ref on demand before computing merge-base so shallow GitHub Actions checkouts do not fail on missing origin/main in test and desktop CI jobs.
Use full checkout history in PR jobs that rely on merge-base so the dynamic test selector can compare against the base branch reliably in GitHub Actions instead of failing on shallow clones.
Fetch full git history in the CI test job so merge-base driven PR test selection can compare against the base branch reliably, matching the desktop dev workflow behavior.
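As a rough illustration of the PR-aware selection these CI commits describe, the sketch below maps a list of changed files to test layers. It assumes the changed files were already computed elsewhere (for example with `git merge-base origin/main HEAD` followed by `git diff --name-only`, which is why the base ref and enough history must be available). The `Suite` names follow the PR's `core`/`extended`/Desktop E2E wording, but `selectSuites`, `RULES`, and the `apps/desktop/` prefix are hypothetical.

```typescript
type Suite = "core" | "extended" | "desktop-e2e";

interface SelectionRule {
  prefix: string; // path prefix that triggers the extra suites
  suites: Suite[];
}

// Hypothetical mapping; the real selector's rules are not shown in the PR text.
const RULES: SelectionRule[] = [
  { prefix: "apps/desktop/", suites: ["extended", "desktop-e2e"] },
  { prefix: "scripts/dev", suites: ["extended"] },
];

const ALL_SUITES: Suite[] = ["core", "extended", "desktop-e2e"];

// Pull requests run core plus whatever the changed files justify;
// pushes to the main branch keep full coverage.
function selectSuites(
  changedFiles: string[],
  isPullRequest: boolean,
  rules: SelectionRule[] = RULES,
): Suite[] {
  if (!isPullRequest) return ALL_SUITES.slice();
  const selected = new Set<Suite>(["core"]);
  for (const file of changedFiles) {
    for (const rule of rules) {
      if (file.startsWith(rule.prefix)) {
        for (const suite of rule.suites) selected.add(suite);
      }
    }
  }
  // Filter the canonical list so the output order is stable.
  return ALL_SUITES.filter((suite) => selected.has(suite));
}
```

Keeping the selection a pure function of the changed-file list separates it from the git plumbing, so the shallow-clone failure mode stays confined to the step that computes the diff.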
Update the new regression tests to assert the stable behavior that mainline now produces: plugin ordering remains deterministic even with Feishu/Weixin defaults present, and the mac desktop layout assertion targets the expanded header offset instead of a collapsed-only toggle.
What
Add a focused batch of regression coverage for the highest-risk runtime and desktop flows that changed after `v0.1.10`, and optimize CI so local/default test runs stay fast while PR/CI still execute the right coverage for the files that changed.

Why
Recent post-release fixes touched restart prevention, BYOK model preservation, launchd disabled overrides, credit-guard error replacement, mac desktop layout, local dev stale-port recovery, and skill watcher reconciliation. Those fixes are valuable precisely because they protect against subtle regressions, so they need direct tests rather than relying on indirect coverage. At the same time, the test suite has grown enough that we should stop running the slowest desktop/e2e paths on every PR when the changed files do not justify them.

How
- `plugins.allow` ordering and BYOK model preservation when a configured provider has an empty model allowlist.
- `nexu-credit-guard` channel isolation and stale-cache eviction.
- `scripts/dev` service tests covering stale web/controller port cleanup before startup.
- `SkillDirWatcher` with a low-frequency reconciliation fallback so missed fs events do not leave the ledger stale.
- `core` and `extended` layers, add a PR-aware test-selection script, and gate Desktop E2E so expensive suites/coverage only run for relevant changes while push/main still keeps full coverage.

Affected areas

Checklist
- `pnpm typecheck` passes
- `pnpm lint` passes
- `pnpm test` passes
- `pnpm generate-types` run (if API routes/schemas changed)
- No `any` types introduced (use `unknown` with narrowing)