
fix(kthena-router): remove gomonkey to fix test crashes on Go 1.24/ARM64 #875

Open
xrwang8 wants to merge 5 commits into volcano-sh:main from xrwang8:fix/kthena-router-test-stability

@xrwang8 xrwang8 commented Apr 9, 2026

What type of PR is this?

/kind bug

What this PR does / why we need it:

Removes gomonkey runtime patching from the pkg/kthena-router tests. The
patching made the tests unstable on Go 1.24.1 + darwin/arm64.

The fix adds explicit dependency injection seams:

  • PodRuntimeInspector interface in datastore package for
    backend.GetPodMetrics/GetPodModels
  • Functional dependency fields in Router for request building, streaming
    detection, and proxy

Tests now use fakes injected via WithPodRuntimeInspector() and
newRouterWithDeps() instead of runtime binary patching.
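
The datastore seam described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names PodRuntimeInspector, realPodRuntimeInspector, WithPodRuntimeInspector, Option and New come from this PR, but the method signatures, the PodMetrics and Store types, and the fake's fields are assumptions made for the example.

```go
package main

import "fmt"

// PodMetrics is a hypothetical stand-in for whatever the backend reports.
type PodMetrics struct {
	GPUCacheUsage float64
}

// PodRuntimeInspector is the seam named in the PR. The method signatures
// are assumptions for this sketch, not the project's real ones.
type PodRuntimeInspector interface {
	GetPodMetrics(podName string) (*PodMetrics, error)
	GetPodModels(podName string) ([]string, error)
}

// realPodRuntimeInspector would delegate to the backend package in
// production; here it only reports that no backend is available.
type realPodRuntimeInspector struct{}

func (realPodRuntimeInspector) GetPodMetrics(string) (*PodMetrics, error) {
	return nil, fmt.Errorf("no backend in this sketch")
}

func (realPodRuntimeInspector) GetPodModels(string) ([]string, error) {
	return nil, fmt.Errorf("no backend in this sketch")
}

// Store is a simplified, hypothetical datastore.
type Store struct {
	inspector PodRuntimeInspector
}

// Option mutates a Store during construction.
type Option func(*Store)

// WithPodRuntimeInspector swaps the inspector, typically for a fake in tests.
func WithPodRuntimeInspector(i PodRuntimeInspector) Option {
	return func(s *Store) { s.inspector = i }
}

// New still works with zero arguments, so existing callers are unaffected.
func New(opts ...Option) *Store {
	s := &Store{inspector: realPodRuntimeInspector{}}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

// fakePodRuntimeInspector returns canned data and counts metric calls.
type fakePodRuntimeInspector struct {
	metrics      *PodMetrics
	models       []string
	metricsCalls int
}

func (f *fakePodRuntimeInspector) GetPodMetrics(string) (*PodMetrics, error) {
	f.metricsCalls++
	return f.metrics, nil
}

func (f *fakePodRuntimeInspector) GetPodModels(string) ([]string, error) {
	return f.models, nil
}

func main() {
	fake := &fakePodRuntimeInspector{metrics: &PodMetrics{GPUCacheUsage: 0.5}}
	s := New(WithPodRuntimeInspector(fake))
	m, err := s.inspector.GetPodMetrics("pod-a")
	fmt.Println(m.GPUCacheUsage, err, fake.metricsCalls)
}
```

Because the fake is an ordinary value passed through an option, nothing is rewritten in the test binary at run time, which is the property the gomonkey-based tests lacked.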

Which issue(s) this PR fixes:
Fixes #872

Special notes for your reviewer:

  • No production behavior changes - default implementations use real backends
  • New() now accepts variadic ...Option but remains backward compatible
  • All test assertions unchanged, only mock mechanism replaced

Copilot AI left a comment

Pull request overview

This PR removes gomonkey runtime patching from pkg/kthena-router tests, which was causing test instability on Go 1.24.1 and darwin/arm64 environments. The fix introduces explicit dependency injection instead of runtime binary patching.

Changes:

  • Added PodRuntimeInspector interface in datastore/store.go for pod metrics and model retrieval with a default implementation delegating to backend functions
  • Introduced functional dependency fields to Router struct for request building, streaming detection, and proxy operations, with a routerDeps struct for parameter passing
  • Refactored tests to use dependency injection with fake implementations instead of gomonkey patches
  • Updated datastore.New() to accept variadic Option parameters for dependency injection while maintaining backward compatibility

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| pkg/kthena-router/router/router.go | Added buildDecodeRequest, isStreaming, and proxyRequest functional fields to Router, split NewRouter into NewRouter (public) and newRouterWithDeps (internal), updated methods to use injected dependencies |
| pkg/kthena-router/datastore/store.go | Added PodRuntimeInspector interface, realPodRuntimeInspector implementation, Option pattern support, and updated pod metrics/models calls to use injected inspector |
| pkg/kthena-router/router/router_test.go | Removed gomonkey imports and patches, added fakeScheduler, created test helpers (mustLoadTestRouterConfig, newTestRouterWithDeps), refactored TestProxyModelEndpoint to use dependency injection, updated setupTestRouter to accept *testing.T |
| pkg/kthena-router/datastore/store_test.go | Removed gomonkey imports and patches, added fakePodRuntimeInspector struct with call tracking, created newStore helper supporting optional inspector parameter, replaced gomonkey patches with inspector callbacks |
| pkg/kthena-router/datastore/ordering_test.go | Removed gomonkey imports and patches, created newStoreWithMockBackend helper using WithPodRuntimeInspector option, refactored all tests to use the new helper |
| pkg/kthena-router/controller/modelserver_controller_test.go | Removed gomonkey imports and patches, added local fakePodRuntimeInspector, created newStoreWithMockBackend helper, updated all test functions to use dependency injection |

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the codebase to improve testability by replacing gomonkey-based mocking with dependency injection and interfaces. It introduces the PodRuntimeInspector interface in the datastore and a routerDeps structure for the router, allowing for cleaner unit tests. Feedback was provided regarding a potential issue in updatePodMetrics where a nil return from the metrics inspector could inadvertently reset pod metrics to zero, potentially impacting scheduling decisions.
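
The nil-return concern raised in this review can be illustrated with a small sketch. Only the name updatePodMetrics comes from the PR; PodMetrics, PodInfo, metricsFunc and the single WaitingRequests field are hypothetical stand-ins. The guard shown is the behavior a later commit in this PR describes: when the inspector returns nil, keep the last known values rather than overwriting them.

```go
package main

import "fmt"

// PodMetrics is a hypothetical metrics snapshot.
type PodMetrics struct {
	WaitingRequests float64
}

// PodInfo is a hypothetical per-pod record kept by the datastore.
type PodInfo struct {
	Name            string
	WaitingRequests float64
}

// metricsFunc stands in for the inspector's metrics call. A nil result
// models an unsupported inference engine or a transient error.
type metricsFunc func(pod string) *PodMetrics

// updatePodMetrics reports whether the pod's metrics were refreshed.
func updatePodMetrics(p *PodInfo, get metricsFunc) bool {
	m := get(p.Name)
	if m == nil {
		// Skip the update: writing zero values here would make the pod
		// look idle and could skew scheduling decisions.
		return false
	}
	p.WaitingRequests = m.WaitingRequests
	return true
}

func main() {
	p := &PodInfo{Name: "pod-a", WaitingRequests: 7}

	updated := updatePodMetrics(p, func(string) *PodMetrics { return nil })
	fmt.Println(updated, p.WaitingRequests)

	updated = updatePodMetrics(p, func(string) *PodMetrics {
		return &PodMetrics{WaitingRequests: 2}
	})
	fmt.Println(updated, p.WaitingRequests)
}
```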

@xrwang8 xrwang8 force-pushed the fix/kthena-router-test-stability branch from 494ba42 to afb55cc on April 9, 2026 05:27
Copilot AI review requested due to automatic review settings April 9, 2026 05:27

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.


Copilot AI review requested due to automatic review settings April 10, 2026 07:17
@volcano-sh-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hzxuzhonghu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@xrwang8 xrwang8 force-pushed the fix/kthena-router-test-stability branch from 65f09ae to 99cf773 on April 10, 2026 07:43
xrwang8 added 5 commits April 10, 2026 17:32
Add explicit datastore and router seams so kthena-router tests can use fakes instead of runtime monkey patching.

Rewrite datastore, router, and controller tests to inject dependencies directly while keeping default production behavior unchanged.

Signed-off-by: xrwang8 <[email protected]>

Run gofmt to fix field alignment in fakePodRuntimeInspector struct.

Signed-off-by: xrwang8 <[email protected]>

Address code review feedback: if GetPodMetrics returns nil (unsupported
inference engine or transient error), skip updating metrics instead of
incorrectly resetting them to zero.

Signed-off-by: xrwang8 <[email protected]>

This commit addresses test failures on macOS and Go 1.24/ARM64 by removing
gomonkey and hacky dependency injection.

Changes to model-serving-controller:
- Add PodGroupManager interface for mock implementation
- Add test hooks (function fields) to ModelServingController:
  - enqueueModelServingFunc
  - enqueueModelServingAfterFunc
  - getModelServingAndResourceDetailsFunc
  - shouldSkipHandlingFunc
  - handleDeletionInProgressFunc
  - deleteRoleFunc
- Create fakePodGroupManager for tests
- Update tests to use dependency injection instead of gomonkey
- Remove gomonkey from go.mod

Changes to kthena-router:
- Revert router.go to match upstream community version
- Remove dependency injection fields (buildDecodeRequest, isStreaming, proxyRequest)
- Use mock HTTP server (httptest.NewServer) for testing

Signed-off-by: xrwang8 <[email protected]>

Remove license files for gomonkey and procfs dependencies
that were removed from the project.

Signed-off-by: xrwang8 <[email protected]>

xrwang8 commented Apr 13, 2026

@hzxuzhonghu PTAL

Development

Successfully merging this pull request may close these issues.

pkg/kthena-router tests are unstable on Go 1.24.1 darwin/arm64 due to gomonkey-based patching

4 participants