Skip to content

integrate management hardening and executor usage fixes#2234

Open
shenshuoyaoyouguang wants to merge 15 commits intorouter-for-me:mainfrom
shenshuoyaoyouguang:merge-pr-1926-1923
Open

integrate management hardening and executor usage fixes#2234
shenshuoyaoyouguang wants to merge 15 commits intorouter-for-me:mainfrom
shenshuoyaoyouguang:merge-pr-1926-1923

Conversation

@shenshuoyaoyouguang
Copy link
Contributor

@shenshuoyaoyouguang shenshuoyaoyouguang commented Mar 19, 2026

Summary

Merge notes

Test plan

  • go test ./internal/runtime/executor
  • go test ./internal/api/handlers/management ./internal/api ./internal/config
  • go test ./test -run "TestDeleteAmpModelMappings_(EmptyBody|InvalidJSON|MissingValue|NullValue|EmptyArrayClearsAll|Specific|VerifyState|NonExistent)"
  • go test ./internal/api/handlers/management ./internal/api ./internal/config ./internal/runtime/executor ./test

🤖 Generated with Claude Code

chujian and others added 12 commits March 7, 2026 16:54
…d state access

- remove request-wide StateMiddleware locking from management routes
- add runtime snapshots and applyConfigMutation for config writes
- migrate management config/auth/log/usage paths to short-lived state access
- fix oversized logs-max-total-size-mb acceptance in PUT /config.yaml
- add regression coverage for lock-free config updates and oauth session registration
Finish merging the management hardening changes on top of PR router-for-me#1926 and prevent malformed amp mapping delete requests from clearing all mappings.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@shenshuoyaoyouguang shenshuoyaoyouguang changed the title merge management hardening and executor usage fixes integrate management hardening and executor usage fixes Mar 19, 2026
@shenshuoyaoyouguang
Copy link
Contributor Author

@codex review
@gemini review

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates several critical improvements to the system's stability and functionality. It refactors the management API to handle configuration state more robustly, ensuring data consistency and preventing concurrency issues. Additionally, it enhances the accuracy and reliability of usage reporting across various model executors, particularly for streaming responses. A specific endpoint for managing model mappings has also been made more resilient to malformed requests, improving overall API robustness.

Highlights

  • Management Hardening & Configuration Limits: Merged changes from PR fix(management): harden state updates and config limits #1923, introducing a new concurrency-safe state management mechanism for the management API handlers. This includes using snapshots for reading configuration and atomic mutations for writing, preventing race conditions and ensuring consistency. Configuration limits, such as logs-max-total-size-mb, are now enforced.
  • Executor Usage & include_usage Fixes: Merged changes from PR fix(executor): track usage consistently and harden openai compat streaming #1926, enhancing usage reporting for streaming executors (Claude, Gemini, Kimi, Qwen) by ensuring a usage record is published even if no explicit usage chunk is received. For OpenAI-compatible executors, stream_options.include_usage is now automatically injected to ensure usage data is captured, with a retry mechanism if the upstream API rejects this option.
  • Robust DELETE /v0/management/ampcode/model-mappings: Implemented a follow-up fix to prevent malformed DELETE /v0/management/ampcode/model-mappings requests from inadvertently clearing all model mappings. Requests now require an explicit {"value":[]} to clear all mappings, and invalid JSON or missing values are properly rejected.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist
Copy link
Contributor

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e851cba048

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@gemini-code-assist
Copy link
Contributor

Warning

Gemini is experiencing higher than usual traffic and was unable to create the review. Please try again in a few hours by commenting /gemini review.

Copy link

@xkonjin xkonjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Large PR covering three concerns: (1) management handler hardening via runtimeSnapshot(), (2) executor usage tracking fixes, and (3) defensive DELETE endpoint for model mappings.

Positives

  • Replacing direct h.authManager / h.cfg field access with runtimeSnapshot() is a significant reliability improvement — eliminates data races during config reloads
  • The findAuthByName extraction reduces duplicated lookup logic across 4+ handlers
  • DELETE model-mappings hardening is excellent: empty body, null value, invalid JSON, and missing value all properly rejected. Only {"value":[]} clears mappings.
  • ensurePublished on Qwen executor stream close prevents usage under-reporting
  • Stream usage parser now handles both prompt_tokens/input_tokens naming conventions

Issues to consider

1. tokenStoreWithBaseDir has a TOCTOU race

store := snapshot.tokenStore
if store == nil {
    store = sdkAuth.GetTokenStore()
    h.stateMu.Lock()
    if h.tokenStore == nil {
        h.tokenStore = store
    }
    h.stateMu.Unlock()
}

The snapshot read and the subsequent mutex-guarded write are not atomic. Two goroutines could both read snapshot.tokenStore == nil, both call sdkAuth.GetTokenStore(), and the second one's store gets discarded. If GetTokenStore() is idempotent and returns a singleton, this is fine. If it creates a new instance each time, you could end up with a discarded store that already has state. Worth verifying.

2. managementCallbackURL dereferences h before nil check

snapshot, err := h.runtimeSnapshot()
if h == nil || err != nil || snapshot.cfg == nil || snapshot.cfg.Port <= 0 {

If h is nil, h.runtimeSnapshot() panics before the nil check runs. The original code had the same issue (h.cfg), but this is an opportunity to fix it: check h == nil first.

3. Usage parsing simplification removes intentional zero-value distinction
The old code:

if cached.Exists() {
    detail.CachedTokens = cached.Int()
}

Now becomes:

detail.CachedTokens = cached.Int()

When the field does not exist, gjson returns 0. The old code left the field at its zero value too, so functionally equivalent. But if the struct ever gains pointer fields or the zero value becomes meaningful (e.g., "0 cached tokens" vs "caching not reported"), this distinction matters. Low risk but worth noting.

4. No test for the Qwen ensurePublished addition
The new ensurePublished call on Qwen stream close is important for usage accuracy, but there is no corresponding test. Consider adding a test that verifies usage is recorded even when no explicit usage chunk appears in the stream.

Well-structured hardening PR. The snapshot migration is the right pattern. The DELETE endpoint fixes prevent a class of accidental data loss. Worth a quick check on the TOCTOU in token store before merge.

Copy link
Contributor Author

@shenshuoyaoyouguang shenshuoyaoyouguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me after the latest fixes.

Notes

  • The OAuth callback routes now fail closed instead of returning a false success page when callback persistence fails.
  • PUT /config.yaml now normalizes logs-max-total-size-mb before persistence, so the runtime config and the saved file stay consistent.
  • Added regression coverage for the Qwen streaming fallback usage path when no usage chunk is emitted.

Verified

  • go test ./internal/api
  • go test ./internal/api/handlers/management
  • go test ./internal/runtime/executor

Non-blocking follow-ups

  • state == "" on callback routes still falls through to the success-page path (pre-existing behavior).
  • PUT /config.yaml still uses io.ReadAll without a request size limit.

Approve with notes.

Fail OAuth callback routes closed when callback persistence fails, persist normalized config.yaml values so runtime and saved config stay consistent, and add regression coverage for the Qwen fallback usage path.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a20c2fcebf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Reconnect management config writes to the server runtime apply path so successful updates now mean disk state and live behavior have switched together. Also prune removed ampcode keys during YAML-preserving saves so clearing upstream API key mappings persists across reload.

Constraint: Management config writes must reuse Server.UpdateClients as the single runtime side-effect entry point
Constraint: Successful responses must only be returned after persisted config has been reloaded and applied
Rejected: Apply the mutated in-memory clone directly | would bypass reload-time normalization and persisted-state semantics
Rejected: Reimplement hot-reload side effects per endpoint | would duplicate and drift from UpdateClients behavior
Directive: Keep management config commits flowing through commitConfig/reloadCommittedConfig and Server.UpdateClients
Confidence: high
Scope-risk: moderate

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@shenshuoyaoyouguang
Copy link
Contributor Author

@gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant improvements to the management API's robustness and thread safety by refactoring state management to use a snapshot-based approach. It also integrates fixes for executor usage reporting and include_usage handling in streaming requests. A critical bug that could lead to accidental deletion of all model mappings has been fixed and is now well-tested. The code changes are extensive but consistent, and the addition of targeted tests for the new logic is commendable. My review includes a suggestion to improve the specificity of error messages in list deletion handlers for better diagnostics.

Comment on lines +134 to +136
if idx < 0 || idx >= len(*items) {
return fmt.Errorf("missing index or value")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error message "missing index or value" is a bit generic for an out-of-bounds index error. Providing a more specific message would improve diagnostics for API clients. This pattern of generic error messages for index-based deletions appears in several other handlers in this file (e.g., DeleteGeminiKey, DeleteClaudeKey).

Suggested change
if idx < 0 || idx >= len(*items) {
return fmt.Errorf("missing index or value")
}
if idx < 0 || idx >= len(*items) {
return fmt.Errorf("index %d is out of bounds for list of length %d", idx, len(*items))
}

@shenshuoyaoyouguang
Copy link
Contributor Author

@xkonjin review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb61b6f631

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +394 to +397
if err := writePendingOAuthCallbackFile(provider, state, code, errStr); err != nil {
log.Errorf("persist %s oauth callback failed: %v", provider, err)
c.Header("Content-Type", "text/html; charset=utf-8")
c.String(http.StatusInternalServerError, oauthCallbackFailureHTML)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Mark failed OAuth callbacks as errored before returning 500

If persisting the callback file fails here (for example because the auth dir is missing, read-only, or out of space), the browser gets a failure page but the session stays pending. The login goroutines started in RequestAnthropicToken/RequestCodexToken/RequestGeminiCLIToken only stop when IsOAuthSessionPending becomes false, so they keep waiting until the timeout and the UI keeps reporting an in-progress auth even though the callback already failed. Setting the session error before returning would let those flows fail immediately.

Useful? React with 👍 / 👎.


pruneMappingToGeneratedKeys(original.Content[0], generated.Content[0], "oauth-excluded-models")
pruneMappingToGeneratedKeys(original.Content[0], generated.Content[0], "oauth-model-alias")
pruneAmpCodeGeneratedKeys(original.Content[0], generated.Content[0])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve unrecognized ampcode keys when rewriting config

This new prune step removes every child under ampcode that is not present in the current AmpCode struct before merging the generated YAML back. As a result, any forward-compatible or vendor-specific ampcode setting that this binary does not know about will be silently deleted the next time a management endpoint saves the config, even if the edit was unrelated. SaveConfigPreserveComments is otherwise designed to preserve unknown keys, so this introduces data loss specifically for shared/multi-version configs that carry extra ampcode fields.

Useful? React with 👍 / 👎.

Mark OAuth sessions as failed when callback persistence breaks so pending auth flows stop waiting immediately, and narrow amp config pruning to the managed upstream-api-keys field so unknown ampcode settings survive config rewrites.

Constraint: OAuth callback failures must transition sessions out of pending without changing successful callback behavior
Constraint: YAML-preserving saves must clear removed upstream-api-keys without deleting unknown ampcode extensions
Rejected: Leave callback persistence failures as pending | keeps auth polling and UI state stuck until timeout
Rejected: Prune the full ampcode mapping | would silently drop forward-compatible or vendor-specific keys
Directive: Keep ampcode pruning targeted to explicitly managed keys unless broader schema-aware pruning is introduced
Confidence: high
Scope-risk: narrow

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a49b9ef60b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +527 to +529
if err := mutate(nextCfg); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
return false

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve 404s when config-list items are missing

applyConfigMutation now turns every mutation error into 400 Bad Request. Several list handlers in this commit use fmt.Errorf("item not found") for missing resources—for example PatchCodexKey in internal/api/handlers/management/config_lists.go used to return 404 before this refactor. As a result, a request that targets a non-existent entry is now indistinguishable from a malformed payload, which breaks clients/UI flows that rely on 404 Not Found to detect stale selections and refresh state.

Useful? React with 👍 / 👎.

Comment on lines +49 to +50
if err != nil {
managementHandlers.SetOAuthSessionError(state, strings.TrimSpace(errStr))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Don't fail the active OAuth session on wrong callback route

This helper marks the session errored for every WriteOAuthCallbackFileForPendingSession failure, including provider mismatches. If an Anthropic login is pending and the browser lands on /codex/callback?state=... (for example via a stale redirect URL or manual retry), WriteOAuthCallbackFileForPendingSession returns errOAuthSessionNotPending, then SetOAuthSessionError flips the real session out of pending. The wait loops in RequestAnthropicToken/RequestCodexToken exit as soon as IsOAuthSessionPending becomes false, so the legitimate login is aborted even though the real callback never failed.

Useful? React with 👍 / 👎.

@shenshuoyaoyouguang
Copy link
Contributor Author

@gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a major refactoring of the management API handlers to improve thread safety and configuration management. The new approach uses state snapshots and a transactional configuration mutation workflow, which is a significant improvement in robustness. The PR also includes fixes for executor usage reporting and enhances compatibility with OpenAI-compatible providers by adding intelligent retry logic for include_usage stream options. Additionally, it hardens the DELETE /v0/management/ampcode/model-mappings endpoint to prevent accidental data loss from malformed requests. The changes are well-implemented and accompanied by thorough tests. I have one minor suggestion to improve API error messages.

Value *[]string `json:"value"`
}
if err := c.ShouldBindJSON(&body); err != nil || body.Value == nil {
c.JSON(400, gin.H{"error": "invalid body"})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error message "invalid body" is a bit generic. It could be more specific to help API consumers debug their requests, especially since this check handles both malformed JSON and a missing/null value field.

Suggested change
c.JSON(400, gin.H{"error": "invalid body"})
c.JSON(400, gin.H{"error": "invalid body: 'value' field is missing, null, or malformed"})

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants