fix: prevent nil pointer crash in OpenAI SSE tool call accumulation #182
Open
duhd-vnpay wants to merge 18 commits into nextlevelbuilder:main from
…ing, photo handling
- Fix zaloBotInfo to use account_name/display_name (not name)
- Add Label() method for bot display name resolution
- Handle 3 response formats in getUpdates: array, single object, wrapped
- Add photo_url field to zaloMessage for Zalo CDN image URLs
- Add display_name/is_bot to zaloFrom, chat_type to zaloChat
- Use PhotoURL with fallback to Photo in handleImageMessage
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…ssing
Zalo CDN URLs are auth-restricted and expire quickly, causing read_image tool failures. Now downloads photos to temp files (like Telegram channel) so the agent pipeline can base64-encode and process them normally. Falls back to passing the URL directly if download fails.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
… instance updates
When encryption key is empty, credentials stayed as map[string]any from JSON unmarshal, causing pgx driver to fail encoding into bytea. Now credentials are always marshaled to []byte regardless of encryption.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Add Party Mode to GoClaw: structured multi-persona AI discussions with Standard (single LLM call), Deep (parallel thinking + cross-talk), and Token-Ring (sequential turns) modes.
Backend: PartyStore + PG implementation, party engine with parallel goroutines, 7 RPC methods (party.start/round/question/add_context/summary/exit/list), 10 WebSocket events, migration 000014.
Frontend: React dashboard page with session list, chat view, persona sidebar, mode controls, start dialog with 6 team presets, i18n (en/vi/zh).
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Previously, sanitizeHistory() only cleaned the in-memory copy for each LLM request but never persisted the fix, causing the same "dropping orphaned tool message" WARN to repeat on every request.
Changes:
- sanitizeHistory() now returns drop count alongside cleaned messages
- When orphans are detected, cleaned history is persisted back to the session store via new SetHistory() method, then saved to DB
- Per-message WARN logs downgraded to DEBUG (cleanup is logged once at INFO level with total count)
- Added SetHistory() to SessionStore interface + both implementations
Co-Authored-By: Claude Opus 4.6 <[email protected]>
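A minimal sketch of the "return the drop count, then persist" idea from the commit above. The `Message` type, its fields, and the orphan rule (a tool message whose call ID was never issued by an earlier assistant message) are assumptions for illustration, not the project's actual types or logic.

```go
package main

import "fmt"

// Message is a hypothetical, simplified chat message.
type Message struct {
	Role       string   // "user", "assistant", or "tool"
	ToolCallID string   // set on tool messages
	ToolCalls  []string // IDs of tool calls issued by an assistant message
}

// sanitizeHistory drops tool messages whose ToolCallID was never issued by an
// earlier assistant message and reports how many were dropped.
func sanitizeHistory(history []Message) ([]Message, int) {
	issued := make(map[string]bool)
	cleaned := make([]Message, 0, len(history))
	dropped := 0
	for _, m := range history {
		if m.Role == "assistant" {
			for _, id := range m.ToolCalls {
				issued[id] = true
			}
		}
		if m.Role == "tool" && !issued[m.ToolCallID] {
			dropped++
			continue
		}
		cleaned = append(cleaned, m)
	}
	return cleaned, dropped
}

func main() {
	history := []Message{
		{Role: "user"},
		{Role: "tool", ToolCallID: "call_9"}, // orphan: no assistant issued call_9
		{Role: "assistant", ToolCalls: []string{"call_1"}},
		{Role: "tool", ToolCallID: "call_1"},
	}
	cleaned, dropped := sanitizeHistory(history)
	if dropped > 0 {
		// A caller would persist `cleaned` here (SetHistory, then save to DB)
		// so the same orphan is not rediscovered on the next request.
		fmt.Printf("dropped %d orphaned tool message(s), %d remain\n", dropped, len(cleaned))
	}
}
```

Returning the count lets the caller decide whether a write-back is needed at all, so a clean history costs no extra store call.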
When a delegated agent (e.g. ui-ux-design-agent) spawns subagents, the
announce session key uses the format delegate:{uuid8}:{agentKey}:{id}.
The scheduler's RunFunc only handled agent:{agentId}:{rest} format,
falling back to the hardcoded "default" agent — which doesn't exist in
managed-mode deployments where the default agent has a custom key.
Add delegate: prefix parsing to extract the target agent key from
position 2 of the session key parts.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
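The session-key parsing described in the commit above might look roughly like the sketch below. The function name `agentKeyFromSession` and the explicit `fallback` parameter are hypothetical; only the two key formats and the "position 2" rule come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// agentKeyFromSession extracts the target agent key from a session key.
// fallback is returned when the key matches neither known format.
func agentKeyFromSession(sessionKey, fallback string) string {
	parts := strings.Split(sessionKey, ":")
	switch {
	case len(parts) >= 3 && parts[0] == "delegate" && parts[2] != "":
		return parts[2] // delegate:{uuid8}:{agentKey}:{id}
	case len(parts) >= 2 && parts[0] == "agent" && parts[1] != "":
		return parts[1] // agent:{agentId}:{rest}
	default:
		return fallback
	}
}

func main() {
	fmt.Println(agentKeyFromSession("delegate:1a2b3c4d:ui-ux-design-agent:42", "default"))
	fmt.Println(agentKeyFromSession("agent:support-bot:telegram:123", "default"))
	fmt.Println(agentKeyFromSession("cron:nightly", "default"))
}
```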
- Add sender_id and channel to team task metadata for audit trail
- Remove assistant prefill in team task reminder (thinking models reject it)
- Add unit tests for team access control and sender_id tracking
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…, mobile UX, budget, traces)
Resolved 13 file conflicts:
- zalo.go: kept local struct tags + upstream's n==0 safety check
- channel_instances.go: used upstream's credential merging (supersedes local bytea fix)
- factory.go, stores.go: merged both Party + Contacts/Activity/Snapshots stores
- loop_history.go: kept upstream skill inlining constants + local session persistence
- session_store.go, sessions/manager.go, sessions_ops.go: deduplicated SetHistory()
- sidebar.tsx, routes.tsx: added Party to upstream's restructured sidebar/routes
- protocol.ts, i18n/index.ts, sidebar.json (×3): merged Party + upstream events/translations
- version.go: bumped to schema version 18
Migration collision fix: renamed 000014_party_sessions → 000018_party_sessions
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add nullish coalescing for PERSONA_COLORS array index
- Remove unused destructured `round` variable from PartyPage
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…ting
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…otocol
Backend:
- gateway_providers: read default_model from provider settings JSONB
- party.go: getEngine() prefers providers with DefaultModel, alphabetical fallback for determinism (fixes random Go map iteration)
- party.go: add slog.Error for round failures (was silently swallowed)
Frontend:
- use-party.ts: align all RPC params/events with snake_case wire format, add transformSession/mapStatus/selectSession helpers
- party-start-dialog.tsx: use actual DB persona keys (morpheus-persona etc.)
- party-page.tsx: use selectSession() for proper state hydration
- connection-status.tsx: fix status text alignment
Co-Authored-By: Claude Opus 4.6 <[email protected]>
transformSession() was dropping history/summary fields from backend response, and selectSession() reset messages to empty array. Old sessions appeared blank when clicked in sidebar.
- transformSession: preserve _history and _summary from backend
- hydrateMessages: new helper to convert RoundResult[] + SummaryResult into PartyMessage[] (round headers, persona messages, summary)
- selectSession: call hydrateMessages() instead of setMessages([])
Co-Authored-By: Claude Opus 4.6 <[email protected]>
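One way the deterministic provider selection mentioned for getEngine() could work is sketched here. The `Provider` type and `pickProvider` function are hypothetical, and scanning in alphabetical order even among providers that have a DefaultModel is an assumption; the commit only states the preference and the alphabetical fallback.

```go
package main

import (
	"fmt"
	"sort"
)

// Provider is a hypothetical, trimmed-down provider record.
type Provider struct {
	DefaultModel string
}

// pickProvider returns the alphabetically first provider that has a
// DefaultModel, or else the alphabetically first provider overall.
// Sorting the names removes the dependence on Go's random map order.
func pickProvider(providers map[string]Provider) (string, bool) {
	names := make([]string, 0, len(providers))
	for name := range providers {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if providers[name].DefaultModel != "" {
			return name, true
		}
	}
	if len(names) > 0 {
		return names[0], true
	}
	return "", false
}

func main() {
	providers := map[string]Provider{
		"zeta":  {DefaultModel: "model-a"},
		"alpha": {},
		"mid":   {},
	}
	name, ok := pickProvider(providers)
	fmt.Println(name, ok)
}
```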
Root cause: `accumulators` map iterated with sequential `for i := 0; i < len(map)`
but SSE tool_call indices can be non-contiguous (e.g. {0, 2}), causing nil dereference
on `accumulators[i]` when key `i` doesn't exist.
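The root cause above can be illustrated with a small sketch that visits only the keys that exist, in ascending order. The `toolCallAcc` type and `orderedToolCalls` helper are hypothetical stand-ins; only the `accumulators` map, the `rawArgs` field, and the `{0, 2}` example come from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// toolCallAcc is a hypothetical stand-in for the per-tool-call accumulator.
type toolCallAcc struct {
	name    string
	rawArgs string
}

// orderedToolCalls returns accumulators in ascending index order,
// visiting only keys that actually exist in the map.
func orderedToolCalls(accumulators map[int]*toolCallAcc) []*toolCallAcc {
	keys := make([]int, 0, len(accumulators))
	for k := range accumulators {
		keys = append(keys, k)
	}
	sort.Ints(keys)

	out := make([]*toolCallAcc, 0, len(keys))
	for _, k := range keys {
		out = append(out, accumulators[k])
	}
	return out
}

func main() {
	accs := map[int]*toolCallAcc{
		0: {name: "read_file", rawArgs: `{"path":"a.txt"}`},
		2: {name: "search", rawArgs: `{"q":"go"}`},
	}
	// A 0..len-1 loop would read accs[1], which is nil, and panic on acc.rawArgs.
	for _, acc := range orderedToolCalls(accs) {
		fmt.Println(acc.name, acc.rawArgs)
	}
}
```

Sorting the keys keeps the tool calls in the order the model emitted them, which a plain `for range` over the map would not guarantee.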
Fixes:
1. openai.go: iterate map by sorted keys instead of sequential 0..len-1
2. lanes.go: add defer recover() in scheduler goroutine to prevent panics from
crashing the entire process — logs error and returns semaphore token
3. tracing: add SweepOrphanTraces() to mark stuck running traces as error on
gateway startup (running > 1h = orphan from previous crash)
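Fix 2 above (recover in the scheduler goroutine while still returning the semaphore token) can be sketched as follows. The `lane` type, `newLane`, and `submit` are hypothetical names; the actual structure of lanes.go is not shown in this PR page.

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// lane is a hypothetical scheduler lane limited by a semaphore channel.
type lane struct {
	sem chan struct{}
	wg  sync.WaitGroup
}

func newLane(size int) *lane { return &lane{sem: make(chan struct{}, size)} }

// submit runs fn in a goroutine. A panic inside fn is logged and the
// semaphore token is still returned, so the lane keeps accepting work.
func (l *lane) submit(fn func()) {
	l.sem <- struct{}{} // acquire a token (blocks while the lane is full)
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		defer func() { <-l.sem }() // always release the token
		defer func() {
			if r := recover(); r != nil {
				log.Printf("lane task panicked: %v", r)
			}
		}()
		fn()
	}()
}

func main() {
	l := newLane(1)
	l.submit(func() { panic("boom") })
	// With a lane of size 1, this call would block forever if the token leaked.
	l.submit(func() { fmt.Println("still running") })
	l.wg.Wait()
	fmt.Println("tokens in use:", len(l.sem))
}
```

Deferred calls run in last-in, first-out order, so the panic is recovered first and the token release and `wg.Done()` then run as on a normal return.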
Test results (18 tests):
- providers: 9/9 PASS (contiguous, non-contiguous, large gap, high index,
thought_signature, text-only, empty, HTTP error, cancelled context)
- scheduler: 5/5 PASS (no panic, panic recovery, multiple panics, cancelled
context, stats after panic)
- store/pg: 4/4 PASS (sweeps old running, ignores recent, ignores completed,
no orphans)
Co-Authored-By: Claude Opus 4.6 <[email protected]>
… loops
When LLM hits max_tokens, tool call arguments may be incomplete/malformed JSON.
- Add truncation guard: skip tool execution when finish_reason=length, ask LLM to retry smaller
- Wire per-agent max_tokens from other_config JSONB (default 8192)
- Log warning on JSON parse failure for non-empty tool call arguments (truncation indicator)
Co-Authored-By: Claude Opus 4.6 <[email protected]>
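A rough sketch of the truncation guard described in the commit above. The helper names `skipToolExecution` and `argsLookTruncated` and the `toolCall` type are hypothetical; the `finish_reason` value `length` and the "warn on non-empty arguments that fail to parse" rule are taken from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall is a hypothetical, minimal tool call as returned by the LLM.
type toolCall struct {
	Name    string
	RawArgs string
}

// skipToolExecution reports whether tool calls should be skipped because the
// response was cut off at the token limit.
func skipToolExecution(finishReason string, numCalls int) bool {
	return finishReason == "length" && numCalls > 0
}

// argsLookTruncated reports whether non-empty arguments fail to parse as JSON,
// which is a hint that the output was truncated mid-argument.
func argsLookTruncated(rawArgs string) bool {
	return rawArgs != "" && !json.Valid([]byte(rawArgs))
}

func main() {
	calls := []toolCall{{Name: "write_file", RawArgs: `{"path":"a.txt","content":"hel`}}

	if skipToolExecution("length", len(calls)) {
		fmt.Println("response truncated: skipping tools and asking for a smaller retry")
	}
	for _, c := range calls {
		if argsLookTruncated(c.RawArgs) {
			fmt.Printf("warning: tool %q has malformed arguments\n", c.Name)
		}
	}
}
```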
Summary
- `openai.go`: `accumulators` map (keyed by SSE tool_call index) was iterated with `for i := 0; i < len(map); i++`, assuming contiguous keys 0,1,2... SSE indices from extended thinking models can be non-contiguous (e.g. `{0, 2}`), causing `accumulators[i]` → nil → panic on `acc.rawArgs`
- `lanes.go`: add `defer recover()` in scheduler lane goroutine so panics log error + return semaphore token instead of crashing the process
- `tracing.go` + `gateway.go`: add `SweepOrphanTraces()` to mark stuck `running` traces as `error` on startup (traces older than 1h with status `running` = orphan from crash)
Changed files
- `internal/providers/openai.go`
- `internal/scheduler/lanes.go`: `recover()` in goroutine defer
- `internal/store/tracing_store.go`: `SweepOrphanTraces` to interface
- `internal/store/pg/tracing.go`: `SweepOrphanTraces` (UPDATE running→error)
- `cmd/gateway.go`
Test results (18 tests, all PASS)
providers (9/9 PASS):
- `ContiguousToolCalls`: indices 0,1 (happy path)
- `NonContiguousToolCallIndices`: indices 0,2 skip 1 (the crash scenario)
- `LargeGapToolCallIndices`: indices 0,5,10
- `SingleToolCallAtHighIndex`: single call at index 3
- `ToolCallWithThoughtSignature`: metadata preserved
- `TextOnly` / `EmptyToolCalls`: no tool calls
- `HTTPError`: 429 → proper HTTPError
- `CancelledContext`: context cancel → error
scheduler (5/5 PASS):
- `NoPanic`: normal operation
- `PanicRecovery`: panic → lane recovers, accepts new work
- `MultiplePanics`: 3 panics → all semaphore tokens returned
- `CancelledContext`: cancelled ctx with full lane → error
- `StatsAfterPanic`: active counter decremented after panic
store/pg (4/4 PASS, integration with PostgreSQL):
- `SweepsOldRunning`: running trace 2h old → swept
- `IgnoresRecentRunning`: running trace 5min old → NOT swept
- `IgnoresCompleted`: completed trace → NOT swept
- `NoOrphans`: no running traces → swept=0
🤖 Generated with Claude Code