feat(openai): add per-capability default model config via Builder #1925

Open
Buktal wants to merge 3 commits into agentscope-ai:main from Buktal:feat/multimodal-tool-builder

Conversation

@Buktal Buktal commented Jun 26, 2026

Contributor

AgentScope-Java Version

2.0.0-SNAPSHOT

Description

1. Per-capability default model via Builder

OpenAIMultiModalTool previously had a single defaultModelName field that only applied to openai_image_to_text. The other three tools had their default models hardcoded.

This adds a Builder matching OpenAIChatModel.Builder conventions, exposing per-capability defaults:

  • defaultVisionModel — image-to-text (falls back to gpt-4o)
  • defaultImageGenModel — text-to-image (falls back to dall-e-3)
  • defaultTtsModel — text-to-audio (falls back to tts-1)
  • defaultSttModel — audio-to-text (falls back to whisper-1)
  • baseUrl — custom endpoint (optional)

When a default is not set via the Builder, the existing hardcoded value is used, so current behavior is unchanged.
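The fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `MultiModalDefaultsSketch` is a hypothetical stand-in class, the setter names simply mirror the option names listed above, and treating a blank string as "not set" is an assumption.

```java
/** Hypothetical stand-in for illustration only; not the real OpenAIMultiModalTool. */
final class MultiModalDefaultsSketch {
    final String visionModel;
    final String imageGenModel;
    final String ttsModel;
    final String sttModel;
    final String baseUrl; // null: no custom endpoint was configured

    private MultiModalDefaultsSketch(Builder b) {
        // Fallback values are the ones listed in the PR description.
        this.visionModel = orDefault(b.defaultVisionModel, "gpt-4o");
        this.imageGenModel = orDefault(b.defaultImageGenModel, "dall-e-3");
        this.ttsModel = orDefault(b.defaultTtsModel, "tts-1");
        this.sttModel = orDefault(b.defaultSttModel, "whisper-1");
        this.baseUrl = b.baseUrl;
    }

    // Assumption: a null or blank value counts as "not set".
    private static String orDefault(String value, String fallback) {
        return (value == null || value.isBlank()) ? fallback : value;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private String defaultVisionModel;
        private String defaultImageGenModel;
        private String defaultTtsModel;
        private String defaultSttModel;
        private String baseUrl;

        Builder defaultVisionModel(String m) { this.defaultVisionModel = m; return this; }
        Builder defaultImageGenModel(String m) { this.defaultImageGenModel = m; return this; }
        Builder defaultTtsModel(String m) { this.defaultTtsModel = m; return this; }
        Builder defaultSttModel(String m) { this.defaultSttModel = m; return this; }
        Builder baseUrl(String url) { this.baseUrl = url; return this; }

        MultiModalDefaultsSketch build() {
            return new MultiModalDefaultsSketch(this);
        }
    }
}
```

With this sketch, `MultiModalDefaultsSketch.builder().defaultTtsModel("my-tts-model").build()` overrides only the text-to-audio default; the other three capabilities keep their fallback values and `baseUrl` stays unset.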

2. Flaky test fix

SubAgentToolTimeoutRetryIntegrationTest.oldAgentIsStopped was flaky: when the timeout cancelled the subscription, boundedElastic interrupted the sleeping thread via Future.cancel(true). The old code wrapped both Thread.sleep and Files.writeString in one try-catch, so the interrupt silently skipped the write, making the assertion fail. Fixed by giving the sleep its own try-catch, so the write always executes even if the sleep is interrupted.
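The shape of that fix can be sketched as below. This is an illustration under assumptions, not the test's actual code: the class name, method names, and the marker file are hypothetical. The sketch also restores the interrupt flag only after the write, because writing through an interruptible file channel while the flag is set can fail with ClosedByInterruptException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: the sleep has its own try-catch, so the write always runs. */
final class InterruptSafeWriteSketch {

    static void sleepThenWrite(Path marker, long sleepMillis) throws IOException {
        boolean interrupted = false;
        try {
            Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
            // Catching the exception clears the thread's interrupt flag.
            interrupted = true;
        }
        try {
            // Runs whether or not the sleep was cut short.
            Files.writeString(marker, "done");
        } finally {
            if (interrupted) {
                // Restore the flag only after the write has been attempted.
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Interrupts a worker during its sleep and returns what it wrote. */
    static String runInterruptedDemo() {
        try {
            Path marker = Files.createTempFile("marker", ".txt");
            Thread worker = new Thread(() -> {
                try {
                    sleepThenWrite(marker, 10_000);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            worker.start();
            worker.interrupt(); // stands in for Future.cancel(true)
            worker.join();
            return Files.readString(marker);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Had the sleep and the write shared one try block, the InterruptedException would have jumped straight to the catch and left the marker file empty, which is the failure mode described above.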

Related:

Checklist

  • Code has been formatted with mvn spotless:apply
  • All tests are passing (mvn test)
  • Javadoc comments are complete and follow project conventions
  • Related documentation has been updated
  • Code is ready for review

Replace the single defaultModelName field with per-capability defaults
(defaultVisionModel, defaultImageGenModel, defaultTtsModel, defaultSttModel)
via a new Builder, matching OpenAIChatModel.Builder conventions.

Existing constructors remain for backward compatibility.
Default values unchanged: gpt-4o, dall-e-3, tts-1, whisper-1.

Closes agentscope-ai#1924
@Buktal Buktal requested a review from a team June 26, 2026 02:04
Buktal added 2 commits June 26, 2026 10:11
When the outer timeout cancels the subscription, boundedElastic interrupts
the sleeping thread via Future.cancel(true). The old code wrapped both the
sleep and the file write in one try-catch, so the interrupt silently skipped
the write. Split them so the write always executes even if sleep is interrupted.

Closes agentscope-ai#1924 (part of PR agentscope-ai#1925)
codecov Bot commented Jun 26, 2026

Codecov Report

❌ Patch coverage is 76.59574% with 11 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...nsions/model/openai/tool/OpenAIMultiModalTool.java | 76.59% | 9 Missing and 2 partials ⚠️ |

@itxaiohanglover

Contributor

Nice work on the Builder — backward compat is well handled with the old constructor delegating to the new private one. One thing I noticed: the builder tests only do assertNotNull(tool), so it'd be good to also verify the configured models actually get used in API calls (like the existing testImageToText_usesCustomDefaultModel does). Also, the InterruptedException fix in SubAgentToolTimeoutRetryIntegrationTest seems unrelated to this PR — might be worth a separate commit.

Development

Successfully merging this pull request may close these issues.

[Feature]: Support per-capability default model configuration in OpenAIMultiModalTool
