feat(design): multi-provider support — GPT Image 1.5 + Gemini #946

Open
lubos-buracinsky wants to merge 2 commits into garrytan:main from
lubos-buracinsky:feat/multi-provider-design

@lubos-buracinsky

Summary

  • Upgrade OpenAI image generation from gpt-4o Responses API to GPT Image 1.5 (gpt-image-1) via Images API. Better text rendering for UI mockups.
  • Add Gemini provider (Nano Banana 2 / Gemini 3.1 Flash Image) as alternative. Faster, cheaper.
  • Auto-detect provider from available API keys (OPENAI_API_KEY or GEMINI_API_KEY).
  • New config: ~/.gstack/design.json with explicit provider selection.
  • Fully backwards compatible: existing ~/.gstack/openai.json still works.

Why

The current $D generate uses gpt-4o with the image_generation tool, which routes to DALL-E 3 under the hood. GPT Image 1.5 is a newer, dedicated image model with significantly better text rendering (ELO 1264 on text benchmarks). For UI mockups with text, buttons, and layout, this is a meaningful quality upgrade.

Gemini support opens the door for users who already have a Google API key (common in the ecosystem) but not an OpenAI key.

Changes

  • design/src/auth.ts — multi-provider config resolution (design.json > env vars > legacy openai.json)
  • design/src/generate.ts — provider routing: OpenAI Images API + Gemini generateContent API
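
A minimal sketch of the resolution order described for design/src/auth.ts (design.json > env vars > legacy openai.json). It is not the code from this PR: the field names `provider` and `api_key`, the `resolveAuth` helper, and the choice to check `OPENAI_API_KEY` before the Gemini keys are all assumptions made for illustration. The config files are passed in as already-parsed objects so the function stays pure.

```typescript
// Hypothetical shapes: the PR description does not show the real file schemas.
type Provider = "openai" | "gemini";

interface DesignConfig {
  provider?: Provider; // assumed field name in ~/.gstack/design.json
  api_key?: string;    // assumed field name
}

interface LegacyOpenAIConfig {
  api_key?: string;    // assumed field name in ~/.gstack/openai.json
}

interface ResolvedAuth {
  provider: Provider;
  apiKey: string;
  source: "design.json" | "env" | "openai.json";
}

type Env = Record<string, string | undefined>;

function resolveAuth(
  design: DesignConfig | null,
  env: Env,
  legacy: LegacyOpenAIConfig | null,
): ResolvedAuth | null {
  // 1. An explicit provider and key in design.json wins over everything else.
  if (design?.provider && design.api_key) {
    return { provider: design.provider, apiKey: design.api_key, source: "design.json" };
  }

  // 2. Environment variables. Checking OpenAI first is an assumption;
  //    the PR only says the provider is auto-detected from available keys.
  if (env.OPENAI_API_KEY) {
    return { provider: "openai", apiKey: env.OPENAI_API_KEY, source: "env" };
  }
  const geminiKey = env.GEMINI_API_KEY || env.GOOGLE_AI_API_KEY;
  if (geminiKey) {
    return { provider: "gemini", apiKey: geminiKey, source: "env" };
  }

  // 3. The legacy OpenAI-only config keeps existing setups working.
  if (legacy?.api_key) {
    return { provider: "openai", apiKey: legacy.api_key, source: "openai.json" };
  }

  // Nothing configured: the caller decides how to report the missing key.
  return null;
}
```

Returning `null` rather than throwing leaves the error message to the CLI layer, which knows which command the user ran.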

Test plan

  • Build succeeds (bun build --compile design/src/cli.ts)
  • OpenAI GPT Image 1.5 generates mockup
  • Gemini generates mockup
  • Legacy ~/.gstack/openai.json still works
  • GEMINI_API_KEY env var auto-detects Gemini provider

🤖 Generated with Claude Code

lubos-buracinsky and others added 2 commits April 9, 2026 23:15
- Upgrade OpenAI from gpt-4o Responses API to GPT Image 1.5 (gpt-image-1)
  via Images API. Better text rendering, dedicated image model.
- Add Gemini provider (Nano Banana 2 / Gemini 3.1 Flash Image) as
  alternative. Faster, cheaper, good quality.
- Auto-detect provider from available API keys.
- New config: ~/.gstack/design.json with provider selection.
- Backwards compatible: existing ~/.gstack/openai.json still works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Remove response_format (not supported by gpt-image-1)
- Use output_format: "png" (GPT Image always returns base64)
- Parse both b64_json and b64 response fields
- Add GOOGLE_AI_API_KEY as fallback for Gemini provider detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
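
The fixes in the second commit can be sketched roughly as follows. This is an illustration, not the code in design/src/generate.ts: `buildImageRequest`, `extractBase64`, and the default `size` value are hypothetical. The request body targets the OpenAI Images API (`POST /v1/images/generations`), omits `response_format`, and sets `output_format: "png"`. The `b64` field is read only because the commit message says both fields are parsed.

```typescript
// Request body for the Images API. There is deliberately no response_format
// field, since the commit notes gpt-image-1 does not support it.
interface ImageRequest {
  model: string;
  prompt: string;
  size: string;
  output_format: "png";
}

// Hypothetical helper; the default size is an assumption for illustration.
function buildImageRequest(prompt: string, size = "1024x1024"): ImageRequest {
  return { model: "gpt-image-1", prompt, size, output_format: "png" };
}

// Only the response fields this sketch reads.
interface ImageResponse {
  data?: Array<{ b64_json?: string; b64?: string }>;
}

// Return the base64 payload of the first image, accepting either field name.
function extractBase64(response: ImageResponse): string {
  const first = response.data?.[0];
  const b64 = first?.b64_json ?? first?.b64;
  if (!b64) {
    throw new Error("No base64 image data in response");
  }
  return b64;
}
```

Throwing on a missing payload surfaces an empty or malformed response at the parsing step instead of writing a zero-byte PNG later.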