chore(ai): explicit model policy — router/chat/program/summarize + preview-vs-stable #216

@JeffOtano

Context

`convex/ai/providers.ts:23-93` ships one generic primary/fallback pair per provider. Two problems:

  1. No tier-by-task policy. Currently every task hits the provider's primary model (Sonnet 4.6 / Gemini 3 Flash / GPT-5.4). Target split: cheap models for routing, classification, and summarization; mid-tier for chat; premium for planning and programming.

  2. Preview models in production. `primaryModel: "gemini-3-flash-preview"` ships a preview model as the default for our Gemini BYOK majority. One silent API change from Google breaks the whole product.
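
To make problem 2 concrete, here is a minimal sketch of a check that lists every provider whose default primary is a preview model. The `ProviderConfig` shape, the `fallbackModel` field name, the provider keys, and the `example-*` ids are assumptions for illustration; they are not the actual contents of `convex/ai/providers.ts`.

```typescript
// Hypothetical config shape: one primary/fallback pair per provider.
// Only `primaryModel` and "gemini-3-flash-preview" come from the issue text.
interface ProviderConfig {
  primaryModel: string;
  fallbackModel: string; // assumed field name
}

// Returns the provider keys whose default primary matches `*-preview`.
function findPreviewDefaults(
  configs: Record<string, ProviderConfig>,
): string[] {
  return Object.keys(configs)
    .filter((provider) => configs[provider].primaryModel.endsWith("-preview"))
    .sort();
}

// Illustrative data only; the `example-*` ids are placeholders.
const exampleConfigs: Record<string, ProviderConfig> = {
  google: {
    primaryModel: "gemini-3-flash-preview",
    fallbackModel: "example-fallback-model",
  },
  other: {
    primaryModel: "example-stable-model",
    fallbackModel: "example-fallback-model",
  },
};

// Logs an array containing only "google".
console.log(findPreviewDefaults(exampleConfigs));
```

A check like this could run in a unit test so that a preview default fails CI rather than reaching users.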

Files

  • `convex/ai/providers.ts:23-93` — current config
  • `convex/ai/coach.ts:192-249` — `buildCoachAgents` / `buildCoachAgentsForProvider` consumers
  • New (probably): `docs/ai/model-policy.md` or an ADR

Acceptance

  • Write a short policy doc: which model runs which task tier across all 4 providers
  • Tiers:
      • `router` (cheapest): Haiku 4.5 / Gemini Flash-lite / GPT-5.4-nano / OpenRouter auto
      • `chat` (mid): Sonnet 4.6 / Gemini Pro / GPT-5.4-mini
      • `programming` (premium): Opus 4.6 / Gemini Pro + high effort / GPT-5.4
      • `summarize`: same as the router tier
  • Preview-vs-stable rule: `*-preview` models MUST NOT be the default primary. They may only be enabled by opt-in, via a user setting or feature flag
  • Move `gemini-3-flash-preview` off the default path; replace with stable `gemini-3-flash` (or latest stable)
  • Wire `prepareStep` to swap models by tier. Pairs with #205 (perf(ai): adaptive thinking via prepareStep, effort=high on program_week) and #210 (perf(ai): prepareStep dynamic tool subset per agent step)
  • Document per-tier cost in the policy doc for future reference
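
The tier list and the preview-vs-stable rule above could be sketched as a small lookup for one provider. The tier names come from this issue; `TierModels`, `EXAMPLE_POLICY`, `resolveModel`, the `allowPreview` option, and every `example-*` model id are hypothetical placeholders, not the real policy or real model identifiers.

```typescript
// Tier names as listed in the acceptance criteria.
type Tier = "router" | "chat" | "programming" | "summarize";
// `summarize` has no models of its own; it reuses the router tier.
type BaseTier = Exclude<Tier, "summarize">;

interface TierModels {
  stable: string; // default for the tier; must not match *-preview
  preview?: string; // optional opt-in candidate
}

// Placeholder policy for a single provider (ids are illustrative only).
const EXAMPLE_POLICY: Record<BaseTier, TierModels> = {
  router: { stable: "example-router-model" },
  chat: { stable: "example-chat-model", preview: "example-chat-model-preview" },
  programming: { stable: "example-programming-model" },
};

function isPreview(modelId: string): boolean {
  return modelId.endsWith("-preview");
}

// Picks the model id for a tier. A preview model is returned only when the
// caller opts in (for example from a user setting or a feature flag).
function resolveModel(
  tier: Tier,
  opts: { allowPreview?: boolean } = {},
): string {
  const base: BaseTier = tier === "summarize" ? "router" : tier;
  const entry = EXAMPLE_POLICY[base];
  if (isPreview(entry.stable)) {
    // Enforces the rule that a preview model is never the default primary.
    throw new Error(`Tier "${base}" defaults to a preview model: ${entry.stable}`);
  }
  if (opts.allowPreview === true && entry.preview !== undefined) {
    return entry.preview;
  }
  return entry.stable;
}

console.log(resolveModel("summarize")); // same id as the router tier
console.log(resolveModel("chat", { allowPreview: true })); // opt-in preview id
```

A `prepareStep` hook could call something like `resolveModel` with the tier chosen for the current step and map the returned id to a provider model instance; how a step maps to a tier is left open here and belongs in the policy doc.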

References

Related: #190 (intent routing is one consumer of this policy), #205 (effort per step), #210 (prepareStep tool subset)

Labels

  • ai-agent: Work on the AI coach agent system
  • performance: Cost or latency improvement
  • phase: architecture: AI architecture reshape phase
  • priority: high: High priority work
  • reliability: Uptime, failover, circuit breakers