Changes from 6 commits
9 changes: 5 additions & 4 deletions README.md
@@ -120,8 +120,8 @@ opencode run "Hello" --model=google/antigravity-claude-opus-4-6-thinking --varia
| `antigravity-gemini-3-pro` | low, high | Gemini 3 Pro with thinking |
| `antigravity-gemini-3.1-pro` | low, high | Gemini 3.1 Pro with thinking (rollout-dependent) |
| `antigravity-gemini-3-flash` | minimal, low, medium, high | Gemini 3 Flash with thinking |
| `antigravity-claude-sonnet-4-6` | — | Claude Sonnet 4.6 |
| `antigravity-claude-opus-4-6-thinking` | low, max | Claude Opus 4.6 with extended thinking |
| `antigravity-claude-sonnet-4-6` | — | Claude Sonnet 4.6 (200k base context) |
| `antigravity-claude-opus-4-6-thinking` | low, max | Claude Opus 4.6 with extended thinking (200k base context) |

**Gemini CLI quota** (separate from Antigravity; used when `cli_first` is true or as fallback):

@@ -140,6 +140,7 @@ opencode run "Hello" --model=google/antigravity-claude-opus-4-6-thinking --varia
> - When a Gemini quota pool is exhausted, the plugin automatically falls back to the other pool.
> - Claude and image models always use Antigravity.
> Model names are automatically transformed for the target API (e.g., `antigravity-gemini-3-flash` → `gemini-3-flash-preview` for CLI).
> - Claude models on Antigravity use a 200k context limit by default. An experimental, opt-in long-context beta header can be enabled in `antigravity.json` (`claude_long_context_beta`); if the provider rejects it, requests automatically fall back to the stable 200k path.

**Using variants:**
```bash
@@ -190,12 +191,12 @@ Add this to your `~/.config/opencode/opencode.json`:
}
},
"antigravity-claude-sonnet-4-6": {
"name": "Claude Sonnet 4.6 (Antigravity)",
"name": "Claude Sonnet 4.6 (Antigravity, 200k base)",
"limit": { "context": 200000, "output": 64000 },
"modalities": { "input": ["text", "image", "pdf"], "output": ["text"] }
},
"antigravity-claude-opus-4-6-thinking": {
"name": "Claude Opus 4.6 Thinking (Antigravity)",
"name": "Claude Opus 4.6 Thinking (Antigravity, 200k base)",
"limit": { "context": 200000, "output": 64000 },
"modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
"variants": {
11 changes: 11 additions & 0 deletions assets/antigravity.schema.json
@@ -120,6 +120,17 @@
"type": "boolean",
"description": "Enable Claude prompt auto-caching by adding top-level cache_control when absent."
},
"claude_long_context_beta": {
"default": false,
"type": "boolean",
"description": "Enable experimental Claude long-context beta header injection for Claude 4.6 models. If rejected, requests auto-fallback to stable 200k behavior."
},
"claude_long_context_beta_header": {
"default": "context-1m-2025-08-07",
"type": "string",
"minLength": 1,
"description": "Claude long-context beta header value. Override if provider beta token changes."
},
"proactive_token_refresh": {
"default": true,
"type": "boolean",
21 changes: 21 additions & 0 deletions docs/CONFIGURATION.md
@@ -35,6 +35,8 @@ Settings that affect how the model thinks and responds.
| Option | Default | Description |
|--------|---------|-------------|
| `keep_thinking` | `false` | Preserve Claude's thinking blocks across turns. **Warning:** enabling may degrade model stability. |
| `claude_long_context_beta` | `false` | Experimental: send the long-context beta header on Claude 4.6 requests (requires provider entitlement) |
| `claude_long_context_beta_header` | `"context-1m-2025-08-07"` | Beta header value used when `claude_long_context_beta` is enabled |
| `session_recovery` | `true` | Auto-recover from tool_result_missing errors |
| `auto_resume` | `false` | Auto-send resume prompt after recovery |
| `resume_text` | `"continue"` | Text to send when auto-resuming |
@@ -51,6 +53,24 @@ When `false` (default), thinking is stripped:
- **Pros:** More stable model behavior, smaller context
- **Cons:** Model may be less coherent, forgets previous reasoning

### About `claude_long_context_beta`

Claude models on the Antigravity path remain **200k context by default**.

When `claude_long_context_beta` is enabled:
- Claude 4.6 requests include the configured `anthropic-beta` token (`claude_long_context_beta_header`)
- If the provider rejects that beta header (checked on HTTP 400, 403, and 422 responses), the plugin retries once automatically without it
- The retried request uses the stable 200k path, and the rejection reason is logged

Example:

```json
{
"claude_long_context_beta": true,
"claude_long_context_beta_header": "context-1m-2025-08-07"
}
```
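
The retry behavior described above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the plugin's code: `sendWithLongContextFallback`, `looksLikeBetaRejection`, and the `Send` callback are hypothetical names, and the rule for recognizing a rejection (a 400/403/422 status plus the beta token appearing in the error body) is an assumption, since the body of `isUnsupportedClaudeLongContextBetaError` is not part of this diff.

```typescript
// Hypothetical sketch: all names here are illustrative, not the plugin's API.
type SendResult = { status: number; body: string };
type Send = (headers: Record<string, string>) => Promise<SendResult>;

// Status codes the diff treats as candidate beta rejections.
const REJECTION_STATUSES = new Set([400, 403, 422]);

// Assumed heuristic: a rejection needs a candidate status AND an error body
// that mentions the beta token. The real detection logic may differ.
function looksLikeBetaRejection(result: SendResult, betaHeader: string): boolean {
  return (
    REJECTION_STATUSES.has(result.status) &&
    result.body.toLowerCase().includes(betaHeader.toLowerCase())
  );
}

async function sendWithLongContextFallback(
  send: Send,
  betaEnabled: boolean,
  betaHeader: string,
): Promise<{ result: SendResult; usedBeta: boolean }> {
  if (betaEnabled) {
    const first = await send({ "anthropic-beta": betaHeader });
    if (!looksLikeBetaRejection(first, betaHeader)) {
      // Accepted, or failed for an unrelated reason: return it unchanged.
      return { result: first, usedBeta: true };
    }
    // Rejected: fall through and retry exactly once without the header.
  }
  const result = await send({});
  return { result, usedBeta: false };
}
```

Passing the transport in as a callback keeps the sketch testable without a network call; errors that do not look like a beta rejection are returned to the caller rather than retried.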

---

## Account Rotation
@@ -168,6 +188,7 @@ These settings are `false` by default:
| Setting | Default | What it does |
|---------|---------|--------------|
| `keep_thinking` | `false` | Preserve Claude thinking (may degrade stability) |
| `claude_long_context_beta` | `false` | Opt-in: attempt the experimental Claude 1M-context beta header |
| `auto_resume` | `false` | Auto-continue after recovery |

---
5 changes: 4 additions & 1 deletion docs/MODEL-VARIANTS.md
@@ -104,12 +104,15 @@ Claude models use token-based thinking budgets:
| `low` | 8192 | Light thinking |
| `max` | 32768 | Maximum thinking |

> **Context Limit Note:** Claude models on Antigravity are configured with a **200k base context**.
> If you enable `claude_long_context_beta` in `antigravity.json`, the plugin attempts an experimental long-context beta header and automatically falls back to 200k if the provider rejects it.

### Claude Example

```json
{
"antigravity-claude-opus-4-6-thinking": {
"name": "Claude Opus 4.6 Thinking (Antigravity)",
"name": "Claude Opus 4.6 Thinking (Antigravity, 200k base)",
"limit": { "context": 200000, "output": 64000 },
"modalities": { "input": ["text", "image", "pdf"], "output": ["text"] },
"variants": {
4 changes: 4 additions & 0 deletions script/build-schema.ts
@@ -41,6 +41,10 @@ const optionDescriptions: Record<string, string> = {
"Enable tool hallucination prevention for Claude models. Injects parameter signatures and strict usage rules.",
claude_prompt_auto_caching:
"Enable Claude prompt auto-caching by adding top-level cache_control when absent.",
claude_long_context_beta:
"Enable experimental Claude long-context beta header injection for Claude 4.6 models. If rejected, requests auto-fallback to stable 200k behavior.",
claude_long_context_beta_header:
"Claude long-context beta header value. Override if provider beta token changes.",
proactive_token_refresh:
"Enable proactive background token refresh before expiry, ensuring requests never block.",
proactive_refresh_buffer_seconds:
77 changes: 77 additions & 0 deletions src/plugin.ts
@@ -30,6 +30,7 @@ import {
import {
buildThinkingWarmupBody,
isGenerativeLanguageRequest,
isUnsupportedClaudeLongContextBetaError,
prepareAntigravityRequest,
transformAntigravityResponse,
} from "./plugin/request";
@@ -50,6 +51,7 @@ import { checkAccountsQuota } from "./plugin/quota";
import { initDiskSignatureCache } from "./plugin/cache";
import { createProactiveRefreshQueue, type ProactiveRefreshQueue } from "./plugin/refresh-queue";
import { initLogger, createLogger } from "./plugin/logger";
import { scrubTextForLog } from "./plugin/logging-utils";
import { initHealthTracker, getHealthTracker, initTokenTracker, getTokenTracker } from "./plugin/rotation";
import { initAntigravityVersion } from "./plugin/version";
import { executeSearch } from "./plugin/search";
@@ -67,6 +69,7 @@ const MAX_OAUTH_ACCOUNTS = 10;
const MAX_WARMUP_SESSIONS = 1000;
const MAX_WARMUP_RETRIES = 2;
const CAPACITY_BACKOFF_TIERS_MS = [5000, 10000, 20000, 30000, 60000];
const CLAUDE_LONG_CONTEXT_REJECTION_REASON_MAX_CHARS = 240;

function getCapacityBackoffDelay(consecutiveFailures: number): number {
const index = Math.min(consecutiveFailures, CAPACITY_BACKOFF_TIERS_MS.length - 1);
@@ -86,6 +89,8 @@ const log = createLogger("plugin");
const rateLimitToastCooldowns = new Map<string, number>();
const RATE_LIMIT_TOAST_COOLDOWN_MS = 5000;
const MAX_TOAST_COOLDOWN_ENTRIES = 100;
const CLAUDE_LONG_CONTEXT_FALLBACK_TOAST_SESSIONS = new Set<string>();
const MAX_CLAUDE_LONG_CONTEXT_FALLBACK_TOAST_SESSIONS = 500;

// Track if "all accounts blocked" toasts were shown to prevent spam in while loop
let softQuotaToastShown = false;
@@ -117,6 +122,25 @@ function shouldShowRateLimitToast(message: string): boolean {
return true;
}

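// Returns true only the first time a session key is seen, so the fallback
// warning toast is shown once per session. The set is capped; when it grows
// past the limit, the oldest entry is evicted first.
function shouldShowClaudeLongContextFallbackToast(sessionKey: string): boolean {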
if (CLAUDE_LONG_CONTEXT_FALLBACK_TOAST_SESSIONS.has(sessionKey)) {
return false;
}

CLAUDE_LONG_CONTEXT_FALLBACK_TOAST_SESSIONS.add(sessionKey);
if (
CLAUDE_LONG_CONTEXT_FALLBACK_TOAST_SESSIONS.size >
MAX_CLAUDE_LONG_CONTEXT_FALLBACK_TOAST_SESSIONS
) {
const first = CLAUDE_LONG_CONTEXT_FALLBACK_TOAST_SESSIONS.values().next().value;
if (first !== undefined) {
CLAUDE_LONG_CONTEXT_FALLBACK_TOAST_SESSIONS.delete(first);
}
}

return true;
}

function resetAllAccountsBlockedToasts(): void {
softQuotaToastShown = false;
rateLimitToastShown = false;
@@ -1537,6 +1561,7 @@ export const createAntigravityPlugin = (providerId: string) => async (

let lastFailure: FailureContext | null = null;
let lastError: Error | null = null;
let disableClaudeLongContextBetaForRetry = false;
const abortSignal = init?.signal ?? undefined;

// Helper to check if request was aborted
@@ -2021,6 +2046,9 @@ export const createAntigravityPlugin = (providerId: string) => async (
{
claudeToolHardening: config.claude_tool_hardening,
claudePromptAutoCaching: config.claude_prompt_auto_caching,
claudeLongContextBetaEnabled: config.claude_long_context_beta,
claudeLongContextBetaHeader: config.claude_long_context_beta_header,
disableClaudeLongContextBetaForRetry,
fingerprint: account.fingerprint,
},
);
@@ -2330,6 +2358,55 @@ export const createAntigravityPlugin = (providerId: string) => async (
}
}

const canBeClaudeLongContextRejection =
response.status === 400
|| response.status === 403
|| response.status === 422;

if (
prepared.claudeLongContextBetaApplied
&& !disableClaudeLongContextBetaForRetry
&& canBeClaudeLongContextRejection
) {
const errorBodyText = await response.clone().text().catch(() => "");
if (
isUnsupportedClaudeLongContextBetaError(
response.status,
errorBodyText,
prepared.claudeLongContextBetaHeader,
)
) {
disableClaudeLongContextBetaForRetry = true;

const sessionKey = prepared.sessionId
?? `${account.index}:${prepared.effectiveModel ?? "claude"}`;

if (shouldShowClaudeLongContextFallbackToast(sessionKey)) {
await showToast(
"Claude long-context beta rejected by provider. Falling back to stable 200k path.",
"warning",
);
}

const reasonPreview = scrubTextForLog(
errorBodyText,
CLAUDE_LONG_CONTEXT_REJECTION_REASON_MAX_CHARS,
);
pushDebug(
`claude-long-context-beta rejected status=${response.status} header=${prepared.claudeLongContextBetaHeader ?? "unknown"} reason=${reasonPreview}`,
);
log.debug("claude-long-context-beta-rejected", {
status: response.status,
model: prepared.effectiveModel,
header: prepared.claudeLongContextBetaHeader,
reasonPreview,
});

// Repeat this iteration; the flag set above makes the retry omit the beta header.
i -= 1;
continue;
}
}

const shouldRetryEndpoint = (
response.status === 403 ||
response.status === 404 ||
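
The once-per-session toast gate added in `src/plugin.ts` relies on a JavaScript `Set` iterating in insertion order, which makes first-in, first-out eviction cheap. A standalone sketch of the same pattern follows; `createOncePerKeyGate` is a hypothetical name used only for illustration, and the cap is a parameter here rather than the module constant used in the diff.

```typescript
// Illustrative sketch of a capped "first time seen?" gate.
// The function name is hypothetical; it is not part of the plugin.
function createOncePerKeyGate(maxEntries: number): (key: string) => boolean {
  const seen = new Set<string>();

  return (key: string): boolean => {
    if (seen.has(key)) {
      return false; // Already reported for this key.
    }

    seen.add(key);
    if (seen.size > maxEntries) {
      // Sets iterate in insertion order, so the first value is the oldest.
      const oldest = seen.values().next().value;
      if (oldest !== undefined) {
        seen.delete(oldest);
      }
    }
    return true;
  };
}

const gate = createOncePerKeyGate(2);
gate("session-a"); // first sighting
gate("session-a"); // repeat, suppressed
```

One consequence of capping the set: a key that was evicted is treated as new again, so a long-running process could in principle repeat a toast for a very old session. With a cap in the hundreds that seems an acceptable trade for bounded memory.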
4 changes: 2 additions & 2 deletions src/plugin/config/models.ts
@@ -68,12 +68,12 @@ export const OPENCODE_MODEL_DEFINITIONS: OpencodeModelDefinitions = {
},
},
"antigravity-claude-sonnet-4-6": {
name: "Claude Sonnet 4.6 (Antigravity)",
name: "Claude Sonnet 4.6 (Antigravity, 200k base)",
limit: { context: 200000, output: 64000 },
modalities: DEFAULT_MODALITIES,
},
"antigravity-claude-opus-4-6-thinking": {
name: "Claude Opus 4.6 Thinking (Antigravity)",
name: "Claude Opus 4.6 Thinking (Antigravity, 200k base)",
limit: { context: 200000, output: 64000 },
modalities: DEFAULT_MODALITIES,
variants: {
47 changes: 46 additions & 1 deletion src/plugin/config/schema.test.ts
@@ -11,7 +11,12 @@ describe("cli_first config", () => {
it("documents cli_first in the JSON schema", () => {
const schemaPath = new URL("../../../assets/antigravity.schema.json", import.meta.url);
const schema = JSON.parse(readFileSync(schemaPath, "utf8")) as {
properties?: Record<string, { type?: string; default?: unknown; description?: string }>;
properties?: Record<string, {
type?: string
default?: unknown
description?: string
minLength?: number
}>;
};

const cliFirst = schema.properties?.cli_first;
@@ -46,3 +51,43 @@ describe("claude_prompt_auto_caching config", () => {
expect(claudePromptAutoCaching?.description?.length ?? 0).toBeGreaterThan(0);
});
});

describe("claude_long_context_beta config", () => {
it("includes claude_long_context_beta defaults in DEFAULT_CONFIG", () => {
expect(DEFAULT_CONFIG).toHaveProperty("claude_long_context_beta", false);
expect(DEFAULT_CONFIG).toHaveProperty("claude_long_context_beta_header", "context-1m-2025-08-07");
});

it("documents claude_long_context_beta in the JSON schema", () => {
const schemaPath = new URL("../../../assets/antigravity.schema.json", import.meta.url);
const schema = JSON.parse(readFileSync(schemaPath, "utf8")) as {
properties?: Record<string, { type?: string; default?: unknown; description?: string }>;
};

const claudeLongContextBeta = schema.properties?.claude_long_context_beta;
expect(claudeLongContextBeta).toBeDefined();
expect(claudeLongContextBeta).toMatchObject({
type: "boolean",
default: false,
});
expect(typeof claudeLongContextBeta?.description).toBe("string");
expect(claudeLongContextBeta?.description?.length ?? 0).toBeGreaterThan(0);
});

it("documents claude_long_context_beta_header in the JSON schema", () => {
const schemaPath = new URL("../../../assets/antigravity.schema.json", import.meta.url);
const schema = JSON.parse(readFileSync(schemaPath, "utf8")) as {
properties?: Record<string, { type?: string; default?: unknown; description?: string }>;
};

const claudeLongContextBetaHeader = schema.properties?.claude_long_context_beta_header;
expect(claudeLongContextBetaHeader).toBeDefined();
expect(claudeLongContextBetaHeader).toMatchObject({
type: "string",
default: "context-1m-2025-08-07",
minLength: 1,
});
expect(typeof claudeLongContextBetaHeader?.description).toBe("string");
expect(claudeLongContextBetaHeader?.description?.length ?? 0).toBeGreaterThan(0);
});
});
22 changes: 22 additions & 0 deletions src/plugin/config/schema.ts
@@ -223,6 +223,26 @@ export const AntigravityConfigSchema = z.object({
* @default false
*/
claude_prompt_auto_caching: z.boolean().default(false),

/**
* Enable experimental Claude long-context beta header injection.
*
* When enabled, Claude 4.6 models attempt a provider beta header for 1M context.
* If the provider rejects the beta header, the request automatically falls back
* to the stable 200k path.
*
* @default false
*/
claude_long_context_beta: z.boolean().default(false),

/**
* Claude long-context beta header value.
*
* Override this if the provider's beta token changes.
*
* @default "context-1m-2025-08-07"
*/
claude_long_context_beta_header: z.string().trim().min(1).default("context-1m-2025-08-07"),

// =========================================================================
// Proactive Token Refresh (ported from LLM-API-Key-Proxy)
@@ -460,6 +480,8 @@ export const DEFAULT_CONFIG: AntigravityConfig = {
tool_id_recovery: true,
claude_tool_hardening: true,
claude_prompt_auto_caching: false,
claude_long_context_beta: false,
claude_long_context_beta_header: "context-1m-2025-08-07",
proactive_token_refresh: true,
proactive_refresh_buffer_seconds: 1800,
proactive_refresh_check_interval_seconds: 300,
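
For readers unfamiliar with the validation chain on `claude_long_context_beta_header` (`z.string().trim().min(1).default(...)`), the sketch below approximates its effect without depending on zod. It assumes the usual reading of that chain: a missing value takes the default, a string is trimmed before the length check, and an empty result is rejected. `resolveBetaHeader` is a hypothetical helper, not something the plugin defines, and the error types are illustrative.

```typescript
// Hand-rolled approximation of the schema rule shown in the diff.
// The plugin itself validates with zod; this helper name is hypothetical.
const DEFAULT_BETA_HEADER = "context-1m-2025-08-07";

function resolveBetaHeader(raw: unknown): string {
  if (raw === undefined) {
    // Option omitted from antigravity.json: use the documented default.
    return DEFAULT_BETA_HEADER;
  }
  if (typeof raw !== "string") {
    throw new TypeError("claude_long_context_beta_header must be a string");
  }
  const trimmed = raw.trim();
  if (trimmed.length < 1) {
    // Mirrors minLength: 1, applied after trimming (assumed ordering).
    throw new RangeError("claude_long_context_beta_header must not be empty");
  }
  return trimmed;
}
```

Under these assumptions a whitespace-only override is treated as a configuration error rather than silently replaced by the default, which matches the `minLength: 1` constraint published in the JSON schema.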