@teamdandelion
Collaborator

llm.ThinkingConfig has level (minimal / low / medium / high), which can be
left unset for auto. It also has encode_thoughts_as_text.

I dropped include_summary because really it's only Google that lets you
enable/disable summaries (OpenAI allows auto / detailed / concise, but
concise is apparently only for computer use models). Since we don't have
at least 2 providers with consistent semantics, I am just turning summaries
on for Google by default for consistent behavior. If someone wants the
ability to disable them for Google, we can add that in 2.x once we get a
request for it.
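
For readers skimming the thread, here is a minimal sketch of the config shape the description implies. It is not the PR's actual code: the class name `ThinkingConfigSketch` and the helper `describe_level` are hypothetical, and treating `encode_thoughts_as_text` as a boolean is an assumption, since the description only names the field.

```python
from typing import Literal, TypedDict

ThinkingLevel = Literal["minimal", "low", "medium", "high"]


class ThinkingConfigSketch(TypedDict, total=False):
    """Hypothetical stand-in for llm.ThinkingConfig (illustration only)."""

    # Leave `level` out entirely to get the provider's "auto" behavior.
    level: ThinkingLevel
    # Assumed to be a boolean flag; the PR description only names the field.
    encode_thoughts_as_text: bool


def describe_level(config: ThinkingConfigSketch) -> str:
    """Return the configured level, or "auto" when no level is set."""
    return config.get("level", "auto")


auto_config: ThinkingConfigSketch = {}
high_config: ThinkingConfigSketch = {
    "level": "high",
    "encode_thoughts_as_text": True,
}
```

With `total=False`, every key is optional, so an empty dict is a valid config and stands for "auto".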

@teamdandelion teamdandelion marked this pull request as ready for review January 8, 2026 06:23
Collaborator Author

teamdandelion commented Jan 8, 2026

cloudflare-workers-and-pages bot commented Jan 8, 2026

Deploying with Cloudflare Workers

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ✅ Deployment successful! | v2-docs | 165afa7 | Jan 09 2026, 02:07 AM |

"minimal": 0,
"low": 1024,
"medium": 2048,
"high": 8192,
Collaborator

hmm, this seems weird:

  1. Should this be LEVEL_TO_BUDGET, since the config field is level rather than effort?
  2. I feel like this should map to a percentage of max tokens? It seems weird that we hard-limit the user to a thinking budget of 8192 tokens.
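
To make point 2 concrete, here is a rough sketch of a percentage-based mapping. The name `LEVEL_TO_PERCENT`, the helper `thinking_budget`, and every percentage value are hypothetical and chosen only for illustration; nothing here is taken from the PR.

```python
from typing import Literal

ThinkingLevel = Literal["minimal", "low", "medium", "high"]

# Hypothetical percentages of max_tokens; the real values would need tuning.
LEVEL_TO_PERCENT: dict[ThinkingLevel, int] = {
    "minimal": 10,
    "low": 25,
    "medium": 50,
    "high": 80,
}


def thinking_budget(level: ThinkingLevel, max_tokens: int) -> int:
    """Scale the thinking budget with max_tokens instead of a fixed cap."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    return max_tokens * LEVEL_TO_PERCENT[level] // 100
```

Under these assumed percentages, a caller who raises max_tokens gets a proportionally larger thinking budget, so "high" is no longer pinned at 8192. A real implementation would probably also need to respect any provider-specific minimum or maximum budget.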

from ..model_id import AnthropicModelId, model_name

DEFAULT_MAX_TOKENS = 16000
# TODO: Change DEFAULT_FORMAT_MODE to strict when strict is no longer a beta feature.
DEFAULT_FORMAT_MODE = "tool"

# Effort level to token budget mapping for thinking conversion
EFFORT_TO_BUDGET = {
"minimal": 0,
Collaborator

It seems weird that "minimal" here is 0, which actually disables thinking; that seems semantically incorrect. I thought we talked about having something like disabled: True in the config for this? I feel like minimal should still set some form of budget.
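
One way this could look, as a sketch only: a separate `disabled` flag carries the "off" meaning, and "minimal" keeps a small non-zero budget. The names `ThinkingConfigSketch` and `resolve_budget` and the 512-token value for "minimal" are assumptions made for illustration; the other three budgets mirror the values in the diff.

```python
from typing import Literal, Optional, TypedDict

ThinkingLevel = Literal["minimal", "low", "medium", "high"]

# "low"/"medium"/"high" mirror the diff; the non-zero "minimal" is an assumption.
LEVEL_TO_BUDGET: dict[ThinkingLevel, int] = {
    "minimal": 512,
    "low": 1024,
    "medium": 2048,
    "high": 8192,
}


class ThinkingConfigSketch(TypedDict, total=False):
    """Hypothetical config shape with an explicit off switch."""

    level: ThinkingLevel
    disabled: bool


def resolve_budget(config: ThinkingConfigSketch) -> Optional[int]:
    """Return 0 when disabled, None for auto, else the level's budget."""
    if config.get("disabled", False):
        return 0
    level = config.get("level")
    if level is None:
        return None  # No level set: leave the decision to the provider.
    return LEVEL_TO_BUDGET[level]
```

This keeps three states distinct: explicitly disabled, unset (auto), and a chosen level that always maps to a positive budget.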

-from typing import TypedDict
+from typing import Literal, TypedDict
+
+ThinkingLevel = Literal["minimal", "low", "medium", "high"]
Collaborator

nit: missing docstring. I like having the longer one in ThinkingConfig, so a single line here suffices (can even be 1:1 copy-paste with the short description below)

from ..model_id import GoogleModelId, model_name
from ..model_info import MODELS_WITHOUT_STRUCTURED_OUTPUT_AND_TOOLS_SUPPORT

UNKNOWN_TOOL_ID = "google_unknown_tool_id"

EFFORT_TO_BUDGET = {
Collaborator

same comments here
