feat(providers/google): improve cachedContent, expose rich token metadata and pass mediaResolution #6256

Open: wants to merge 1 commit into base: main

@ItsWendell ItsWendell commented May 9, 2025

Summary

This pull request improves support for several features of Google's Generative AI and Vertex AI APIs:

  • Modifies requests to Google's APIs to exclude tools, toolConfig, and systemInstruction when cachedContent is provided. This aligns with Google's requirements for utilizing cached content effectively.
  • The providerMetadata now includes a comprehensive set of token counts and details from Google's API response for both streaming and non-streaming operations:
    • cachedContentTokenCount
    • thoughtsTokenCount (defaults to 0 if thinkingBudget is explicitly set to 0)
    • promptTokensDetails (array, with modality and token count)
    • cacheTokensDetails (array, with modality and token count)
    • candidatesTokensDetails (array, with modality and token count)
    • toolUsePromptTokensDetails (array, with modality and token count)
  • The mediaResolution option has been added to GoogleGenerativeAISettings and is now passed to the Google API, allowing users to specify media resolution for vision capabilities.
  • Added tests to verify the correct handling of cachedContent parameters and the accurate extraction and exposure of all new token metadata fields.
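
The request-side changes above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `RequestInput`, `RequestBody`, and `buildRequestBody` are hypothetical names, the request shape is heavily simplified, and placing `mediaResolution` inside `generationConfig` with the `MEDIA_RESOLUTION_*` values is an assumption based on Google's public API.

```typescript
// Hypothetical, simplified shapes for illustration only.
type MediaResolution =
  | 'MEDIA_RESOLUTION_LOW'
  | 'MEDIA_RESOLUTION_MEDIUM'
  | 'MEDIA_RESOLUTION_HIGH';

interface RequestInput {
  contents: unknown[];
  systemInstruction?: unknown;
  tools?: unknown[];
  toolConfig?: unknown;
  cachedContent?: string;
  mediaResolution?: MediaResolution;
}

interface RequestBody {
  contents: unknown[];
  generationConfig: { mediaResolution?: MediaResolution };
  systemInstruction?: unknown;
  tools?: unknown[];
  toolConfig?: unknown;
  cachedContent?: string;
}

function buildRequestBody(input: RequestInput): RequestBody {
  const body: RequestBody = {
    contents: input.contents,
    generationConfig: {},
  };

  // Assumption: mediaResolution is forwarded as part of generationConfig.
  if (input.mediaResolution !== undefined) {
    body.generationConfig.mediaResolution = input.mediaResolution;
  }

  if (input.cachedContent !== undefined) {
    // With cached content, tools, toolConfig, and systemInstruction
    // are left out of the request entirely.
    body.cachedContent = input.cachedContent;
    return body;
  }

  // Without cached content, forward the optional fields that were provided.
  if (input.systemInstruction !== undefined) {
    body.systemInstruction = input.systemInstruction;
  }
  if (input.tools !== undefined) {
    body.tools = input.tools;
  }
  if (input.toolConfig !== undefined) {
    body.toolConfig = input.toolConfig;
  }
  return body;
}
```

In this sketch the omitted keys are absent from the object rather than set to `undefined`, so they do not appear in the serialized JSON either.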

Background

Previously, the Vercel AI SDK had limited support for Google's caching features and detailed token reporting. For cachedContent to function correctly, tools, toolConfig, and systemInstruction must be omitted from the request; this PR implements that as conditional logic. It also exposes additional Google-specific token metadata, broken down by modality, which gives users deeper insight into their token consumption and model interactions. The mediaResolution setting controls whether images sent to the model are interpreted at a higher or lower resolution.
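
As a rough sketch of the metadata side, the mapping from the API's usage metadata to providerMetadata could look like the snippet below. The field names come from the summary above; `extractTokenMetadata`, the interfaces, and the choice of `null` for absent fields are assumptions made for illustration and may differ from the actual implementation.

```typescript
// Hypothetical types; field names follow the PR summary.
interface ModalityTokenCount {
  modality: string;
  tokenCount: number;
}

interface UsageMetadata {
  cachedContentTokenCount?: number | null;
  thoughtsTokenCount?: number | null;
  promptTokensDetails?: ModalityTokenCount[] | null;
  cacheTokensDetails?: ModalityTokenCount[] | null;
  candidatesTokensDetails?: ModalityTokenCount[] | null;
  toolUsePromptTokensDetails?: ModalityTokenCount[] | null;
}

interface GoogleTokenMetadata {
  cachedContentTokenCount: number | null;
  thoughtsTokenCount: number | null;
  promptTokensDetails: ModalityTokenCount[] | null;
  cacheTokensDetails: ModalityTokenCount[] | null;
  candidatesTokensDetails: ModalityTokenCount[] | null;
  toolUsePromptTokensDetails: ModalityTokenCount[] | null;
}

function extractTokenMetadata(
  usage: UsageMetadata | null | undefined,
  thinkingBudget?: number,
): GoogleTokenMetadata {
  return {
    cachedContentTokenCount: usage?.cachedContentTokenCount ?? null,
    // If the API reports no thoughts count and thinking was explicitly
    // disabled (budget of 0), report 0; otherwise report null (assumption).
    thoughtsTokenCount:
      usage?.thoughtsTokenCount ?? (thinkingBudget === 0 ? 0 : null),
    promptTokensDetails: usage?.promptTokensDetails ?? null,
    cacheTokensDetails: usage?.cacheTokensDetails ?? null,
    candidatesTokensDetails: usage?.candidatesTokensDetails ?? null,
    toolUsePromptTokensDetails: usage?.toolUsePromptTokensDetails ?? null,
  };
}
```

The same function can serve both the streaming and non-streaming paths, since it only depends on the usage object and the configured thinking budget.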

Verification

  • Unit tests have been added and are passing, covering:
    • Conditional parameter submission when cachedContent is used.
    • Extraction of all new token metadata fields in providerMetadata for doGenerate and doStream.
    • Handling of cases where token metadata is present, partially present, or absent in the API response.
  • Linked the @ai-sdk/google and Vertex provider packages locally into an active project to confirm end-to-end functionality of cachedContent improvements and metadata exposure.
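
On the consumer side, the per-modality detail arrays may be present, partially present, or absent, so code reading them should tolerate all three cases. A small sketch follows; `TokenDetail` and `sumTokensByModality` are hypothetical helpers rather than part of the SDK, and the modality strings in the usage note are placeholders.

```typescript
// Hypothetical helper: aggregates a details array (for example
// promptTokensDetails) into a total per modality.
interface TokenDetail {
  modality: string;
  tokenCount: number;
}

function sumTokensByModality(
  details: TokenDetail[] | null | undefined,
): Record<string, number> {
  const totals: Record<string, number> = {};
  // A missing or null array is treated as "no details reported".
  for (const { modality, tokenCount } of details ?? []) {
    totals[modality] = (totals[modality] ?? 0) + tokenCount;
  }
  return totals;
}
```

For instance, passing two `TEXT` entries and one `IMAGE` entry yields one total per modality, while passing `undefined` yields an empty object.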

Tasks

  • Tests have been added / updated (for bug fixes / features)
  • Documentation has been added / updated (for bug fixes / features)
    • Note: Documentation needs to be updated for:
      • The corrected usage of cachedContent.
      • All new token count and detail fields in providerMetadata (e.g., cachedContentTokenCount, thoughtsTokenCount, promptTokensDetails, etc.).
      • The availability and usage of the mediaResolution setting.
  • A minor changeset for relevant packages has been added (for bug fixes / features - run pnpm changeset in the project root)
  • Formatting issues have been fixed (run pnpm prettier-fix in the project root)

Future Work

Related Issues

@ItsWendell ItsWendell changed the title from "feat(providers/google): improvements to google provider cachedContent, add MediaResolution, return cachedContentTokenCount in providerMetadata" to "feat(providers/google): Enhance cachedContent handling, expose token counts, introduce mediaResolution setting" May 9, 2025
@ItsWendell ItsWendell force-pushed the feat/google-gen-ai-cache-image-res branch from 439aca8 to 7e03fc9 Compare May 9, 2025 21:46
@ItsWendell ItsWendell changed the title from "feat(providers/google): Enhance cachedContent handling, expose token counts, introduce mediaResolution setting" to "feat(providers/google): improve cachedContent, expose rich token metadata and pass mediaResolution" May 9, 2025
@ItsWendell ItsWendell force-pushed the feat/google-gen-ai-cache-image-res branch 6 times, most recently from dfed449 to b156993 Compare May 14, 2025 13:05
@ItsWendell ItsWendell force-pushed the feat/google-gen-ai-cache-image-res branch from b156993 to 5ffa4ab Compare May 14, 2025 21:24
Development

Successfully merging this pull request may close these issues.

Gemini Context caching