Added gemini thinking model support, with a default of gemini-2.0-flash-thinking-exp-01-21 #56
base: main
Conversation
I'm trying to implement it, but this is from the docs:

> Large language models are powerful multitask tools. Often you can just ask Gemini for what you want, and it will do okay. The Gemini API doesn't have a JSON mode, so there are a few things to watch for when generating data structures this way.
>
> You'll solve those problems in the next section. First, try a simple natural language prompt with the schema written out as text. This has not been optimized:
>
> ```python
> MODEL_ID = "gemini-2.0-flash"
>
> prompt = """
> {"people": list[PERSON], "places": list[PLACE], "things": list[THING], "relationships": list[RELATIONSHIP]}
>
> PERSON = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
>
> All fields are required.
>
> Important: Only return a single piece of valid JSON text.
>
> Here is the story:
>
> """ + story
>
> response = client.models.generate_content(...)
> ```
>
> That returned a JSON string. Try parsing it:
>
> ```python
> import json
>
> print(json.dumps(json.loads(response.text), indent=4))
> ```
>
> That's relatively simple and often works, but you can potentially make this more strict/robust by defining the schema using the API's function calling feature.
I opted to use the official Gemini package here rather than Vercel's wrapper since I encountered that JSON mode compatibility issue. Direct integration with the official package gives better control and helps avoid abstraction-related errors. I've used an example schema as a way to make the Gemini model return JSON and markdown, which is a technique that works pretty broadly across LLMs. This example schema is in the providers.ts file.
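For anyone reading along, here's a minimal sketch of that approach using the official `@google/generative-ai` package; the schema text and field names below are illustrative placeholders, not the actual contents of `providers.ts`:

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

// Illustrative schema prompt; the real one in this PR lives in providers.ts.
const schemaPrompt = `Important: Only return a single piece of valid JSON text matching this shape:
{"reportMarkdown": string, "learnings": string[]}`;

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({
  model: 'gemini-2.0-flash-thinking-exp-01-21',
});

// Ask the thinking model for JSON via the prompt, then parse the raw text,
// since there is no JSON mode / structured output to rely on here.
export async function generateJson(prompt: string): Promise<unknown> {
  const result = await model.generateContent(`${schemaPrompt}\n\n${prompt}`);
  return JSON.parse(result.response.text());
}
```

In practice you'd probably also want to validate the parsed object (e.g. with zod) before trusting it, since the model can still return malformed JSON.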
Wait, so is this a working implementation of Gemini?
Yeah! It's pretty speedy too; the main limitation is Firecrawl rate-limiting if you're a free user like me. You can pull it down from my fork and play around with it.
Hey, thanks for the Gemini addition, it works seamlessly. I am trying to integrate this with other search APIs since Firecrawl provides only 500 free credits and a rate limit on its free tier. Is it possible to integrate SearXNG or the Tavily API into this? Thanks.
@dzhng @Dariton4000 is this ready to merge then?
Yes, a choice of web scrapers is a sensible next step. Firecrawl's free tier is pretty limiting.
I'm self-hosting right now and struggling a bit. Also, Gemini keeps getting rate-limited, so we may need to implement a rate limit with an exponential backoff.
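Something along these lines would probably do; a rough sketch, with the retry count, delays, and the 429 check all being assumptions rather than anything in this PR:

```typescript
// Retry a Gemini call with exponential backoff plus a little jitter.
async function withBackoff<T>(fn: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      const isRateLimit = err?.status === 429 || /rate limit/i.test(String(err?.message));
      if (!isRateLimit || attempt >= maxRetries) throw err;
      // 1s, 2s, 4s, ... capped at 60s, plus up to 250ms of jitter.
      const delayMs = Math.min(60_000, 1_000 * 2 ** attempt) + Math.random() * 250;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Usage: wrap each Gemini call, e.g. withBackoff(() => model.generateContent(prompt));
```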
@jamisonl When using the thinking model, I was getting an error in the middle of research, something like "JSON at position 0 unable to decode". Have you also noticed any error like that?
Hey, I have added a SearXNG instance in this repo: https://github.com/Shreyas9400/deep-research-searxng. I have deployed the instance on Hugging Face due to resource constraints; however, the processing time is high because the search and parse steps take additional time.
Hey - I spoke to the AI SDK guys and I think I'll keep this PR open but not merge it. The reason is that the only benefit of adding the complexity of Gemini-specific packages is using their thinking model, which doesn't support tool calling yet. BUT that's coming; this is just an experimental model. You CAN use the normal gemini-2.0-flash model with the AI SDK right now just fine. I'd rather keep this simple and rely on one LLM interface package (AI SDK). If you really want to try the Gemini thinking model, this is a good reference implementation.
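For reference, using the normal model through the AI SDK looks roughly like this; a sketch assuming the `@ai-sdk/google` provider and an illustrative zod schema, not the repo's actual calls:

```typescript
import { google } from '@ai-sdk/google';
import { generateObject } from 'ai';
import { z } from 'zod';

// gemini-2.0-flash supports tool calling / structured output through the AI SDK,
// so no Gemini-specific package is needed on this path.
const { object } = await generateObject({
  model: google('gemini-2.0-flash'),
  schema: z.object({
    learnings: z.array(z.string()),
    followUpQuestions: z.array(z.string()),
  }),
  prompt: 'List the key learnings and follow-up questions from the search results above.',
});

console.log(object.learnings, object.followUpQuestions);
```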
That's a rational design choice! DeepSeek also doesn't have tool use/structured output yet. I look forward to the other reasoning models having feature parity. Any interest in making an everything-but-the-kitchen-sink version?
Honestly, I don't think I have enough time to do a kitchen-sink version, and that's a looong rabbit hole to go down, haha. My guess is the architecture will evolve as new models/capabilities are unlocked, so it's better to keep a bare-minimum version that's constantly updated to match the current SOTA implementation.
Hi @dzhng, hope the screenshot below adds more context for using the @google/generative-ai package instead of the Vercel wrapper. This feature lets the user choose between o3-mini and gemini-2.0-flash-thinking (defaults to o3-mini) in the CLI. I've tested it, and it's outputting markdown reports. Wondering if there's any interest in adding r1...
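To give a sense of the UX, the model choice is just a prompt at startup, roughly like this sketch (the wording and defaulting here are illustrative, not the exact code from the fork):

```typescript
import * as readline from 'node:readline/promises';
import { stdin as input, stdout as output } from 'node:process';

type ModelChoice = 'o3-mini' | 'gemini-2.0-flash-thinking-exp-01-21';

// Ask once at startup which model to use; plain Enter keeps the o3-mini default.
async function chooseModel(): Promise<ModelChoice> {
  const rl = readline.createInterface({ input, output });
  const answer = await rl.question(
    'Model? [1] o3-mini (default)  [2] gemini-2.0-flash-thinking: ',
  );
  rl.close();
  return answer.trim() === '2' ? 'gemini-2.0-flash-thinking-exp-01-21' : 'o3-mini';
}
```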