
[Roadmap] Add a toggle-able thinking mode for models/providers that allow for it #768

Open
tranhoangnguyen03 opened this issue Feb 26, 2025 · 3 comments

Comments

@tranhoangnguyen03

Why
More and more models/providers offer hybrid LLMs that support both a regular fast response and a thinking-mode response.

Description
One example is the Anthropic Claude API: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
Here, you simply add a thinking param to the API request to enable thinking mode:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=20000,

    # Enable extended thinking with a dedicated token budget
    thinking={
        "type": "enabled",
        "budget_tokens": 16000
    },

    messages=[{
        "role": "user",
        "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"
    }]
)

print(response)

The corresponding content will then contain the thoughts in addition to the regular response:

{
    "content": [
        {
            "type": "thinking",
            "thinking": "To approach this, let's think about what we know about prime numbers...",
            "signature": "zbbJhbGciOiJFU8zI1NiIsImtakcjsu38219c0.eyJoYXNoIjoiYWJjMTIzIiwiaWFxxxjoxNjE0NTM0NTY3fQ...."
        },
        {
            "type": "text",
            "text": "Yes, there are infinitely many prime numbers such that..."
        }
    ]
}
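A client consuming such a response would need to separate the thinking blocks from the user-visible answer. A minimal sketch of that (the `response` dict below mirrors the example payload above; `split_thinking` is a hypothetical helper, not part of the Anthropic SDK):

```python
# Hypothetical helper: split an Anthropic-style response payload into
# thinking blocks and visible text. The block shape ({"type": ..., ...})
# mirrors the example response above.

def split_thinking(content_blocks):
    """Return (thoughts, texts) from a list of content blocks."""
    thoughts = [b["thinking"] for b in content_blocks if b["type"] == "thinking"]
    texts = [b["text"] for b in content_blocks if b["type"] == "text"]
    return thoughts, texts

response = {
    "content": [
        {"type": "thinking",
         "thinking": "To approach this, let's think about what we know..."},
        {"type": "text",
         "text": "Yes, there are infinitely many prime numbers such that..."},
    ]
}

thoughts, texts = split_thinking(response["content"])
print(texts[0])  # the user-visible answer, without the reasoning trace
```

A UI could then render the `thoughts` list in a collapsible panel while showing only `texts` by default.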

Requirements

  • Need to compile a list of providers (including self-hosted options such as vLLM, llama.cpp, and Ollama) that offer this capability, and how each of them handles the request and response.
  • Need to be able to detect which models offer this mode (Sonnet 3.5 does not, but Sonnet 3.7 does).
  • Need some changes to the UI to enable thinking mode for appropriate models and to specify the thinking budget.
  • There are some dependencies on the cost calculations as well.
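One way to meet the detection and budget requirements is a small per-model capability table consulted when building the request. A hypothetical sketch (the `THINKING_CAPABLE` table, the `build_request` helper, and the model IDs used here are illustration-only assumptions, not Big-AGI's actual implementation):

```python
# Hypothetical capability table: which models accept the thinking param.
# Entries here are assumptions for illustration.
THINKING_CAPABLE = {
    "claude-3-7-sonnet-20250219": True,   # Sonnet 3.7: supports it
    "claude-3-5-sonnet-20241022": False,  # Sonnet 3.5: does not
}

def build_request(model, messages, thinking_budget=None, max_tokens=20000):
    """Shape a request dict, adding the thinking param only when the
    model supports it and a budget was requested."""
    req = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if thinking_budget is not None:
        if not THINKING_CAPABLE.get(model, False):
            raise ValueError(f"{model} does not support extended thinking")
        req["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
    return req

req = build_request(
    "claude-3-7-sonnet-20250219",
    [{"role": "user", "content": "hi"}],
    thinking_budget=16000,
)
```

The UI toggle and budget field would only be shown when the lookup returns True, and the cost calculation would need to account for the thinking tokens billed against `budget_tokens`.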
@enricoros
Owner

@tranhoangnguyen03 I've completed this support. Which Big-AGI do you use? Website, docker, ...?

@tranhoangnguyen03
Author

I think I'm on an old docker build. Let me update my deployment. Thanks!

@enricoros
Owner

@tranhoangnguyen03 Please let me know if this works as expected, and whether the support could be extended in other directions.
