What happened?
Bug Report: Ollama Turbo authentication not working with LiteLLM
Issue Description
LiteLLM is unable to authenticate with Ollama Turbo, receiving a 401 Unauthorized error, while the native Ollama client works correctly with the same credentials.
Environment
- LiteLLM version: 1.75.5.post1
- Python version: 3.11
- Ollama Turbo API endpoint: https://ollama.com
Steps to Reproduce
- Use LiteLLM with Ollama Turbo:

```python
from litellm import completion

api_key = "<SUBSCRIPTION_KEY>"  # Turbo subscription key

response = completion(
    model="ollama/gpt-oss:120b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain Ollama Turbo usage via LiteLLM."}
    ],
    api_base="https://ollama.com",
    api_key=api_key
)
print(response.choices[0].message.content)
```

- This results in:
```
httpx.HTTPStatusError: Client error '401 Unauthorized' for url 'https://ollama.com/api/generate'
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - {"error": "unauthorized"}
```
Working Example with Native Client
The same API key works correctly with the native Ollama client:
```python
from ollama import Client

api_key = "<SUBSCRIPTION_KEY>"  # Same key

client = Client(
    host="https://ollama.com",
    headers={'Authorization': f'{api_key}'}
)

messages = [
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
]

for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
    print(part['message']['content'], end='', flush=True)
```

This works and returns the expected response.
Expected Behavior
LiteLLM should be able to authenticate with Ollama Turbo using the provided API key, similar to how the native Ollama client does.
Analysis
The issue appears to be in how LiteLLM formats the authentication header for Ollama Turbo. The native client sends the subscription key as the bare value of the `Authorization` header:

```python
headers={'Authorization': f'{api_key}'}
```

LiteLLM appears to send the credentials in a different format, resulting in the 401 error.
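The header format can be checked independently of either client by calling the endpoint directly with `httpx` (already in the stack, per the traceback above). This is a sketch, not part of the original report; the `/api/generate` payload shape is assumed from Ollama's standard generate API.

```python
import httpx

api_key = "<SUBSCRIPTION_KEY>"
payload = {"model": "gpt-oss:120b", "prompt": "Why is the sky blue?", "stream": False}

# Try the bare-key format the native client sends, plus a Bearer-prefixed
# variant for comparison, to see which one Ollama Turbo accepts.
for label, auth in [("bare key", api_key), ("Bearer", f"Bearer {api_key}")]:
    resp = httpx.post(
        "https://ollama.com/api/generate",
        json=payload,
        headers={"Authorization": auth},
        timeout=60,
    )
    print(label, "->", resp.status_code)
```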
Additional Context
- The error occurs at the `https://ollama.com/api/generate` endpoint
- Both examples use the same model: `gpt-oss:120b`
- The native client works with streaming enabled
Possible Solution
LiteLLM may need to update its Ollama provider implementation to correctly format the Authorization header for Ollama Turbo's API, matching the format used by the native client.
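In the meantime, a possible workaround is to bypass LiteLLM's key handling and pass the header explicitly. This is an untested sketch: it assumes `litellm.completion` forwards `extra_headers` to the Ollama provider's outgoing HTTP request, which may not hold in 1.75.5.post1.

```python
from litellm import completion

api_key = "<SUBSCRIPTION_KEY>"  # Turbo subscription key

# Assumption: extra_headers is forwarded to the outgoing request for the
# ollama provider. If it is not, this call will still return a 401.
response = completion(
    model="ollama/gpt-oss:120b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    api_base="https://ollama.com",
    extra_headers={"Authorization": api_key},
)
print(response.choices[0].message.content)
```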
Relevant log output
Are you a ML Ops Team?
No
What LiteLLM version are you on?
1.75.5.post1
Twitter / LinkedIn details
No response