[Bug]: Ollama Turbo authentication not working with LiteLLM #13652

@grahama1970

Description

What happened?

Issue Description

LiteLLM is unable to authenticate with Ollama Turbo and receives a 401 Unauthorized error, while the native Ollama client works correctly with the same credentials.

Environment

  • LiteLLM version: 1.75.5.post1
  • Python version: 3.11
  • Ollama Turbo API endpoint: https://ollama.com

Steps to Reproduce

  1. Use LiteLLM with Ollama Turbo:
from litellm import completion

api_key="<SUBSCRIPTION_KEY>"  # Turbo subscription key

response = completion(
    model="ollama/gpt-oss:120b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain Ollama Turbo usage via LiteLLM."}
    ],
    api_base="https://ollama.com",
    api_key=api_key
)

print(response.choices[0].message.content)
  2. This results in:
httpx.HTTPStatusError: Client error '401 Unauthorized' for url 'https://ollama.com/api/generate'
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - {"error": "unauthorized"}

Working Example with Native Client

The same API key works correctly with the native Ollama client:

from ollama import Client

api_key="<SUBSCRIPTION_KEY>"  # Same key

client = Client(
    host="https://ollama.com",
    headers={'Authorization': api_key}  # bare key, no "Bearer " prefix
)

messages = [
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
]

for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
  print(part['message']['content'], end='', flush=True)

This works successfully and returns the expected response.

Expected Behavior

LiteLLM should authenticate with Ollama Turbo using the provided API key, just as the native Ollama client does.

Analysis

The issue appears to lie in how LiteLLM formats the Authorization header for Ollama Turbo. The native client sends the subscription key as the bare header value:

  • headers={'Authorization': api_key}  # no "Bearer " prefix

LiteLLM evidently sends the credential in some other form, which results in the 401.
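
One way to test this hypothesis is to call the endpoint directly with httpx (already a LiteLLM dependency) and compare a bare-key header against a Bearer-prefixed one. This is a minimal sketch: the /api/chat endpoint and payload mirror the native client's call, and the Bearer variant is only an assumption about what LiteLLM might be sending.

import httpx

api_key = "<SUBSCRIPTION_KEY>"
payload = {
    "model": "gpt-oss:120b",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "stream": False,
}

# Compare the bare key (what the native client sends) against a
# Bearer-prefixed variant (an assumption about what LiteLLM may send).
for auth_value in (api_key, f"Bearer {api_key}"):
    resp = httpx.post(
        "https://ollama.com/api/chat",
        json=payload,
        headers={"Authorization": auth_value},
        timeout=60.0,
    )
    label = "Bearer-prefixed" if auth_value.startswith("Bearer ") else "bare key"
    print(f"{label}: HTTP {resp.status_code}")

If only the bare-key request succeeds, that pins the failure on the header format rather than on the key itself.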

Additional Context

  • The error occurs at the https://ollama.com/api/generate endpoint, while the native client's chat call goes to /api/chat
  • Both examples use the same model: gpt-oss:120b
  • The native client works with streaming enabled

Possible Solution

LiteLLM may need to update its Ollama provider implementation to correctly format the Authorization header for Ollama Turbo's API, matching the format used by the native client.
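
Until that is fixed, one untested workaround sketch is to pass the header explicitly via litellm's extra_headers parameter. Whether the Ollama handler actually forwards extra_headers to https://ollama.com is precisely what is in question here, so treat this as an assumption rather than a confirmed fix.

from litellm import completion

api_key = "<SUBSCRIPTION_KEY>"

# Assumption: the Ollama provider forwards extra_headers; if it does not,
# this fails with the same 401 as above.
response = completion(
    model="ollama/gpt-oss:120b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    api_base="https://ollama.com",
    api_key=api_key,
    extra_headers={"Authorization": api_key},  # bare key, as the native client sends it
)
print(response.choices[0].message.content)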

Relevant log output

Are you a ML Ops Team?

No

What LiteLLM version are you on?

1.75.5.post1

Twitter / LinkedIn details

No response
