
Conversation

@avinash2692 (Contributor) commented Dec 24, 2025

@avinash2692 avinash2692 changed the title Fix OpenAI reasoning effort: fix: OpenAI reasoning effort Dec 24, 2025
mergify bot commented Dec 24, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

@nrfulton nrfulton self-requested a review December 26, 2025 20:16
@nrfulton (Contributor) left a comment

LGTM. Requested a comment.

@avinash2692 : after adding the comment please feel free to break glass and merge without a code review.

@avinash2692: FYI, NVM. I'm rolling the fix to #274 into this PR. That fix requires discussion with the rest of the core team to make sure there are no objections or subtle bugs introduced by defaulting to self._base_url = None for the OpenAI backend.

Comment on lines 634 to 638
```python
# Build optional reasoning parameters
reasoning_params = {}
if thinking is not None:
    reasoning_params["reasoning_effort"] = thinking
```

Contributor

Please also add a comment noting that OpenAI doesn't like it when non-reasoning models get reasoning parameters.
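For illustration, a minimal sketch of the guarded parameter construction plus such a note (the `thinking` variable comes from the snippet above; the comment wording here is only a suggestion, not the text that was merged):

```python
# Build optional reasoning parameters.
# Note: OpenAI rejects requests that send reasoning parameters such as
# reasoning_effort to non-reasoning models, so only include them when the
# caller explicitly asked for a reasoning/thinking level.
reasoning_params = {}
if thinking is not None:
    reasoning_params["reasoning_effort"] = thinking
```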

Contributor Author

done!

@nrfulton (Contributor) commented Dec 26, 2025

FYI @jakelorocco and @HendrikStrobelt:

We currently default to ollama when the user instantiates the OpenAI backend without a base_url. This PR will change that default behavior.

Now, if the user types:

import mellea

mellea.start_session(backend_name="openai", api_key="...")

then the OpenAI SDK's default endpoint will be used.

However, if the user types:

import mellea

mellea.start_session(backend_name="openai")

without an API key argument, then the ollama backend will be used and a warning will be printed.

Note that the user must still explicitly provide an api_key kwarg. Having an OpenAI API key in their ENV is not enough. This should prevent accidental expenses / data leaks while still providing less surprising behavior for users who are using the OpenAI backend.
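A minimal sketch of the defaulting behavior described above, using a hypothetical helper name (`_resolve_openai_defaults`) rather than the actual code in this PR:

```python
import warnings


def _resolve_openai_defaults(
    api_key: str | None, base_url: str | None
) -> tuple[str | None, str | None]:
    """Illustrative only: mirror the behavior described in this comment."""
    if api_key is not None:
        # An explicit key was passed: use the given base_url, or let the
        # OpenAI SDK fall back to its default endpoint when base_url is None.
        return api_key, base_url
    # No explicit key: deliberately do NOT read OPENAI_API_KEY from the
    # environment (to avoid accidental expenses / data leaks); fall back to
    # a local ollama instance and warn.
    warnings.warn(
        "No api_key provided; using the openai-compatible local ollama endpoint."
    )
    return "ollama", base_url or "http://localhost:11434/v1"
```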

@nrfulton nrfulton changed the title fix: OpenAI reasoning effort fix: OpenAI base_url default and reasoning effort model option. Dec 26, 2025
This default is changed because the default base_url is also changed.
The OpenAI response_format only accepts a limited set of schemas and
will error out with a 400 if you do not follow their guidelines.

One of these guidelines is that additionalProperties must be present and set
to False.

This commit monkey-patches the response_format provided to OpenAI
platform backends, and leaves other OpenAI-"compatible" backends with
the existing default behavior. This is a debatable choice.

See https://community.openai.com/t/api-rejects-valid-json-schema/906163 and https://platform.openai.com/docs/guides/structured-outputs?api-mode=chat
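As a rough illustration of the kind of patching the commit message describes (the helper name below is hypothetical, and the real implementation in this commit may differ), one could recursively force `additionalProperties: False` on every object schema before sending it as `response_format`:

```python
def _patch_schema_for_openai(schema: dict) -> dict:
    """Recursively set additionalProperties=False on object schemas, since the
    OpenAI structured-outputs endpoint returns a 400 for object schemas that
    omit it. Illustrative sketch only."""
    if schema.get("type") == "object":
        schema["additionalProperties"] = False
    for value in schema.values():
        if isinstance(value, dict):
            _patch_schema_for_openai(value)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    _patch_schema_for_openai(item)
    return schema
```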
@nrfulton (Contributor) commented Jan 3, 2026

We should also review @jakelorocco 's notes in #141 before merging.

@nrfulton nrfulton linked an issue Jan 3, 2026 that may be closed by this pull request
@nrfulton nrfulton self-requested a review January 3, 2026 16:07
Comment on lines 155 to 160
```python
if api_key is None:
    FancyLogger.get_logger().warning(
        "You are using an OpenAI backend with no api_key. Because no API key was provided, mellea assumes you intend to use the openai-compatible interface to your local ollama instance. If you intend to use OpenAI's platform you must specify your API key when instantiating your Mellea session/backend object."
    )
    self._base_url: str | None = "http://localhost:11434/v1"  # ollama
    self._api_key = "ollama"
```
Contributor

I think we are close to defaults that make sense here. I think if the user specifies a base_url we should always use that base_url (even if no apikey is set). I also wonder if we should default the apikey to ollama in those situations.

Otherwise, we have no way to target arbitrary localhost ports that don't require an apikey.

For example (and this isn't the best since it uses LiteLLM and we have a separate backend for that), LiteLLM has a proxy that you can run locally. This proxy stores the apikey information itself; so you can target an arbitrary localhost port without an apikey.

My proposed solution would be to just set the parameter default values to work for the ollama version (i.e., `api_key="ollama"` and `base_url="http://localhost:11434/v1"`). Then users can override these values. I think this would also allow users to explicitly set `api_key` / `base_url` to `None` and have the underlying OpenAI SDK still automatically pick up their env vars (without the risk of users accidentally incurring expenses).
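A short sketch of the parameter defaults being proposed here (illustrative signature only; the real `OpenAIBackend.__init__` takes additional arguments):

```python
class OpenAIBackend:
    def __init__(
        self,
        api_key: str | None = "ollama",
        base_url: str | None = "http://localhost:11434/v1",
    ):
        # Defaults target a local ollama instance; users can override them,
        # or pass None explicitly to let the OpenAI SDK pick up the
        # OPENAI_API_KEY / OPENAI_BASE_URL environment variables.
        self._api_key = api_key
        self._base_url = base_url
```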

Contributor

Consensus: just pass the args through to the openai sdk. Don't do argument handling such as this.

@jakelorocco (Contributor) commented:
We should also review @jakelorocco 's notes in #141 before merging.

This PR fixes everything but the model id selection mentioned in #141. I'm ambivalent about whether we actually want to handle that "smartly" or just make users choose the specific name themselves when we pick wrongly. We could develop a heuristic for it based on base_urls, but that seems prone to errors. If others agree, we can close #141 when this PR is merged.

@nrfulton (Contributor) commented Jan 5, 2026

We should also review @jakelorocco 's notes in #141 before merging.

This PR fixes everything but the model id selection mentioned in #141. I'm ambivalent about whether we actually want to handle that "smartly" or just make users choose the specific name themselves when we pick wrongly. We could develop a heuristic for it based on base_urls, but that seems prone to errors. If others agree, we can close #141 when this PR is merged.

Discussed in leads.

In the initializer, these lines of code need to change:
https://github.com/generative-computing/mellea/blob/e7e161bc14f4ba419d5bcf69954686cbe86ff75b/mellea/backends/openai.py#L144C1-L154C1

New logic:

  1. If `model_id` is a `str`, then `self._model_id = model_id` is used.
  2. If `model_id` is a `ModelIdentifier`, then `self._model_id = model_id.openai_name` is used (see the sketch below).
  3. Delete all references to the `self._hf_model_id` field and replace them with `self._model_id`.
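
A minimal sketch of what that branch of the initializer might look like (illustrative only; `ModelIdentifier` and its `openai_name` attribute are taken from the list above, and the import path is an assumption):

```python
from mellea.backends.model_ids import ModelIdentifier  # import path is an assumption

if isinstance(model_id, str):
    self._model_id = model_id
elif isinstance(model_id, ModelIdentifier):
    self._model_id = model_id.openai_name
else:
    raise TypeError(f"Unsupported model_id type: {type(model_id)}")
```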

Also ensure that this is dead code and then delete:

```python
def apply_chat_template(self, chat: list[dict[str, str]]):
    """Apply the chat template for the model, if such a model is available (e.g., when it can deduce the huggingface model id)."""
    from transformers import AutoTokenizer

    if not hasattr(self, "_tokenizer"):
        assert self._base_url, (
            "The OpenAI Platform does not support adapters. You must specify a _base_url when using adapters."
        )
        match _server_type(self._base_url):
            case _ServerType.LOCALHOST:
                self._tokenizer: "PreTrainedTokenizer" = (  # noqa: UP037
                    AutoTokenizer.from_pretrained(self._hf_model_id)
                )
            case _ServerType.OPENAI:
                raise Exception(
                    "apply_chat_template is called while targeting a server at openai.com. "
                    "This is not supported --- openai.com does not support Activated Lora. "
                    "Use a locally served vllm instance. "
                )
    return self._tokenizer.apply_chat_template(chat, tokenize=False)
```

Resolution assigned to @avinash2692

@avinash2692 (Contributor, Author) commented:
We should also review @jakelorocco 's notes in #141 before merging.

This PR fixes everything but the model id selection mentioned in #141. I'm ambivalent about whether we actually want to handle that "smartly" or just make users choose the specific name themselves when we pick wrongly. We could develop a heuristic for it based on base_urls, but that seems prone to errors. If others agree, we can close #141 when this PR is merged.

Discussed in leads.

In the initializer, these lines of code need to change: https://github.com/generative-computing/mellea/blob/e7e161bc14f4ba419d5bcf69954686cbe86ff75b/mellea/backends/openai.py#L144C1-L154C1

New logic:

1. If `model_id` is a `str`, then `self._model_id = model_id` is used.

2. If `model_id` is a `ModelIdentifier`, then `self._model_id = model_id.openai_name`.

3. Delete all references to the `self._hf_model_id` field and replace them with `self._model_id`.


Resolution assigned to @avinash2692

@nrfulton: the code should now have no mentions of hf_model_id. We still need to keep the apply_chat_template method because it seems to be used when running Alora adapters. For now, I've modified the method to work with the current code base, but we could remove it after some review of the adapters code base in the future.

Comment on lines 157 to 159
```python
    )
    self._base_url: str | None = "http://localhost:11434/v1"  # ollama
    self._api_key = "ollama"
```
Contributor

I thought the final verdict here was to not do any fancy handling and just pass args through as None.

Contributor Author

Ah, sorry, missed this.

Comment on lines 1011 to 1025
```diff
 def apply_chat_template(self, chat: list[dict[str, str]]):
     """Apply the chat template for the model, if such a model is available (e.g., when it can deduce the huggingface model id)."""
     from transformers import AutoTokenizer

     if not hasattr(self, "_tokenizer"):
         assert self._base_url, (
             "The OpenAI Platform does not support adapters. You must specify a _base_url when using adapters."
         )
         match _server_type(self._base_url):
             case _ServerType.LOCALHOST:
                 self._tokenizer: "PreTrainedTokenizer" = (  # noqa: UP037
-                    AutoTokenizer.from_pretrained(self._hf_model_id)
+                    AutoTokenizer.from_pretrained(self._model_id)
                 )
             case _ServerType.OPENAI:
                 raise Exception(
```
Contributor

I might be missing something, but I don't see this function being utilized anywhere. I see other functions with the same name, but I don't see an OpenAIBackend.apply_chat_template anywhere.

Contributor Author

I don't think this is explicitly used anywhere. But based on the _ServerType handling set up here, it looks like Fred has touched this part of the code, so I assumed he might be using it somewhere in his code base. I might have to run the adapter tests locally to check.

Contributor Author

Happy to remove it for now and then figure out the repercussions later.

Comment on lines 156 to 172
```python
self._api_key = api_key or os.getenv("OPENAI_API_KEY")
self._base_url = base_url or os.getenv("OPENAI_BASE_URL")

self._server_type = _server_type(self._base_url)
# Validate that we have the required configuration
if self._api_key is None:
    raise ValueError(
        "OPENAI_API_KEY is required but not set. Please either:\n"
        " 1. Set the environment variable: export OPENAI_API_KEY='your-key-here'\n"
        " 2. Pass it as a parameter: OpenAIBackend(api_key='your-key-here')"
    )

if self._base_url is None:
    raise ValueError(
        "OPENAI_BASE_URL is required but not set. Please either:\n"
        " 1. Set the environment variable: export OPENAI_BASE_URL=<your server url>\n"
        " 2. Pass it as a parameter: OpenAIBackend(base_url=<your server url>)"
    )
```
Contributor

I was under the impression that we just wanted to let the openai sdk handle this env var grabbing. I think if the value is None, that's fine and we just pass it to the sdk.

Contributor Author

I'm not sure the openai sdk likes None being passed. Also, it might be better to fail early here rather than have folks end up with a long call stack from the openai sdk.

But I think your comment about not grabbing the env vars makes sense. I'll just check for the existence of the env vars when we run the command.
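
For reference, a sketch of the pure pass-through approach discussed in this thread (not the merged code; the `make_client` helper name is just for illustration). When `api_key` / `base_url` are `None`, the OpenAI SDK itself falls back to the `OPENAI_API_KEY` / `OPENAI_BASE_URL` environment variables and raises an error of its own at construction time if no key can be found:

```python
from openai import OpenAI


def make_client(api_key: str | None = None, base_url: str | None = None) -> OpenAI:
    # Pass the arguments straight through; the SDK resolves None values from
    # the environment and errors out early if it cannot find an API key.
    return OpenAI(api_key=api_key, base_url=base_url)
```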

@avinash2692 avinash2692 merged commit 9733df8 into main Jan 6, 2026
4 checks passed


Development

Successfully merging this pull request may close these issues.

  • OpenAI backend is hard to use
  • OpenAI backend passes invalid reasoning_effort to non-reasoning models
  • investigate defaults for openai backend

4 participants