
Conversation

@avinash2692 (Contributor) commented Dec 24, 2025

@avinash2692 avinash2692 changed the title Fix OpenAI reasoning effort: fix: OpenAI reasoning effort Dec 24, 2025
mergify bot commented Dec 24, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

@nrfulton nrfulton self-requested a review December 26, 2025 20:16
@nrfulton (Contributor) left a comment

LGTM. Requested a comment.

@avinash2692 : after adding the comment please feel free to break glass and merge without a code review.

@avinash2692: FYI, NVM. I'm rolling the fix to #274 into this PR. That fix requires discussion with the rest of the core team to make sure there are no objections or subtle bugs introduced by defaulting to self._base_url = None for the OpenAI backend.

Comment on lines 634 to 638
```python
# Build optional reasoning parameters
reasoning_params = {}
if thinking is not None:
    reasoning_params["reasoning_effort"] = thinking
```

Contributor

Please also add a comment noting that OpenAI doesn't like it when non-reasoning models get reasoning parameters.
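For illustration, a minimal sketch of the guarded parameter construction plus such a note (the `thinking` variable comes from the snippet above; the comment wording here is only a suggestion, not the text that was merged):

```python
# Build optional reasoning parameters.
# Note: OpenAI rejects requests that send reasoning parameters such as
# reasoning_effort to non-reasoning models, so only include them when the
# caller explicitly asked for a reasoning/thinking level.
reasoning_params = {}
if thinking is not None:
    reasoning_params["reasoning_effort"] = thinking
```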

Contributor Author

done!

@nrfulton (Contributor) commented Dec 26, 2025

FYI @jakelorocco and @HendrikStrobelt:

We currently default to ollama when the user instantiates the OpenAI backend without a base_url. This PR will change that default behavior.

Now, if the user types:

import mellea

mellea.start_session(backend_name="openai", api_key="...")

then the OpenAI SDK's default endpoint will be used.

However, if the user types:

import mellea

mellea.start_session(backend_name="openai")

without an API key argument, then the ollama backend will be used and a warning will be printed.

Note that the user must still explicitly provide an api_key kwarg. Having an OpenAI API key in their ENV is not enough. This should prevent accidental expenses / data leaks while still providing less surprising behavior for users who are using the OpenAI backend.
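A minimal sketch of the defaulting behavior described above, using a hypothetical helper name (`_resolve_openai_defaults`) rather than the actual code in this PR:

```python
import warnings


def _resolve_openai_defaults(
    api_key: str | None, base_url: str | None
) -> tuple[str | None, str | None]:
    """Illustrative only: mirror the behavior described in this comment."""
    if api_key is not None:
        # An explicit key was passed: use the given base_url, or let the
        # OpenAI SDK fall back to its default endpoint when base_url is None.
        return api_key, base_url
    # No explicit key: deliberately do NOT read OPENAI_API_KEY from the
    # environment (to avoid accidental expenses / data leaks); fall back to
    # a local ollama instance and warn.
    warnings.warn(
        "No api_key provided; using the openai-compatible local ollama endpoint."
    )
    return "ollama", base_url or "http://localhost:11434/v1"
```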

@nrfulton nrfulton changed the title fix: OpenAI reasoning effort fix: OpenAI base_url default and reasoning effort model option. Dec 26, 2025
This default is changed because the default base_url is also changed.
The OpenAI response_format only accepts a limited set of schemas and
will error out with a 400 if you do not follow their guidelines.

One of these guidelines is that additionalProperties must be present and set
to False.

This commit monkey-patches the response_format provided to OpenAI
platform backends, and leaves other OpenAI-"compatible" backends with
the existing default behavior. This is a debatable choice.

See https://community.openai.com/t/api-rejects-valid-json-schema/906163 and https://platform.openai.com/docs/guides/structured-outputs?api-mode=chat
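As a rough illustration of the kind of patching the commit message describes (the helper name below is hypothetical, and the real implementation in this commit may differ), one could recursively force `additionalProperties: False` on every object schema before sending it as `response_format`:

```python
def _patch_schema_for_openai(schema: dict) -> dict:
    """Recursively set additionalProperties=False on object schemas, since the
    OpenAI structured-outputs endpoint returns a 400 for object schemas that
    omit it. Illustrative sketch only."""
    if schema.get("type") == "object":
        schema["additionalProperties"] = False
    for value in schema.values():
        if isinstance(value, dict):
            _patch_schema_for_openai(value)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    _patch_schema_for_openai(item)
    return schema
```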
@nrfulton (Contributor) commented Jan 3, 2026

We should also review @jakelorocco 's notes in #141 before merging.

@nrfulton nrfulton linked an issue Jan 3, 2026 that may be closed by this pull request
@nrfulton nrfulton self-requested a review January 3, 2026 16:07
Comment on lines 155 to 160
```python
if api_key is None:
    FancyLogger.get_logger().warning(
        "You are using an OpenAI backend with no api_key. Because no API key was provided, mellea assumes you intend to use the openai-compatible interface to your local ollama instance. If you intend to use OpenAI's platform you must specify your API key when instantiating your Mellea session/backend object."
    )
    self._base_url: str | None = "http://localhost:11434/v1"  # ollama
    self._api_key = "ollama"
```
Contributor

I think we are close to defaults that make sense here. I think if the user specifies a base_url we should always use that base_url (even if no apikey is set). I also wonder if we should default the apikey to ollama in those situations.

Otherwise, we have no way to target arbitrary localhost ports that don't require an apikey.

For example (and this isn't the best since it uses LiteLLM and we have a separate backend for that), LiteLLM has a proxy that you can run locally. This proxy stores the apikey information itself; so you can target an arbitrary localhost port without an apikey.

My proposed solution would be to just set the parameter default values to work for the ollama version (i.e., `api_key="ollama"` and `base_url="http://localhost:11434/v1"`). Then users can override these values. I think this would also allow users to explicitly set `api_key` / `base_url` to `None` and have the underlying OpenAI SDK still automatically pick up their env vars (without the risk of users accidentally incurring expenses).
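A short sketch of the parameter defaults being proposed here (illustrative signature only; the real `OpenAIBackend.__init__` takes additional arguments):

```python
class OpenAIBackend:
    def __init__(
        self,
        api_key: str | None = "ollama",
        base_url: str | None = "http://localhost:11434/v1",
    ):
        # Defaults target a local ollama instance; users can override them,
        # or pass None explicitly to let the OpenAI SDK pick up the
        # OPENAI_API_KEY / OPENAI_BASE_URL environment variables.
        self._api_key = api_key
        self._base_url = base_url
```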

Contributor

Consensus: just pass the args through to the openai sdk. Don't do argument handling such as this.

@jakelorocco (Contributor) commented:
We should also review @jakelorocco 's notes in #141 before merging.

This PR fixes everything but the model id selection mentioned in #141. I'm ambivalent about whether we actually want to handle that "smartly" or just make users choose the specific name themselves when we pick wrongly. We could develop a heuristic for it based on base_urls, but that seems prone to errors. If others agree, we can close #141 when this PR is merged.

@nrfulton (Contributor) commented Jan 5, 2026

We should also review @jakelorocco 's notes in #141 before merging.

This PR fixes everything but the model id selection mentioned in #141. I'm ambivalent about whether we actually want to handle that "smartly" or just make users choose the specific name themselves when we pick wrongly. We could develop a heuristic for it based on base_urls, but that seems prone to errors. If others agree, we can close #141 when this PR is merged.

Discussed in leads.

In the initializer, these lines of code need to change:
https://github.com/generative-computing/mellea/blob/e7e161bc14f4ba419d5bcf69954686cbe86ff75b/mellea/backends/openai.py#L144C1-L154C1

New logic:

  1. If `model_id` is a `str`, then `self._model_id = model_id` is used.
  2. If `model_id` is a `ModelIdentifier`, then `self._model_id = model_id.openai_name` is used (see the sketch below).
  3. Delete all references to the `self._hf_model_id` field and replace them with `self._model_id`.
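
A minimal sketch of what that branch of the initializer might look like (illustrative only; `ModelIdentifier` and its `openai_name` attribute are taken from the list above, and the import path is an assumption):

```python
from mellea.backends.model_ids import ModelIdentifier  # import path is an assumption

if isinstance(model_id, str):
    self._model_id = model_id
elif isinstance(model_id, ModelIdentifier):
    self._model_id = model_id.openai_name
else:
    raise TypeError(f"Unsupported model_id type: {type(model_id)}")
```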

Also ensure that this is dead code and then delete:

```python
def apply_chat_template(self, chat: list[dict[str, str]]):
    """Apply the chat template for the model, if such a model is available (e.g., when it can deduce the huggingface model id)."""
    from transformers import AutoTokenizer

    if not hasattr(self, "_tokenizer"):
        assert self._base_url, (
            "The OpenAI Platform does not support adapters. You must specify a _base_url when using adapters."
        )
        match _server_type(self._base_url):
            case _ServerType.LOCALHOST:
                self._tokenizer: "PreTrainedTokenizer" = (  # noqa: UP037
                    AutoTokenizer.from_pretrained(self._hf_model_id)
                )
            case _ServerType.OPENAI:
                raise Exception(
                    "apply_chat_template is called while targeting a server at openai.com. "
                    "This is not supported --- openai.com does not support Activated Lora. "
                    "Use a locally served vllm instance. "
                )
    return self._tokenizer.apply_chat_template(chat, tokenize=False)
```

Resolution assigned to @avinash2692

@avinash2692 (Contributor, Author) commented:
We should also review @jakelorocco 's notes in #141 before merging.

This PR fixes everything but the model id selection mentioned in #141. I'm ambivalent about whether we actually want to handle that "smartly" or just make users choose the specific name themselves when we pick wrongly. We could develop a heuristic for it based on base_urls, but that seems prone to errors. If others agree, we can close #141 when this PR is merged.

Discussed in leads.

In the initializer, these lines of code need to change: https://github.com/generative-computing/mellea/blob/e7e161bc14f4ba419d5bcf69954686cbe86ff75b/mellea/backends/openai.py#L144C1-L154C1

New logic:

1. If `model_id` is a `str`, then `self._model_id = model_id` is used.

2. If `model_id` is a `ModelIdentifier`, then `self._model_id = model_id.openai_name`.

3. Delete all references to the `self._hf_model_id` field and replace them with `self._model_id`.


Resolution assigned to @avinash2692

@nrfulton: the code should now have no mentions of hf_model_id. We still need to keep the apply_chat_template method because it seems to be used when running Alora adapters. For now, I've modified the method to work with the current code base, but we could remove it after some review of the adapters code base in the future.

Comment on lines 157 to 159
```python
    )
    self._base_url: str | None = "http://localhost:11434/v1"  # ollama
    self._api_key = "ollama"
```
Contributor

I thought the final verdict here was to not do any fancy handling and just pass args through as None.

Contributor Author

Ah, sorry, missed this.

Comment on lines 1011 to 1025
```diff
 def apply_chat_template(self, chat: list[dict[str, str]]):
     """Apply the chat template for the model, if such a model is available (e.g., when it can deduce the huggingface model id)."""
     from transformers import AutoTokenizer

     if not hasattr(self, "_tokenizer"):
         assert self._base_url, (
             "The OpenAI Platform does not support adapters. You must specify a _base_url when using adapters."
         )
         match _server_type(self._base_url):
             case _ServerType.LOCALHOST:
                 self._tokenizer: "PreTrainedTokenizer" = (  # noqa: UP037
-                    AutoTokenizer.from_pretrained(self._hf_model_id)
+                    AutoTokenizer.from_pretrained(self._model_id)
                 )
             case _ServerType.OPENAI:
                 raise Exception(
```
Contributor

I might be missing something, but I don't see this function being utilized anywhere. I see other functions with the same name, but I don't see an OpenAIBackend.apply_chat_template anywhere.

Contributor Author

I don't think this is explicitly used anywhere. But based on the _ServerType handling set up here, it looks like Fred has touched this part of the code, so I assumed he might be using it somewhere in his code base. I might have to run the adapter tests locally to check.

Contributor Author

Happy to remove it for now and then figure out the repercussions later.

Comment on lines 156 to 172
```python
self._api_key = api_key or os.getenv("OPENAI_API_KEY")
self._base_url = base_url or os.getenv("OPENAI_BASE_URL")

self._server_type = _server_type(self._base_url)
# Validate that we have the required configuration
if self._api_key is None:
    raise ValueError(
        "OPENAI_API_KEY is required but not set. Please either:\n"
        " 1. Set the environment variable: export OPENAI_API_KEY='your-key-here'\n"
        " 2. Pass it as a parameter: OpenAIBackend(api_key='your-key-here')"
    )

if self._base_url is None:
    raise ValueError(
        "OPENAI_BASE_URL is required but not set. Please either:\n"
        " 1. Set the environment variable: export OPENAI_BASE_URL=<your server url>\n"
        " 2. Pass it as a parameter: OpenAIBackend(base_url=<your server url>)"
    )
```
Contributor

I was under the impression that we just wanted to let the openai sdk handle this env var grabbing. I think if the value is None, that's fine and we just pass it to the sdk.

Contributor Author

I'm not sure the openai sdk likes None being passed. Also, it might be better to fail early here rather than have folks end up with a long call stack from the openai sdk.

But I think your comment about not grabbing the env vars makes sense. I'll just check for the existence of the env vars when we run the command.
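
For reference, a sketch of the pure pass-through approach discussed in this thread (not the merged code; the `make_client` helper name is just for illustration). When `api_key` / `base_url` are `None`, the OpenAI SDK itself falls back to the `OPENAI_API_KEY` / `OPENAI_BASE_URL` environment variables and raises an error of its own at construction time if no key can be found:

```python
from openai import OpenAI


def make_client(api_key: str | None = None, base_url: str | None = None) -> OpenAI:
    # Pass the arguments straight through; the SDK resolves None values from
    # the environment and errors out early if it cannot find an API key.
    return OpenAI(api_key=api_key, base_url=base_url)
```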

@avinash2692 avinash2692 merged commit 9733df8 into main Jan 6, 2026
4 checks passed


Development

Successfully merging this pull request may close these issues.

  • OpenAI backend is hard to use
  • OpenAI backend passes invalid reasoning_effort to non-reasoning models
  • investigate defaults for openai backend

4 participants