Is it a request payload issue?
[x] Yes, this is a request payload issue. I am using a client/cURL to send a request payload, but I received an unexpected error.
[ ] No, it's another issue.
If it's a request payload issue, you MUST know
Our team doesn't have any gods, oracles, or mind readers. Please make sure to attach the request log or curl payload.
Describe the bug
When using GitHub Models as an OpenAI-compatible provider (Base URL: https://models.inference.ai.azure.com), any of the newer models triggers the following 400 error:
400 Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.
The frontend/client is still sending the legacy max_tokens parameter in the request payload, but GitHub Models (Azure backend) has deprecated it for the o1 series, GPT-5, and all newer models; those deployments now strictly require max_completion_tokens.
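For reference, this is the shape of the payload the client sends (illustrative only: the model name and prompt below are placeholders, not the actual request from my logs):

```python
import json

# Standard OpenAI-style chat-completions body. The "max_tokens" field is
# the legacy parameter that newer GitHub Models deployments reject with a 400.
payload = {
    "model": "gpt-5-mini",  # any of the models listed under "Model Name"
    "messages": [{"role": "user", "content": "ping"}],
    "max_tokens": 256,
}
print(json.dumps(payload))
```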
CLI Type
openai-compatibility
Model Name
o4-mini, o3, gpt-5, gpt-5-mini, gpt-5-nano
LLM Client
Management WebUI connection test.
Request Information
.
Expected behavior
The OpenAI-compatible layer should automatically convert max_tokens → max_completion_tokens for GitHub Models / Azure endpoints (as LiteLLM does), or allow a per-model parameter override in the provider settings.
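A minimal sketch of the expected mapping, assuming a prefix match on the model name is enough to identify affected models (the prefix list here is an assumption based on the models I tested, not an exhaustive set):

```python
# Models that reject the legacy parameter (assumed prefix match; extend as needed).
AFFECTED_PREFIXES = ("o1", "o3", "o4", "gpt-5")

def normalize_token_param(payload: dict) -> dict:
    """Rename legacy 'max_tokens' to 'max_completion_tokens' for models
    that require it, similar to what LiteLLM's parameter mapping does."""
    model = payload.get("model", "")
    if "max_tokens" in payload and model.startswith(AFFECTED_PREFIXES):
        payload = dict(payload)  # copy so the caller's dict is untouched
        payload["max_completion_tokens"] = payload.pop("max_tokens")
    return payload

print(normalize_token_param({"model": "gpt-5", "max_tokens": 128}))
# → {'model': 'gpt-5', 'max_completion_tokens': 128}
```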
OS Type
OS: render
Version: v6.8.51
Additional context
This issue started appearing after GitHub Models updated their newer model deployments. Many OpenAI-compatible frontends are affected. A temporary workaround is to use LiteLLM as a proxy layer, which auto-maps the parameter.
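For anyone hitting the same problem, a LiteLLM proxy config along these lines works around it (a sketch only: the model name and API key variable are illustrative, and you should check the current LiteLLM docs for the exact provider settings):

```yaml
# litellm config.yaml -- route an OpenAI-compatible client through LiteLLM,
# which maps max_tokens -> max_completion_tokens for models that need it.
model_list:
  - model_name: gpt-5
    litellm_params:
      model: openai/gpt-5
      api_base: https://models.inference.ai.azure.com
      api_key: os.environ/GITHUB_TOKEN   # illustrative env var name
```

Then point the frontend's Base URL at the LiteLLM proxy instead of the GitHub Models endpoint.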