
Enhancement: Discover and apply per-model context window instead of using a global LEMONADE_CTX_SIZE #12

Description

@fl0rianr

Summary

The VS Code extension currently exposes the same context window to VS Code Copilot for all Lemonade models, based on the global LEMONADE_CTX_SIZE environment variable or a fixed fallback.

This works as a basic workaround, but it is not ideal for a multi-model or hybrid-model setup, where different Lemonade models can have very different usable context sizes. The extension should discover the context window per model and report it accurately to VS Code.

Motivation

Lemonade can expose multiple models through the VS Code Copilot Language Model Provider integration, and this should be easy to use out of the box for a good first-time experience.
In practice, a context size other than the default of 128000 is often configured in Lemonade, and it can differ from model to model.
When the size reported to VS Code does not match the real one, a chat starts without a problem but errors out at some point.

Proposed behavior

The context size of each model should be resolved in the following order of priority and reported to VS Code through the Language Model Chat Provider metadata:

  1. Explicit VS Code user setting override, for example:

    {
      "lemonade.modelContextSizes": {
        "Qwen3-Coder-Next-GGUF": 184320,
        "Gemma-4-26B-A4B-it-GGUF": 262144
      }
    }

  2. Model-specific metadata from Lemonade, such as the model's context size

  3. Global LEMONADE_CTX_SIZE as backward-compatible fallback

  4. Existing hardcoded default as final fallback
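
The resolution order above could be sketched roughly as follows. This is only an illustration under assumptions: `resolveContextSize`, `ResolveInput`, and all field names are hypothetical and not taken from the extension's code, and how Lemonade actually exposes per-model metadata still needs to be checked.

```typescript
// Hypothetical sketch of the proposed fallback chain; none of these names
// come from the extension's source.
type ContextSource = "userSetting" | "modelMetadata" | "env" | "default";

interface ResolveInput {
  modelId: string;
  userOverrides: Record<string, number>; // e.g. contents of "lemonade.modelContextSizes"
  modelMetadataCtx?: number;             // per-model value reported by Lemonade, if any (assumed)
  envCtxSize?: string;                   // raw LEMONADE_CTX_SIZE value, if set
  defaultCtx: number;                    // existing hardcoded default
}

// Accept only positive integers; anything else counts as "not set".
function toPositiveInt(value: unknown): number | undefined {
  const n = typeof value === "string" ? Number(value) : value;
  return typeof n === "number" && Number.isInteger(n) && n > 0 ? n : undefined;
}

function resolveContextSize(input: ResolveInput): { size: number; source: ContextSource } {
  // Candidates in priority order: steps 1 to 3 of the list above.
  const candidates: Array<[ContextSource, unknown]> = [
    ["userSetting", input.userOverrides[input.modelId]],
    ["modelMetadata", input.modelMetadataCtx],
    ["env", input.envCtxSize],
  ];
  for (const [source, raw] of candidates) {
    const size = toPositiveInt(raw);
    if (size !== undefined) {
      return { size, source };
    }
  }
  // Step 4: nothing usable was found, keep the hardcoded default.
  return { size: input.defaultCtx, source: "default" };
}
```

Returning the source next to the size is a design choice of this sketch: it would make it easy to log which of the four steps supplied the value for a given model.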

Additionally, further information such as maxOutputTokens and maxInputTokens should be shared with VS Code as well.
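
One possible way to derive both values from a single resolved context size is sketched below. It assumes that the context window covers input and output together, and the 25% share and the 16384-token cap are arbitrary placeholders, not values taken from Lemonade or VS Code. `splitContextWindow` and `TokenLimits` are hypothetical names.

```typescript
// Hypothetical helper: split one context window into input and output budgets.
interface TokenLimits {
  maxInputTokens: number;
  maxOutputTokens: number;
}

function splitContextWindow(
  contextSize: number,
  outputShare = 0.25, // assumed fraction of the window reserved for output
  outputCap = 16384,  // assumed upper bound for the output budget
): TokenLimits {
  if (!Number.isInteger(contextSize) || contextSize < 2) {
    throw new RangeError("contextSize must be an integer >= 2");
  }
  // Reserve a share of the window for output, capped and at least one token.
  const reserved = Math.floor(contextSize * outputShare);
  const maxOutputTokens = Math.max(1, Math.min(outputCap, reserved));
  // Whatever is left is available for the prompt and chat history.
  return { maxInputTokens: contextSize - maxOutputTokens, maxOutputTokens };
}
```

With these placeholder defaults, a context size of 184320 would give 16384 output tokens and 167936 input tokens. If Lemonade reports explicit per-model limits, those should be preferred over a split like this.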

I'm happy to expand on these ideas or help with implementing them.
