14 changes: 11 additions & 3 deletions 1_developer/0_core/0_server/settings.md
@@ -8,7 +8,7 @@ index: 2

You can configure server settings, such as the port number, whether to allow other API clients to access the server, and MCP features.

<img src="/assets/docs/server-config.png" style="" data-caption="Configure LM Studio API Server settings" />
<img src="/assets/docs/server-settings.png" style="" data-caption="Configure LM Studio API Server settings" />


### Settings information
@@ -19,13 +19,21 @@ You can configure server settings, such as the port number, whether to allow oth
optional: false
description: Port number on which the LM Studio API server listens for incoming connections.
unstyledName: true
- name: Require Authentication
type: Switch
description: Require API clients to provide a valid API token via the `Authorization` header. Learn more in the [Authentication](/docs/developer/core/authentication) section.
unstyledName: true
- name: Serve on Local Network
type: Switch
description: Allow other devices on the same local network to access the API server. Learn more in the [Serve on Local Network](/docs/developer/core/server/serve-on-network) section.
unstyledName: true
- name: Allow Per Request Remote MCPs
- name: Allow per-request MCPs
type: Switch
description: Allow API clients to use MCP (Model Context Protocol) servers that are not in your mcp.json. These MCP connections are ephemeral, existing only for the duration of the request. At the moment, only remote MCPs are supported.
unstyledName: true
- name: Allow calling servers from mcp.json
type: Switch
description: Enable sending requests to remote MCP (Model Control Protocol) servers on a per-request basis.
description: Allow API clients to use servers you defined in your mcp.json in LM Studio. This can be a security risk if you've defined MCP servers that have access to your file system or private data. This option requires "Require Authentication" to be enabled.
unstyledName: true
- name: Enable CORS
type: Switch
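As a sketch of what the "Require Authentication" switch implies for clients: every request must carry a bearer token in the `Authorization` header. The helper below is illustrative, not part of LM Studio; the token value and model name are placeholders.

```python
import json

# Illustrative helper: assemble an authenticated chat request for the local
# LM Studio server. Token and model name below are placeholders.
def build_chat_request(token: str, model: str, user_message: str):
    headers = {
        "Authorization": f"Bearer {token}",  # required when authentication is on
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })
    return headers, body

headers, body = build_chat_request("sk-example", "granite-3.0-2b-instruct", "Hello")
print(headers["Authorization"])  # Bearer sk-example
```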
11 changes: 5 additions & 6 deletions 1_developer/0_core/headless.md
@@ -23,26 +23,25 @@ To enable this, head to app settings (`Cmd` / `Ctrl` + `,`) and check the box to

When this setting is enabled, exiting the app will minimize it to the system tray, and the LLM server will continue to run in the background.

## Just-In-Time (JIT) model loading for OpenAI endpoints
## Just-In-Time (JIT) model loading for REST endpoints

This is useful when using LM Studio as an LLM service with other frontends or applications.

<img src="/assets/docs/jit-loading.png" style="" data-caption="Load models on demand" />

#### When JIT loading is ON:

- Call to `/v1/models` will return all downloaded models, not only the ones loaded into memory
- Calls to OpenAI-compatible `/v1/models` will return all downloaded models, not only the ones loaded into memory
- Calls to inference endpoints will load the model into memory if it's not already loaded

#### When JIT loading is OFF:

- Call to `/v1/models` will return only the models loaded into memory
- Calls to OpenAI-compatible `/v1/models` will return only the models loaded into memory
- You must first load the model into memory before you can use it

##### What about auto unloading?
#### What about auto unloading?

As of LM Studio 0.3.5, auto unloading is not yet in place. Models that are loaded via JIT loading will remain in memory until you unload them.
We expect to implement more sophisticated memory management in the near future. Let us know if you have any feedback or suggestions.
JIT loaded models will be auto-unloaded from memory by default after a set period of inactivity ([learn more](/docs/developer/core/ttl-and-auto-evict)).
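The JIT listing behavior above can be sketched as a simple toggle. The model names and the flat list shape below are illustrative only, not a real `/v1/models` response:

```python
# Illustrative only: with JIT loading ON, /v1/models reports every downloaded
# model; with it OFF, only the models currently loaded into memory.
downloaded = ["qwen2-vl-7b-instruct", "granite-3.0-2b-instruct"]
loaded_in_memory = ["granite-3.0-2b-instruct"]

def listed_models(jit_enabled: bool):
    return downloaded if jit_enabled else loaded_in_memory

print(listed_models(True))   # all downloaded models
print(listed_models(False))  # only the loaded one
```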

## Auto Server Start

6 changes: 4 additions & 2 deletions 1_developer/0_core/_mcp.md → 1_developer/0_core/mcp.md
@@ -2,10 +2,12 @@
title: Using MCP via API
sidebar_title: Using MCP via API
description: Learn how to use Model Context Protocol (MCP) servers with the LM Studio API.
index: 4
index: 3
---

LM Studio supports Model Control Protocol (MCP) usage via API starting from version 0.4.0. MCP allows models to interact with external tools and services through standardized servers.
##### Requires [LM Studio 0.4.0](/download) or newer.

LM Studio supports Model Context Protocol (MCP) usage via API. MCP allows models to interact with external tools and services through standardized servers.

## How it works

2 changes: 1 addition & 1 deletion 1_developer/0_core/ttl-and-auto-evict.md
@@ -1,7 +1,7 @@
---
title: Idle TTL and Auto-Evict
description: Optionally auto-unload idle models after a certain amount of time (TTL)
index: 1
index: 4
---

## Background
File renamed without changes.
File renamed without changes.
13 changes: 10 additions & 3 deletions 1_developer/2_rest/endpoints.md
@@ -1,8 +1,12 @@
---
title: REST API v0
title: REST API v0 (deprecated)
description: "The REST API includes enhanced stats such as Token / Second and Time To First Token (TTFT), as well as rich information about models such as loaded vs unloaded, max context, quantization, and more."
---

```lms_warning
LM Studio now has a [v1 REST API](/docs/developer/rest)! Please migrate to the new API.
```

##### Requires [LM Studio 0.3.6](/download) or newer.

LM Studio now has its own REST API, in addition to OpenAI compatibility mode ([learn more](/docs/developer/openai-compat)).
@@ -40,7 +44,7 @@ List all loaded and downloaded models
**Example request**

```bash
curl http://localhost:1234/api/v0/models
curl -H "Authorization: Bearer $LM_API_TOKEN" http://localhost:1234/api/v0/models
```
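Since this endpoint reports loaded vs. unloaded state, a client can filter on it. The payload below is a hand-written illustration, not a captured server response; the field names (`data`, `id`, `state`) are assumptions to check against the documented response format.

```python
# Illustrative sample of a models listing; not a real server response.
sample = {
    "data": [
        {"id": "qwen2-vl-7b-instruct", "state": "not-loaded"},
        {"id": "granite-3.0-2b-instruct", "state": "loaded"},
    ]
}

# Keep only the models currently loaded into memory.
loaded_ids = [m["id"] for m in sample["data"] if m["state"] == "loaded"]
print(loaded_ids)  # ['granite-3.0-2b-instruct']
```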

**Response format**
@@ -95,7 +99,7 @@ Get info about one specific model
**Example request**

```bash
curl http://localhost:1234/api/v0/models/qwen2-vl-7b-instruct
curl -H "Authorization: Bearer $LM_API_TOKEN" http://localhost:1234/api/v0/models/qwen2-vl-7b-instruct
```

**Response format**
@@ -124,6 +128,7 @@ Chat Completions API. You provide a messages array and receive the next assistan

```bash
curl http://localhost:1234/api/v0/chat/completions \
-H "Authorization: Bearer $LM_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "granite-3.0-2b-instruct",
@@ -191,6 +196,7 @@ Text Completions API. You provide a prompt and receive a completion.

```bash
curl http://localhost:1234/api/v0/completions \
-H "Authorization: Bearer $LM_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "granite-3.0-2b-instruct",
@@ -253,6 +259,7 @@ Text Embeddings API. You provide a text and a representation of the text as an e

```bash
curl http://localhost:1234/api/v0/embeddings \
-H "Authorization: Bearer $LM_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "text-embedding-nomic-embed-text-v1.5",
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
13 changes: 13 additions & 0 deletions 1_developer/api-changelog.md
@@ -6,6 +6,19 @@ index: 2

---

###### LM Studio 0.4.0

### LM Studio native v1 REST API

- Official release of LM Studio's native v1 REST API at `/api/v1/*` endpoints.
- [MCP via API](/docs/developer/core/mcp)
- [Stateful chats](/docs/developer/rest/stateful-chats)
- [Authentication](/docs/developer/core/authentication) configuration with API tokens
- Model [download](/docs/developer/rest/download) and [load](/docs/developer/rest/load) endpoints
- See the [overview](/docs/developer/rest) page for more details and a [comparison](/docs/developer/rest#inference-endpoint-comparison) with OpenAI-compatible endpoints.

---

###### LM Studio 0.3.29 • 2025‑10‑06

### OpenAI `/v1/responses` and variant listing
22 changes: 12 additions & 10 deletions 1_developer/index.md
@@ -10,6 +10,7 @@ index: 1

- TypeScript SDK: [lmstudio-js](/docs/typescript)
- Python SDK: [lmstudio-python](/docs/python)
- LM Studio REST API: [Stateful Chats, MCPs via API](/docs/developer/rest)
- OpenAI‑compatible: [Chat, Responses, Embeddings](/docs/developer/openai-compat)
- LM Studio CLI: [`lms`](/docs/cli)

@@ -18,10 +19,10 @@ index: 1
## What you can build

- Chat and text generation with streaming
- Tool calling and local agents with MCP
- Structured output (JSON schema)
- Tool calling and local agents
- Embeddings and tokenization
- Model management (JIT load, TTL, auto‑evict)
- Model management (load, download, list)
```

## Super quick start
@@ -61,26 +62,27 @@ with lms.Client() as client:

Full docs: [lmstudio-python](/docs/python), Source: [GitHub](https://github.com/lmstudio-ai/lmstudio-python)

### Try a minimal HTTP request (OpenAI‑compatible)
### HTTP (LM Studio REST API)

```bash
lms server start --port 1234
```

```bash
curl http://localhost:1234/v1/chat/completions \
curl http://localhost:1234/api/v1/chat \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $LM_API_TOKEN" \
-d '{
"model": "openai/gpt-oss-20b",
"messages": [{"role": "user", "content": "Who are you, and what can you do?"}]
"input": "Who are you, and what can you do?"
}'
```
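The same request can be issued from Python with the standard library. This is a sketch mirroring the curl example above: the default port and the `input` field come from that example, while the token is a placeholder. The actual network call is commented out so the snippet stands alone without a running server:

```python
import json
import urllib.request

url = "http://localhost:1234/api/v1/chat"
payload = {
    "model": "openai/gpt-oss-20b",
    "input": "Who are you, and what can you do?",
}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer sk-example",  # placeholder token
    },
)
# With a running server, uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
print(json.loads(req.data)["model"])  # openai/gpt-oss-20b
```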

Full docs: [OpenAI‑compatible endpoints](/docs/developer/openai-compat)
Full docs: [LM Studio REST API](/docs/developer/rest)

## Helpful links

- API Changelog: [/docs/developer/api-changelog](/docs/developer/api-changelog)
- Local server basics: [/docs/developer/core](/docs/developer/core)
- CLI reference: [/docs/cli](/docs/cli)
- Community: [Discord](https://discord.gg/lmstudio)
- [API Changelog](/docs/developer/api-changelog)
- [Local server basics](/docs/developer/core)
- [CLI reference](/docs/cli)
- [Discord Community](https://discord.gg/lmstudio)
File renamed without changes.