Merged
73 changes: 73 additions & 0 deletions docs/api/lemonade.md
@@ -16,6 +16,7 @@ We have designed a set of Lemonade-specific endpoints to enable client applicati
| `POST` | [`/v1/delete`](#post-v1delete) | Delete a model |
| `POST` | [`/v1/load`](#post-v1load) | Load a model |
| `POST` | [`/v1/unload`](#post-v1unload) | Unload a model |
| `GET` | [`/v1/models/{id}/files`](#get-v1modelsidfiles) | List resolved local file metadata for one model |
| `GET` | [`/v1/health`](#get-v1health) | Check server status, such as models loaded |
| `GET` | [`/v1/stats`](#get-v1stats) | Performance statistics from the last request |
| `GET` | [`/v1/system-stats`](#get-v1system-stats) | Current host resource usage |
@@ -29,6 +30,78 @@ We have designed a set of Lemonade-specific endpoints to enable client applicati
| `GET` | [`/metrics`](#get-metrics) | Prometheus metrics scrape endpoint |
| `POST` | [`/internal/telemetry/flush`](#post-internaltelemetryflush) | Force-flush all queued telemetry trace spans |

## `GET /v1/models/{id}/files`
<sub>![Status](https://img.shields.io/badge/status-fully_available-green)</sub>

List resolved local file metadata for a single model. This endpoint is intended for model-detail UIs such as the Files tab. It reports a per-model file inventory, not system-wide or per-drive storage accounting.

The endpoint is available at:

- `/v1/models/{id}/files`
- `/api/v1/models/{id}/files`
- `/v0/models/{id}/files`
- `/api/v0/models/{id}/files`

By default, the response does not include absolute filesystem paths. Trusted local clients that need paths for native UI actions can request them explicitly with `?include_paths=true`. Absolute paths may reveal local usernames and cache layout, so clients should only request them when that disclosure is acceptable.

### Example request

```bash
curl http://localhost:13305/v1/models/Qwen3-4B/files
```

### Response format

```json
{
  "model_id": "Qwen3-4B",
  "files": [
    {
      "name": "model.gguf",
      "role": "main",
      "size_bytes": 123456789,
      "exists": true
    },
    {
      "name": "mmproj.gguf",
      "role": "mmproj",
      "size_bytes": 12345678,
      "exists": true
    }
  ]
}
```

### Optional path disclosure

```bash
curl 'http://localhost:13305/v1/models/Qwen3-4B/files?include_paths=true'
```

When `include_paths=true` is supplied, each file entry also includes `path`. The value must be exactly `true`; any other value leaves paths out of the response:

```json
{
  "name": "model.gguf",
  "path": "/abs/path/model.gguf",
  "role": "main",
  "size_bytes": 123456789,
  "exists": true
}
```

### Fields

| Field | Description |
|-------|-------------|
| `model_id` | Public model ID for the requested model. |
| `files` | Array of resolved model files known to the registry. |
| `files[].name` | Base filename from the resolved path. |
| `files[].path` | Absolute resolved path on the local system. Only included when `include_paths=true`; privacy-sensitive. |
| `files[].role` | Checkpoint role, for example `main`, `mmproj`, or another recipe-specific role. |
| `files[].size_bytes` | File size in bytes. Directories are summed recursively. Missing files report `0`. |
| `files[].exists` | Whether the resolved path currently exists on disk. |
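
A client that renders the Files tab will usually want a total size and a list of missing entries. The following is a minimal sketch of that aggregation, assuming the client has already parsed the response into a plain struct. `FileEntry`, `InventorySummary`, and `summarize` are hypothetical names used for illustration only; they are not part of Lemonade.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical client-side mirror of one entry in the "files" array.
struct FileEntry {
    std::string name;
    std::string role;
    std::uint64_t size_bytes = 0;
    bool exists = false;
};

// Hypothetical aggregate shown by a Files tab.
struct InventorySummary {
    std::uint64_t total_bytes = 0;           // sum of size_bytes over existing entries
    std::vector<std::string> missing_roles;  // roles whose resolved path is absent
};

// Sum the sizes of existing files and collect the roles of missing ones.
InventorySummary summarize(const std::vector<FileEntry>& files) {
    InventorySummary summary;
    for (const auto& file : files) {
        if (file.exists) {
            summary.total_bytes += file.size_bytes;
        } else {
            summary.missing_roles.push_back(file.role);
        }
    }
    return summary;
}
```

Because missing files report `size_bytes` of `0`, the total would be the same without the `exists` check; the check is there so missing roles can be surfaced separately.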

## `POST /v1/pull`
<sub>![Status](https://img.shields.io/badge/status-fully_available-green)</sub>

12 changes: 12 additions & 0 deletions src/cpp/include/lemon/model_manager.h
@@ -1,6 +1,7 @@
#pragma once

#include <stdexcept>
#include <cstdint>
#include <string>
#include <map>
#include <optional>
@@ -126,6 +127,14 @@ struct ModelInfo {
std::string mmproj() const { return checkpoint("mmproj"); }
};

struct ModelFileInfo {
    std::string name;
    std::string path;
    std::string role;
    std::uint64_t size_bytes = 0;
    bool exists = false;
};

class CloudProviderRegistry;

class ModelManager {
@@ -190,6 +199,9 @@ class ModelManager {
// Get model info by name
ModelInfo get_model_info(const std::string& model_name);

// Get per-model file inventory for the Files tab.
std::vector<ModelFileInfo> list_model_files(const std::string& model_name);

// Resolve a public model reference to its canonical internal name.
std::string resolve_model_name(const std::string& model_name);

1 change: 1 addition & 0 deletions src/cpp/include/lemon/server.h
@@ -92,6 +92,7 @@ class Server {
void handle_live(const httplib::Request& req, httplib::Response& res);
void handle_models(const httplib::Request& req, httplib::Response& res);
void handle_model_by_id(const httplib::Request& req, httplib::Response& res);
void handle_model_files(const httplib::Request& req, httplib::Response& res);
void handle_chat_completions(const httplib::Request& req, httplib::Response& res);
// Server-side tool-calling orchestration for Omni "collection" models.
void handle_collection_chat_completions(const nlohmann::json& request_json,
28 changes: 28 additions & 0 deletions src/cpp/server/model_manager.cpp
@@ -428,6 +428,34 @@ static uintmax_t resolved_path_size_bytes(const fs::path& path) {
return total;
}


std::vector<ModelFileInfo> ModelManager::list_model_files(const std::string& model_name) {
    ModelInfo info = get_model_info(model_name);
    std::vector<ModelFileInfo> files;
    files.reserve(info.resolved_paths.size());

    for (const auto& [role, resolved_path] : info.resolved_paths) {
        if (resolved_path.empty()) {
            continue;
        }

        fs::path path = path_from_utf8(resolved_path);
        const bool path_exists = safe_exists(path);

        ModelFileInfo file;
        file.name = path_to_utf8(path.filename());
        file.path = resolved_path;
        file.role = role;
        file.exists = path_exists;
        file.size_bytes = path_exists
            ? static_cast<std::uint64_t>(resolved_path_size_bytes(path))
            : 0;
        files.push_back(std::move(file));
    }

    return files;
}

static void cleanup_orphaned_blobs_under(const fs::path& path,
const fs::path& models_dir) {
if (!safe_exists(path)) {
59 changes: 59 additions & 0 deletions src/cpp/server/server.cpp
@@ -614,6 +614,21 @@ void Server::setup_routes(httplib::Server &web_server) {
handle_models(req, res);
});

    // Model files endpoint for the Files tab. Register before the generic
    // /models/(.+) route so '<model-id>/files' is not parsed as the model ID.
    web_server.Get(R"(/api/v0/models/(.+)/files)", [this](const httplib::Request& req, httplib::Response& res) {
        handle_model_files(req, res);
    });
    web_server.Get(R"(/api/v1/models/(.+)/files)", [this](const httplib::Request& req, httplib::Response& res) {
        handle_model_files(req, res);
    });
    web_server.Get(R"(/v0/models/(.+)/files)", [this](const httplib::Request& req, httplib::Response& res) {
        handle_model_files(req, res);
    });
    web_server.Get(R"(/v1/models/(.+)/files)", [this](const httplib::Request& req, httplib::Response& res) {
        handle_model_files(req, res);
    });
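
The ordering concern in the comment above can be reproduced with `std::regex` alone. This is a sketch that assumes the server tries patterns in registration order and requires a full match of the request path; `RouteMatch` and `first_match` are hypothetical helpers written for this illustration, not cpp-httplib APIs.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Result of simulated dispatch: which pattern won and what group 1 captured.
struct RouteMatch {
    int route_index = -1;
    std::string captured_id;
};

// Try each pattern in order and return the first one that matches the whole
// path, mimicking first-registered-wins routing (an assumption of this sketch).
RouteMatch first_match(const std::vector<std::string>& patterns,
                       const std::string& path) {
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        std::smatch m;
        if (std::regex_match(path, m, std::regex(patterns[i]))) {
            return {static_cast<int>(i), m[1].str()};
        }
    }
    return {};
}
```

With the patterns `/v1/models/(.+)/files` and `/v1/models/(.+)` and the path `/v1/models/Qwen3-4B/files`, registering the files route first captures `Qwen3-4B`; registering the generic route first captures `Qwen3-4B/files` as the model ID. Since `(.+)` also matches slashes, an ID such as `org/model` is captured whole, and under the same assumption a model whose ID itself ends in `/files` would be ambiguous.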

// Model by ID (need to register for both versions with regex, with and without /api prefix)
web_server.Get(R"(/api/v0/models/(.+))", [this](const httplib::Request& req, httplib::Response& res) {
handle_model_by_id(req, res);
@@ -2084,6 +2099,50 @@ void Server::handle_model_by_id(const httplib::Request& req, httplib::Response&
}
}

void Server::handle_model_files(const httplib::Request& req, httplib::Response& res) {
    std::string model_id = req.matches[1];
    const bool include_paths = req.has_param("include_paths") &&
                               req.get_param_value("include_paths") == "true";

    try {
        if (!model_manager_->model_exists(model_id)) {
            res.status = 404;
            auto error_response = create_model_error(model_id, "Model not found");
            res.set_content(error_response.dump(), "application/json");
            return;
        }

        std::string canonical_cache_key = model_manager_->resolve_model_name(model_id);
        std::string wire_id = model_manager_->get_public_model_name(canonical_cache_key);
        auto files = model_manager_->list_model_files(model_id);

        nlohmann::json response;
        response["model_id"] = wire_id;
        response["files"] = nlohmann::json::array();

        for (const auto& file : files) {
            nlohmann::json file_json = {
                {"name", file.name},
                {"role", file.role},
                {"size_bytes", file.size_bytes},
                {"exists", file.exists}
            };

            if (include_paths) {
                file_json["path"] = file.path;
            }

            response["files"].push_back(std::move(file_json));
        }

        res.set_content(response.dump(), "application/json");
    } catch (const std::exception&) {
        res.status = 404;
        auto error_response = create_model_error(model_id, "Model not found");
        res.set_content(error_response.dump(), "application/json");
    }
}
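
The handler enables path disclosure only when the query value is the exact string `true`. A minimal sketch of that default-deny check, isolated from the HTTP layer, is shown here; `wants_paths` and the `std::map` parameter are hypothetical stand-ins for `req.has_param` and `req.get_param_value`.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the handler's include_paths check.
// Paths are disclosed only for the exact, case-sensitive value "true".
bool wants_paths(const std::map<std::string, std::string>& params) {
    const auto it = params.find("include_paths");
    return it != params.end() && it->second == "true";
}
```

Under this check, values such as `True`, `1`, or an empty string leave `path` out of the response, which keeps the privacy-sensitive field opt-in.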

void Server::handle_collection_chat_completions(const nlohmann::json& request_json,
const ModelInfo& collection_info,
httplib::Response& res) {