Ryzen AI SD Server

A lightweight Stable Diffusion inference server for AMD Ryzen AI NPUs using ONNX Runtime, exposing an OpenAI-compatible REST API.

Overview

This server loads a Stable Diffusion model from an ONNX model directory, auto-detects its variant, and runs the full txt2img / img2img / ControlNet pipeline on AMD Ryzen AI NPU hardware.

Key Features:

OpenAI API Compatible: /v1/images/generations, /v1/images/edits, /v1/images/variations
Multi-Architecture: SD 1.5, SDXL, SD3, SD3.5 with auto-detection
ControlNet Support: Canny, Pose, Tile, Depth, and more
CLI & Server Modes: One-shot generation or persistent HTTP server
Minimal Dependencies: Single executable + DLLs
Self-Registering Variants: Add new model architectures without changing core pipeline

Supported Models

| Model | Variant | Default Resolution | Steps | Guidance |
|---|---|---|---|---|
| SD 1.5 | `sd15` | 512×512 | 20 | 7.5 |
| SD Turbo | `sd15` (turbo) | 512×512 | 1 | 0.0 |
| SDXL Base 1.0 | `sdxl` | 1024×1024 | 20 | 7.5 |
| SDXL Turbo | `sdxl` (turbo) | 512×512 | 1 | 0.0 |
| SD3 Medium | `sd3` | 1024×1024 | 28 | 7.0 |
| SD3.5 Medium | `sd35` | 1024×1024 | 28 | 4.5 |

The variant is auto-detected from the model directory structure.
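
A client that wants to fill in steps and guidance itself can keep the table's defaults in a small lookup. The following is a minimal Python sketch of that idea; `ModelDefaults`, `DEFAULTS` and `defaults_for` are hypothetical names that do not exist in the server, and the `turbo` flag is only an illustrative way to tell the turbo rows apart.

```python
from typing import NamedTuple


class ModelDefaults(NamedTuple):
    width: int
    height: int
    steps: int
    guidance: float


# Values copied from the Supported Models table; keys are (variant, is_turbo).
DEFAULTS = {
    ("sd15", False): ModelDefaults(512, 512, 20, 7.5),
    ("sd15", True): ModelDefaults(512, 512, 1, 0.0),
    ("sdxl", False): ModelDefaults(1024, 1024, 20, 7.5),
    ("sdxl", True): ModelDefaults(512, 512, 1, 0.0),
    ("sd3", False): ModelDefaults(1024, 1024, 28, 7.0),
    ("sd35", False): ModelDefaults(1024, 1024, 28, 4.5),
}


def defaults_for(variant: str, turbo: bool = False) -> ModelDefaults:
    """Look up the table defaults for a variant (hypothetical client-side helper)."""
    try:
        return DEFAULTS[(variant, turbo)]
    except KeyError:
        raise ValueError(f"no defaults for variant={variant!r}, turbo={turbo}") from None
```

For example, `defaults_for("sd15", turbo=True)` yields the single-step, zero-guidance settings listed for SD Turbo.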

Building from Source

Prerequisites

Windows Requirements:

Windows 11 (64-bit)
Visual Studio 2022
CMake 3.20 or higher
ONNX Runtime library (onnxruntime.lib / onnxruntime.dll)
- Typically from Ryzen AI SDK at C:\Program Files\RyzenAI\1.7.1\onnxruntime\lib
- ONNX Runtime headers are already vendored in the repo — no SDK needed for compilation

Hardware Requirements:

AMD Ryzen AI processor (for NPU execution)
Minimum 16GB RAM (32GB recommended)

Build Steps (Windows)

# Clone the repository
git clone https://github.com/lemonade-sdk/ryzenai-sd-server.git
cd ryzenai-sd-server

# Create and enter build directory
mkdir build
cd build

# Configure with CMake (uses default ORT lib path)
cmake .. -G "Visual Studio 17 2022" -A x64

# Build
cmake --build . --config Release

Build Steps (Linux)

Linux Requirements:

Ubuntu 22.04+ or equivalent
GCC 9+ or Clang 10+
CMake 3.20 or higher
ONNX Runtime shared library (libonnxruntime.so)

# Clone the repository
git clone https://github.com/lemonade-sdk/ryzenai-sd-server.git
cd ryzenai-sd-server

# Create and enter build directory
mkdir build
cd build

# Configure with CMake
cmake .. -DCMAKE_BUILD_TYPE=Release

# Build
cmake --build .

Build Output

Windows: The executable and required DLLs will be created at:

build\bin\Release\ryzenai-sd-server.exe

Linux: The executable and required shared libraries will be created at:

build/bin/ryzenai-sd-server

ONNX Runtime DLLs are automatically copied from the lib directory to the output directory during build.

Custom ONNX Runtime Path

If the ONNX Runtime library is in a non-default location:

# Windows — point at the directory containing onnxruntime.lib
cmake .. -G "Visual Studio 17 2022" -A x64 ^
  -DONNXRUNTIME_LIB_DIR="C:\custom\path\onnxruntime\lib"
cmake --build . --config Release

# Linux
cmake .. -DCMAKE_BUILD_TYPE=Release \
  -DONNXRUNTIME_LIB_DIR=/custom/path/onnxruntime/lib
cmake --build .

Deploy Extra Runtime DLLs

If you have additional DLLs to ship alongside the executable (e.g., RyzenAI custom ops, DynamicDispatch runtime):

cmake .. -G "Visual Studio 17 2022" -A x64 ^
  -DRUNTIME_DLLS_DIR="C:\Program Files\RyzenAI\1.7.1\deployment"
cmake --build . --config Release

To deploy a full directory tree (preserving subdirectory structure):

cmake .. -G "Visual Studio 17 2022" -A x64 ^
  -DLIB_DIRECTORY="C:\Program Files\RyzenAI\1.7.1\GenAI-SD\lib"
cmake --build . --config Release
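
When the configure step is scripted, the three cache options shown above (`ONNXRUNTIME_LIB_DIR`, `RUNTIME_DLLS_DIR`, `LIB_DIRECTORY`) can be assembled programmatically. This is a rough Python sketch; the helper name `cmake_configure_args` is hypothetical, and passing several of these options in one configure call is an assumption rather than documented behavior.

```python
from typing import List, Optional


def cmake_configure_args(
    source_dir: str = "..",
    onnxruntime_lib_dir: Optional[str] = None,
    runtime_dlls_dir: Optional[str] = None,
    lib_directory: Optional[str] = None,
) -> List[str]:
    """Build the argv for the Windows configure step (hypothetical helper)."""
    args = ["cmake", source_dir, "-G", "Visual Studio 17 2022", "-A", "x64"]
    options = {
        "ONNXRUNTIME_LIB_DIR": onnxruntime_lib_dir,
        "RUNTIME_DLLS_DIR": runtime_dlls_dir,
        "LIB_DIRECTORY": lib_directory,
    }
    # Only emit -D flags for options the caller actually set.
    for name, value in options.items():
        if value is not None:
            args.append(f"-D{name}={value}")
    return args
```

The resulting list can be handed to `subprocess.run(args, check=True, cwd="build")`, which avoids shell quoting problems with paths that contain spaces.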

Code Structure

ryzenai-sd-server/
├── CMakeLists.txt              # Build configuration
│
├── src/                        # Source files
│   ├── main.cpp                # Entry point, CLI arg parsing
│   ├── http_server.cpp         # HTTP server (cpp-httplib)
│   ├── sd_pipeline.cpp         # Core pipeline: encode → denoise → decode
│   ├── onnx_model.cpp          # ONNX Runtime session wrapper
│   ├── scheduler.cpp           # Noise schedulers (Euler, PNDM, etc.)
│   ├── clip_tokenizer.cpp      # CLIP text tokenizer
│   ├── config_loader.cpp       # Configuration file loader
│   ├── controlnet_runner.cpp   # ControlNet model runner
│   ├── variant_registry.cpp    # Variant registration system
│   └── variants/               # Per-architecture implementations
│       ├── sd15_variant.cpp    # SD 1.5 registration + detection
│       ├── sdxl_variant.cpp    # SDXL registration + detection
│       ├── sd3_variant.cpp     # SD3/SD3.5 registration + detection
│       └── *_text_encoder.cpp / *_denoiser.cpp / *_vae_decoder.cpp
│
├── include/sd/                 # Headers
│   ├── sd_types.h              # Common data structures
│   ├── sd_model.h              # Model configuration
│   ├── sd_pipeline.h           # Pipeline interface
│   ├── onnx_model.h            # ONNX session wrapper
│   ├── scheduler.h             # Scheduler interface
│   ├── clip_tokenizer.h        # Tokenizer interface
│   ├── http_server.h           # HTTP server interface
│   ├── config_loader.h         # Config loader interface
│   ├── controlnet_runner.h     # ControlNet interface
│   ├── variant_factory.h       # Variant factory types
│   ├── variant_registry.h      # Variant registry
│   ├── i_text_encoder.h        # Text encoder interface
│   ├── i_denoiser.h            # Denoiser interface
│   ├── i_vae_decoder.h         # VAE decoder interface
│   └── variants/               # Variant-specific headers
│       ├── sd15_*.h
│       ├── sdxl_*.h
│       └── sd3_*.h
│
├── include/onnxruntime/        # Vendored ONNX Runtime headers (no SDK needed)
│   ├── onnxruntime_cxx_api.h
│   ├── onnxruntime_c_api.h
│   └── ...
│
├── test/                       # Test scripts
│   ├── test_server.py          # Server test runner
│   ├── models.json             # Model test configurations
│   └── requirements.txt        # Python test dependencies
│
└── external/                   # Header-only dependencies
    └── cpp-httplib/            # HTTP server (auto-downloaded)

Architecture Overview

Design Principles

Simplicity: Single executable with no external runtime dependencies beyond ONNX Runtime
RAII: Resource management follows C++ best practices with smart pointers
Thread Safety: Pipeline guarded by std::mutex, configs cloned per-request
Convention Over Configuration: Variant auto-detected from model directory layout
Extensible: Self-registering variant system — add models without changing core code
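
The self-registering idea can be illustrated with a small registry. The server itself is written in C++, so the Python below is only an analogy of the pattern; all names (`register_variant`, `detect_variant`) and the directory-layout rules are invented for illustration and do not describe the real detection logic.

```python
from typing import Callable, Dict, Optional, Set

# Maps a variant name to a detector: a function that inspects the names found
# in a model directory and reports whether they match that variant.
_DETECTORS: Dict[str, Callable[[Set[str]], bool]] = {}


def register_variant(name: str):
    """Decorator that adds a detector to the registry at import time."""
    def decorator(detector: Callable[[Set[str]], bool]):
        _DETECTORS[name] = detector
        return detector
    return decorator


def detect_variant(entries: Set[str]) -> Optional[str]:
    """Return the first registered variant whose detector accepts the listing."""
    for name, detector in _DETECTORS.items():
        if detector(entries):
            return name
    return None


# Hypothetical layout rules, made up for this example only.
@register_variant("sdxl")
def _is_sdxl(entries: Set[str]) -> bool:
    return {"text_encoder", "text_encoder_2", "unet"} <= entries


@register_variant("sd15")
def _is_sd15(entries: Set[str]) -> bool:
    return {"text_encoder", "unet"} <= entries and "text_encoder_2" not in entries
```

Adding a new variant in this scheme means adding one decorated function; `detect_variant` and the rest of the core code stay unchanged.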

Component Layers

┌─────────────────────────────────────────────────┐
│         HTTP Server (cpp-httplib)               │
│         OpenAI API Endpoints                    │
├─────────────────────────────────────────────────┤
│         SD Pipeline                             │
│         encode → denoise → decode               │
├─────────────────────────────────────────────────┤
│         Variant System                          │
│         SD1.5 / SDXL / SD3 / SD3.5             │
├─────────────────────────────────────────────────┤
│         ONNX Runtime + RyzenAI Custom Ops       │
│         NPU / CPU Execution                     │
└─────────────────────────────────────────────────┘
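
The pipeline layer and the thread-safety principle above can be sketched together: stages run in the order encode, denoise, decode, a lock serializes requests, and each request works on its own copy of the config. This is a Python analogy of the C++ design with hypothetical names (`GenerationConfig`, `Pipeline`) and stub stages in place of real models.

```python
import copy
import threading
from dataclasses import dataclass


@dataclass
class GenerationConfig:
    prompt: str
    steps: int = 20
    guidance_scale: float = 7.5


class Pipeline:
    """Runs encode -> denoise -> decode, one request at a time."""

    def __init__(self, encode, denoise, decode):
        self._encode = encode
        self._denoise = denoise
        self._decode = decode
        self._lock = threading.Lock()

    def generate(self, config: GenerationConfig):
        # Clone the config so later changes by the caller cannot affect this run.
        cfg = copy.deepcopy(config)
        # Serialize access to the stages (assumed here to share model state).
        with self._lock:
            embedding = self._encode(cfg.prompt)
            latents = self._denoise(embedding, cfg.steps, cfg.guidance_scale)
            return self._decode(latents)


# Stub stages so the sketch runs without any model files.
pipeline = Pipeline(
    encode=lambda prompt: prompt.upper(),
    denoise=lambda emb, steps, scale: f"{emb}|{steps}|{scale}",
    decode=lambda latents: f"image({latents})",
)
```

Calling `pipeline.generate(GenerationConfig("cat"))` from several threads would run the three stages for one request at a time.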

Dependencies

Bundled with the build (no manual install needed):

ONNX Runtime headers (1.23.3) - Vendored in include/onnxruntime/
cpp-httplib (v0.26.0) - HTTP server, auto-downloaded by CMake into external/ [MIT License]

Required externally (link library + runtime DLLs only):

ONNX Runtime - onnxruntime.lib / onnxruntime.dll (from RyzenAI SDK or standalone)

Usage

CLI Mode (one-shot generation)

# Generate with auto-detected variant
ryzenai-sd-server.exe -m C:\path\to\onnx\model -p "A sunset over mountains"

# With custom parameters
ryzenai-sd-server.exe -m C:\models\sdxl -p "A serene lake" -n 20 -g 7.5 -W 1024 -H 1024

Server Mode (HTTP API)

# Start the server
ryzenai-sd-server.exe --server -m C:\path\to\onnx\model --port 8080

Command-Line Arguments

-m, --model-path PATH - Path to ONNX model directory (required)
--server - Start HTTP server instead of single generation
--port PORT - Server port (default: 8080)
-p, --prompt TEXT - Generation prompt
-n, --num-inference-steps N - Denoising steps (default: chosen per variant)
-g, --guidance-scale FLOAT - CFG scale (default: chosen per variant)
-W WIDTH / -H HEIGHT - Image width and height in pixels
-s, --seed INT - Random seed (default: 0)
-v, --variant NAME - Model variant: sd15, sdxl, sd3, sd35 (default: auto-detected)
-C, --controlnet TYPE - ControlNet type: Canny, Pose, Tile, Depth, etc.
--force-cpu - Force CPU execution provider
-h, --help - Show help message
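
When the executable is launched from a script, the flags listed above can be assembled into an argument list. This is a minimal Python sketch; `build_command` is a hypothetical helper, it covers only a subset of the flags, and the default executable name assumes a Windows build on the PATH.

```python
from typing import List, Optional


def build_command(
    model_path: str,
    prompt: Optional[str] = None,
    server: bool = False,
    port: Optional[int] = None,
    steps: Optional[int] = None,
    guidance_scale: Optional[float] = None,
    width: Optional[int] = None,
    height: Optional[int] = None,
    seed: Optional[int] = None,
    force_cpu: bool = False,
    executable: str = "ryzenai-sd-server.exe",
) -> List[str]:
    """Translate keyword arguments into the documented command-line flags."""
    cmd = [executable, "--model-path", model_path]
    if server:
        cmd.append("--server")
    valued = [
        ("--port", port),
        ("--prompt", prompt),
        ("--num-inference-steps", steps),
        ("--guidance-scale", guidance_scale),
        ("-W", width),
        ("-H", height),
        ("--seed", seed),
    ]
    # Flags left as None are omitted so the server applies its own defaults.
    for flag, value in valued:
        if value is not None:
            cmd += [flag, str(value)]
    if force_cpu:
        cmd.append("--force-cpu")
    return cmd
```

The list can be passed directly to `subprocess.run` or `subprocess.Popen`, so prompts containing spaces or quotes need no extra escaping.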

API Endpoints

The server implements OpenAI-compatible API endpoints.

Health Check

GET /health

Returns server status.
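
Model loading can take a while, so a client may want to poll the health endpoint before sending the first request. The sketch below uses only the Python standard library; `wait_for_health` is a hypothetical helper, and treating HTTP 200 as "ready" is an assumption, since the exact response body is not documented here.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Server not reachable yet; retry after a short pause.
        time.sleep(interval)
    return False
```

A typical call would be `wait_for_health("http://localhost:8080/health")` right after starting the server process.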

Image Generation

POST /v1/images/generations - Text-to-image generation
POST /v1/images/edits - Image-to-image + ControlNet
POST /v1/images/variations - Image variations

All endpoints return responses in OpenAI Images API format.
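
In the OpenAI Images API format, results arrive in a `data` list whose items carry either a `b64_json` or a `url` field. A small helper can decode the inline images; this is a sketch with a hypothetical function name (`extract_images`), and it makes no assumption about the image encoding of the decoded bytes.

```python
import base64
from typing import Any, Dict, List


def extract_images(payload: Dict[str, Any]) -> List[bytes]:
    """Decode every `b64_json` entry in an OpenAI-style images response."""
    images = []
    for item in payload.get("data", []):
        encoded = item.get("b64_json")
        if encoded is not None:  # Skip items that only carry a URL.
            images.append(base64.b64decode(encoded))
    return images
```

Items without `b64_json` are skipped rather than treated as errors, so the helper also tolerates responses requested with a different `response_format`.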

Testing

cd test
pip install -r requirements.txt

# Test against a running server
python test_server.py txt2img --url http://localhost:8080

# Auto-launch server for one model
python test_server.py txt2img --model-path C:\path\to\model

# Test all models
python test_server.py txt2img --all-models

Integration with Lemonade Server

This server is designed to be used as a backend for Lemonade Server. When running Lemonade Server, the ryzenai-sd-server executable is automatically downloaded from GitHub releases and managed by the Lemonade Router.

Integration Examples

Python with requests

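import base64
import io

import requests
from PIL import Image

# Request one 512x512 image, returned inline as base64.
response = requests.post(
    "http://localhost:8080/v1/images/generations",
    json={
        "prompt": "A red cat sitting on a windowsill",
        "size": "512x512",
        "n": 1,
        "response_format": "b64_json",
    },
)
response.raise_for_status()  # Fail early on HTTP errors
payload = response.json()

raw = base64.b64decode(payload["data"][0]["b64_json"])
if "width" in payload and "height" in payload:
    # Raw RGB pixels with explicit dimensions in the response.
    img = Image.frombytes("RGB", (payload["width"], payload["height"]), raw)
else:
    # Assumption: otherwise the bytes are an encoded image file (e.g. PNG),
    # as in the standard OpenAI Images API.
    img = Image.open(io.BytesIO(raw))
img.save("output.png")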

Development

Code Style

C++17 standard
RAII for resource management
Smart pointers (no raw pointers)
Const correctness

Building for Development

Windows:

cmake --build . --config Debug

Debug executable location: build\bin\Debug\ryzenai-sd-server.exe

Linux:

cmake .. -DCMAKE_BUILD_TYPE=Debug
cmake --build .

Debug executable location: build/bin/ryzenai-sd-server

Related Projects

Ryzen AI Documentation
ONNX Runtime
Lemonade Server - Parent project providing model orchestration
Ryzen AI LLM Server - Companion LLM server

License

This project's source code is licensed under the MIT License - see LICENSE for details.

Release Artifacts (ryzenai-sd-server.zip):

The ryzenai-sd-server binary and the header-only dependencies (cpp-httplib) are MIT licensed
The Ryzen AI DLLs included in binary releases are licensed under the AMD Software End User License Agreement
