Easy, fast, and private LLM & VLM inference for every device
| Getting Started | Documentation | Architecture |
OmniInfer is a high-performance, cross-platform inference engine for running Large Language Models (LLMs) and Vision-Language Models (VLMs) locally. It abstracts away model compilation, hardware adaptation, and deployment complexity, enabling efficient local inference with minimal configuration.
OmniInfer powers the inference layer of Omni Studio, a unified model orchestration platform.
OmniInfer is fast with:
- Optimized token generation speed and minimal memory footprint
- Multiple backend engines (llama.cpp, MNN, ET, MLX, OmniInfer Native) for best-fit performance
- Hardware-aware adaptation and optimization
OmniInfer is flexible and easy to use with:
- Seamless multi-backend switching — choose the best engine for your workload
- OpenAI-compatible API server for drop-in integration (see the sketch after this list)
- Support for LLM, VLM, and World Models
- Fine-grained parameter control (context length, GPU offloading, KV cache, etc.)
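Because the server speaks the OpenAI API, existing clients should work unchanged. Here is a minimal sketch using the official `openai` Python package; the server address, port, and model id are assumptions for illustration, so substitute whatever your server actually reports:

```python
# Minimal sketch: query a local OmniInfer server through the standard
# OpenAI Python client. The base_url and model id below are assumptions,
# not documented defaults -- replace them with your server's real values.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8080/v1",  # assumed local endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="qwen2.5-1.5b-instruct",        # placeholder model id
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,
    temperature=0.7,
)
print(response.choices[0].message.content)
```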
OmniInfer runs everywhere:
- Linux, macOS, Windows — desktop & server
- Android, iOS — mobile & edge devices
- One codebase, all platforms
macOS, Linux, and Android:
```sh
curl -fsSL https://raw.githubusercontent.com/omnimind-ai/OmniInfer/main/scripts/install.sh | bash
```

Windows (PowerShell):

```powershell
irm https://raw.githubusercontent.com/omnimind-ai/OmniInfer/main/scripts/install.ps1 | iex
```

The installer detects your platform and hardware, recommends a backend, and walks you through model setup interactively.
If you already cloned this repository, build at least one local runtime backend first.
- Windows: see Build Guide: Windows
- Linux: see Build Guide: Linux
- macOS: see Build Guide: macOS
- Android: see Build Guide: Android
After the runtime is ready, run the OmniInfer CLI from the repository root.
Linux and macOS:
```sh
./omniinfer --help
```

Windows:

```powershell
.\omniinfer.cmd --help
```

Android:

```sh
./omniinfer --help
```

If you are using a packaged release that already includes `runtime/`, you can run the CLI immediately from the release directory:

Windows:

```powershell
.\omniinfer.cmd --help
```

Linux and macOS:

```sh
./omniinfer --help
```

Recommended docs:
- CLI Guide: end-to-end CLI usage for Linux, macOS, Windows, and Android
- Android CLI Notes: Android direct-mode details
- Android JNI Bridge: generate an Android App bridge and packaged runtime assets
- Build Guide: build and platform packaging notes
- API Reference: OpenAI-compatible local API usage (a streaming sketch follows this list)
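As a small preview of what the API Reference covers, here is one more hedged sketch: streaming tokens from the same local endpoint, reusing the assumed address and placeholder model id from the example above:

```python
# Streaming sketch against the OpenAI-compatible endpoint. The address
# and model id are assumptions carried over from the earlier example.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="qwen2.5-1.5b-instruct",  # placeholder model id
    messages=[{"role": "user", "content": "Count from 1 to 5."}],
    stream=True,  # tokens arrive incrementally as they are generated
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```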
If you use OmniInfer in research, please cite this repository. GitHub can automatically generate citation formats from CITATION.cff.
```bibtex
@software{omniinfer,
  author = {{Omnimind AI}},
  title  = {OmniInfer},
  url    = {https://github.com/omnimind-ai/OmniInfer}
}
```

We welcome and value any contributions and collaborations. Please check out Contributing to OmniInfer for how to get involved.
This project is licensed under the Apache License 2.0 — see LICENSE for details.