# LIRA: Local Inference for Realtime Audio
<p align="center">
<img src="images/logo.png" alt="LIRA logo" width="1280" style="border-radius:24px; height:400px; object-fit:cover;">
</p>

**Local, efficient automatic speech recognition (ASR). Run ASR models on your local machine—fast, simple, and developer-friendly.**

LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with the `lira run` and `lira serve` commands, and integrate them with your apps and tools.



## 🧩 Supported Model Architectures & Runtimes

LIRA supports multiple speech-model architectures. Device and datatype support depend on the exported model and the chosen runtime.

| Model | Typical use case | Runs on | Supported datatypes |
|----------------------|-----------------------------------------|-----------------|------------------------------------|
| [whisper-small](https://huggingface.co/openai/whisper-small) | Low-latency, resource-constrained | CPU, GPU, NPU* | FP32, BFP16 |

| [whisper-base](https://huggingface.co/openai/whisper-base) | Balanced accuracy and performance | CPU, GPU, NPU* | FP32, BFP16 |
| [whisper-medium](https://huggingface.co/openai/whisper-medium) | Higher accuracy for challenging audio | CPU, GPU, NPU* | FP32, BFP16 |
| [whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | Highest accuracy (more compute) | CPU, GPU | FP32, BFP16 |
| [zipformer](https://huggingface.co/papers/2310.11230) | Streaming / low-latency ASR encoder | CPU, GPU, NPU* | FP32, BFP16 |

<sub>*NPU support depends on available Vitis AI export artifacts and target hardware.</sub>

---

## 🚀 Getting Started

**Prerequisites:**

- **Python 3.11** is required.
- We recommend using **conda** for environment management.
- For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to enable NPU use cases.
- Current recommended Ryzen AI version 1.6.0 with the 32.0.203.280 NPU driver.

**Minimal install steps:**

1. **Clone the repo and change directory:**
```bash
git clone https://github.com/amd/LIRA.git
cd LIRA
```

2. **Activate your conda environment:**
This conda environment should already be installed from the Ryzen AI SW installation mentioned earlier.
```bash
conda activate ryzen-ai-1.6.0
```

3. **Install LIRA in editable mode:**
```bash
pip install -e .
```

You can run `lira --help` to see available commands.

---

LIRA is a CLI-first toolkit focused on simple developer workflows for exporting, running, and serving ASR models locally.

**Primary commands:**

1. Run models locally with `lira run`:

   - Run, export, or benchmark models directly from the command line.
   - Use it for local inference, ONNX export, or rapid prototyping.

   ```bash
   lira run whisper --model-type whisper-base --export --device cpu --audio audio_files/test.wav
   ```

2. Serve models locally with OpenAI API-compatible endpoints via `lira serve`:

   - Launch a FastAPI server with OpenAI API-compatible endpoints.
   - Expose models as HTTP APIs for real-time transcription and seamless integration.
   - Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.

   ```bash
   lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
   ```

**NPU acceleration:**

For NPU acceleration, change `--device cpu` to `--device npu`.
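Once the server is running, any HTTP client can call it. Below is a minimal Python sketch; it assumes the server follows the OpenAI Audio API's `/v1/audio/transcriptions` route convention (confirm against your `lira serve` instance), and the `endpoint` and `transcribe` helpers are illustrative names, not part of LIRA.

```python
import requests

# Hypothetical client helpers for LIRA's OpenAI API-compatible server.
# The /v1/audio/transcriptions path follows the OpenAI Audio API
# convention; confirm it against your running `lira serve` instance.
def endpoint(base_url: str = "http://localhost:5000/v1") -> str:
    return f"{base_url}/audio/transcriptions"

def transcribe(audio_path: str, base_url: str = "http://localhost:5000/v1") -> str:
    with open(audio_path, "rb") as f:
        resp = requests.post(
            endpoint(base_url),
            files={"file": f},
            data={"model": "whisper-base"},
        )
    resp.raise_for_status()
    # OpenAI-style transcription responses carry the result in a "text" field
    return resp.json()["text"]
```

With the server up, `transcribe("audio_files/test.wav")` would return the transcription string.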

> Interested in more server features?
> Try the **LIRA server demo** with Open WebUI.
_Tip: run `lira run <model> --help` for model-specific flags._
Whisper supports export/optimization and model-specific flags.

**Examples:**

Export the Whisper base model to ONNX, optimize it, and run it on the NPU:
```bash
lira run whisper --model-type whisper-base --export --device npu --audio <input/.wav file> --use-kv-cache
```

Run inference on a sample audio file on the CPU:
```bash
lira run whisper -m exported_models/whisper_base --device cpu --audio "audio_files/test.wav"
```
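The same CLI call can be scripted over a whole directory. A sketch in Python, reusing the exact flags from the CPU example above (the `audio_files` directory and export path are illustrative):

```python
import pathlib
import subprocess

# Build the `lira run` invocation shown above for a single .wav file.
def lira_cmd(wav: pathlib.Path) -> list[str]:
    return [
        "lira", "run", "whisper",
        "-m", "exported_models/whisper_base",
        "--device", "cpu",
        "--audio", str(wav),
    ]

# Transcribe every .wav in a folder, one `lira run` process per file.
audio_dir = pathlib.Path("audio_files")
if audio_dir.is_dir():
    for wav in sorted(audio_dir.glob("*.wav")):
        subprocess.run(lira_cmd(wav), check=True)
```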

You can point to custom config files or modify those in the repo.

---


## 🧪 Early Access & Open Source Intentions

---

# docs/OpenWebUI_README.md

Ready to turn your browser into a voice-powered AI playground? With LIRA and OpenWebUI, you can.
## 1. Set up environments

**Recommended:** Use separate conda environments to avoid dependency conflicts.
- For LIRA, reuse `ryzen-ai-1.6.0` to leverage NPU support.
- For OpenWebUI, create a new environment.

### LIRA and OpenWebUI setup:
Follow the instructions in the [Getting Started](../README.md#getting-started) section of the main README.md to install and set up the Ryzen AI environment.
Let's set up OpenWebUI by first cloning the `ryzen-ai-1.6.0` environment and then installing `open-webui`:
```powershell
conda create -n openwebui --clone ryzen-ai-1.6.0 -y
conda activate openwebui
pip install open-webui
```
Record from your mic or upload audio files (`.wav`, `.mp3`)—OpenWebUI will send them to LIRA for transcription.
## 📝 Notes & Tips

- If you exported a Whisper ONNX model to a custom directory, set `LIRA_MODEL_DIR` before starting the server, or use `lira serve` flags to point at the export.
- For NPU runs, start `lira serve` from `ryzen-ai-1.6.0` so Vitis AI tooling and drivers are available.
- If running behind a reverse proxy, update OpenWebUI's API Base URL accordingly.
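The first note above can be sketched in Python as follows, assuming `LIRA_MODEL_DIR` is read by the server as described; the export path is illustrative and the launch line is left commented out:

```python
import os
import subprocess

# Build the environment and command to serve a custom Whisper export.
# LIRA_MODEL_DIR is the variable mentioned in the notes above; the
# model directory path is illustrative.
def serve_command(model_dir: str) -> tuple[dict, list[str]]:
    env = {**os.environ, "LIRA_MODEL_DIR": model_dir}
    cmd = [
        "lira", "serve", "--backend", "openai",
        "--model", "whisper-base", "--device", "npu",
        "--host", "0.0.0.0", "--port", "5000",
    ]
    return env, cmd

env, cmd = serve_command("exported_models/whisper_base")
# subprocess.Popen(cmd, env=env)  # uncomment to launch the server
```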

See the main [README.md](../README.md) for full LIRA setup and model export instructions.