diff --git a/README.md b/README.md index 99e053bfb..47f2df219 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@ WanGP supports the Wan (and derived models) but also Hunyuan Video, Flux, Qwen, Z-Image, LongCat, Kandinsky, LTXV, LTX-2, Qwen3 TTS, Chatterbox, HearMula, ... with: - Low VRAM requirements (as low as 6 GB of VRAM is sufficient for certain models) - Support for old Nvidia GPUs (RTX 10XX, 20xx, ...) -- Support for AMD GPUs Radeon RX 76XX, 77XX, 78XX & 79XX, instructions in the Installation Section Below. +- Support for AMD GPUs (RDNA 4, 3, 3.5, and 2), instructions in the Installation Section Below. - Very Fast on the latest GPUs - Easy to use Full Web based interface - Support for many checkpoint Quantized formats: int8, fp8, gguf, NV FP4, Nunchaku @@ -318,7 +318,7 @@ For detailed installation instructions for different GPU generations: ### AMD For detailed installation instructions for different GPU generations: -- **[Installation Guide](docs/AMD-INSTALLATION.md)** - Complete setup instructions for Radeon RX 76XX, 77XX, 78XX & 79XX +- **[Installation Guide](docs/AMD-INSTALLATION.md)** - Complete setup instructions for RDNA 4, 3, 3.5, and 2 ## 🎯 Usage diff --git a/docs/AMD-INSTALLATION.md b/docs/AMD-INSTALLATION.md index 4f05589eb..8663cb0db 100644 --- a/docs/AMD-INSTALLATION.md +++ b/docs/AMD-INSTALLATION.md @@ -1,72 +1,79 @@ -# Installation Guide +# AMD Installation Guide for Windows (TheRock) -This guide covers installation for specific RDNA3 and RDNA3.5 AMD CPUs (APUs) and GPUs -running under Windows. +This guide covers installation for AMD GPUs and APUs running under Windows using TheRock's official PyTorch wheels. -tl;dr: Radeon RX 7900 GOOD, RX 9700 BAD, RX 6800 BAD. (I know, life isn't fair). 
+## Supported GPUs

-Currently supported (but not necessary tested):
+Based on [TheRock's official support matrix](https://github.com/ROCm/TheRock/blob/main/SUPPORTED_GPUS.md), the following GPUs are supported on Windows:

-**gfx110x**:
+### **gfx110X-all** (RDNA 3):
+* AMD RX 7900 XTX (gfx1100)
+* AMD RX 7800 XT (gfx1101)
+* AMD RX 7700 XT (gfx1101)
+* AMD RX 7700S / Framework Laptop 16 (gfx1102)
+* AMD Radeon 780M Laptop iGPU (gfx1103)

-* Radeon RX 7600
-* Radeon RX 7700 XT
-* Radeon RX 7800 XT
-* Radeon RX 7900 GRE
-* Radeon RX 7900 XT
-* Radeon RX 7900 XTX
+### **gfx120X-all** (RDNA 4):
+* AMD RX 9060 XT (gfx1200)
+* AMD RX 9060 (gfx1200)
+* AMD RX 9070 XT (gfx1201)
+* AMD RX 9070 (gfx1201)

-**gfx1151**:
+### **gfx1151** (RDNA 3.5 APU):
+* AMD Strix Halo APUs

-* Ryzen 7000 series APUs (Phoenix)
-* Ryzen Z1 (e.g., handheld devices like the ROG Ally)
+### **gfx1150** (RDNA 3.5 APU):
+* AMD Radeon 890M (Ryzen AI 9 HX 370 - Strix Point)

-**gfx1201**:
+### **gfx103X-dgpu** (RDNA 2):
+* AMD RX 6000 series discrete GPUs

-* Ryzen 8000 series APUs (Strix Point)
-* A [frame.work](https://frame.work/au/en/desktop) desktop/laptop
+
+> **Note:** If your GPU is not listed above, it may not be supported by TheRock on Windows. Support status and future updates can be found in the [official documentation](https://github.com/ROCm/TheRock/blob/main/SUPPORTED_GPUS.md).

## Requirements

-- Python 3.11 (3.12 might work, 3.10 definately will not!)
+- Python 3.11 (recommended for Wan2GP; TheRock currently supports Python 3.11, 3.12, and 3.13).
+- Windows 10/11

## Installation Environment

-This installation uses PyTorch 2.7.0 because that's what currently available in
-terms of pre-compiled wheels.
+This installation uses PyTorch wheels built by TheRock.

### Installing Python

-Download Python 3.11 from [python.org/downloads/windows](https://www.python.org/downloads/windows/). Hit Ctrl+F and search for "3.11". Dont use this direct link: [https://www.python.org/ftp/python/3.11.9/python-3.11.9-amd64.exe](https://www.python.org/ftp/python/3.11.9/python-3.11.9-amd64.exe) -- that was an IQ test.
+Download Python 3.11 from [python.org/downloads/windows](https://www.python.org/downloads/windows/). Press Ctrl+F and search for "3.11" to find the newest version available for installation.

-After installing, make sure `python --version` works in your terminal and returns 3.11.x
+Alternatively, you can use this direct link: [Python 3.11.9 (64-bit)](https://www.python.org/ftp/python/3.11.9/python-3.11.9-amd64.exe).

-If not, you probably need to fix your PATH. Go to:
+After installing, make sure `python --version` works in your terminal and returns `3.11.x`.

-* Windows + Pause/Break
-* Advanced System Settings
-* Environment Variables
-* Edit your `Path` under User Variables
+If it doesn’t, you need to add Python to your PATH:

-Example correct entries:
+* Press the `Windows` key, type `Environment Variables`, and select `Edit the system environment variables`.
+* In the `System Properties` window, click `Environment Variables…`.
+* Under `User variables`, find `Path`, then click `Edit` → `New` and add the following entries (replace `<USERNAME>` with your Windows username):

```cmd
-C:\Users\YOURNAME\AppData\Local\Programs\Python\Launcher\
-C:\Users\YOURNAME\AppData\Local\Programs\Python\Python311\Scripts\
-C:\Users\YOURNAME\AppData\Local\Programs\Python\Python311\
+C:\Users\<USERNAME>\AppData\Local\Programs\Python\Launcher\
+C:\Users\<USERNAME>\AppData\Local\Programs\Python\Python311\Scripts\
+C:\Users\<USERNAME>\AppData\Local\Programs\Python\Python311\
```

-If that doesnt work, scream into a bucket.
+> **Note:** If Python still doesn't show the correct version after updating PATH, try signing out and signing back in to Windows to apply the changes.

### Installing Git

-Get Git from [git-scm.com/downloads/win](https://git-scm.com/downloads/win). Default install is fine.
+Download Git from [git-scm.com/downloads/win](https://git-scm.com/downloads/win) and install it. The default installation options are fine.

-## Install (Windows, using `venv`)
+## Installation Steps (Windows, using a Python `venv`)

+> **Note:** The following commands are intended for use in the Windows Command Prompt (CMD).
+> If you are using PowerShell, some commands (like comments and activating the virtual environment) may differ.

-### Step 1: Download and Set Up Environment
+
+### Step 1: Download and Set Up the Wan2GP Environment

```cmd
:: Navigate to your desired install directory
cd \your-path-to-wan2gp
git clone https://github.com/deepbeepmeep/Wan2GP.git
cd Wan2GP
-:: Create virtual environment using Python 3.10.9
+:: Create virtual environment
python -m venv wan2gp-env
:: Activate the virtual environment
wan2gp-env\Scripts\activate
```

-### Step 2: Install PyTorch
+> **Note:** If you have multiple versions of Python installed, use `py -3.11 -m venv wan2gp-env` instead of `python -m venv wan2gp-env` to ensure the correct version is used.
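+> **Tip:** To confirm the new environment picked up the right interpreter before continuing, you can run the following standard commands (the exact output will vary with your installed 3.11 patch release):
+
+```cmd
+:: Should report the venv's interpreter, e.g. Python 3.11.x
+python --version
+where python
+```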
+ +### Step 2: Install ROCm/PyTorch by TheRock + +**IMPORTANT:** Choose the correct index URL for your GPU family! + +#### For gfx110X-all (RX 7900 XTX, RX 7800 XT, etc.): -The pre-compiled wheels you need are hosted at [scottt's rocm-TheRock releases](https://github.com/scottt/rocm-TheRock/releases). Find the heading that says: +```cmd +pip install --pre torch torchaudio torchvision rocm[devel] --index-url https://rocm.nightlies.amd.com/v2/gfx110X-all/ +``` -**Pytorch wheels for gfx110x, gfx1151, and gfx1201** +#### For gfx120X-all (RX 9060, RX 9070, etc.): -Don't click this link: [https://github.com/scottt/rocm-TheRock/releases/tag/v6.5.0rc-pytorch-gfx110x](https://github.com/scottt/rocm-TheRock/releases/tag/v6.5.0rc-pytorch-gfx110x). It's just here to check if you're skimming. +```cmd +pip install --pre torch torchaudio torchvision rocm[devel] --index-url https://rocm.nightlies.amd.com/v2/gfx120X-all/ +``` + +#### For gfx1151 (Strix Halo iGPU): + +```cmd +pip install --pre torch torchaudio torchvision rocm[devel] --index-url https://rocm.nightlies.amd.com/v2/gfx1151/ +``` + +#### For gfx1150 (Radeon 890M - Strix Point): + +```cmd +pip install --pre torch torchaudio torchvision rocm[devel] --index-url https://rocm.nightlies.amd.com/v2-staging/gfx1150/ +``` -Copy the links of the closest binaries to the ones in the example below (adjust if you're not running Python 3.11), then hit enter. 
+#### For gfx103X-dgpu (RDNA 2):

```cmd
-pip install ^
-  https://github.com/scottt/rocm-TheRock/releases/download/v6.5.0rc-pytorch-gfx110x/torch-2.7.0a0+rocm_git3f903c3-cp311-cp311-win_amd64.whl ^
-  https://github.com/scottt/rocm-TheRock/releases/download/v6.5.0rc-pytorch-gfx110x/torchaudio-2.7.0a0+52638ef-cp311-cp311-win_amd64.whl ^
-  https://github.com/scottt/rocm-TheRock/releases/download/v6.5.0rc-pytorch-gfx110x/torchvision-0.22.0+9eb57cd-cp311-cp311-win_amd64.whl
+pip install --pre torch torchaudio torchvision rocm[devel] --index-url https://rocm.nightlies.amd.com/v2-staging/gfx103X-dgpu/
```

-### Step 3: Install Dependencies
+This will automatically install the latest PyTorch, torchaudio, and torchvision wheels with ROCm support.
+
+### Step 3: Install Wan2GP Dependencies

```cmd
:: Install core dependencies
pip install -r requirements.txt
```

+### Step 4: Verify Installation
+
+```cmd
+python -c "import torch; print('PyTorch:', torch.__version__); print('ROCm available:', torch.cuda.is_available()); print('Device:', torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'No GPU')"
+```
+
+Expected output example:
+```
+PyTorch: 2.11.0+rocm7.12.0
+ROCm available: True
+Device: AMD Radeon RX 9070 XT
+```
+
## Attention Modes

-WanGP supports several attention implementations, only one of which will work for you:
+WanGP supports multiple attention implementations via [triton-windows](https://github.com/woct0rdho/triton-windows/).

-- **SDPA** (default): Available by default with PyTorch. This uses the built-in aotriton accel library, so is actually pretty fast.
+First, install `triton-windows` in your virtual environment (uninstalling any older Triton build first). Then initialize the ROCm SDK and activate the Visual Studio build environment:
-## Performance Profiles
+```cmd
+pip uninstall triton
+pip install triton-windows
+rocm-sdk init
+"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat" >nul 2>&1
+```

-Choose a profile based on your hardware:
+### Supported attention implementations

-- **Profile 3 (LowRAM_HighVRAM)**: Loads entire model in VRAM, requires 24GB VRAM for 8-bit quantized 14B model
-- **Profile 4 (LowRAM_LowVRAM)**: Default, loads model parts as needed, slower but lower VRAM requirement
+- **SageAttention V1**: Install a 1.x release. Use the `.post26` wheel or newer to fix Triton compilation issues without unofficial patches; it can be downloaded from [this build](https://github.com/Comfy-Org/wheels/actions/runs/21343435018).

+```cmd
+pip install "sageattention <2"
+```
+
+- **FlashAttention-2** (Only the Triton backend is supported):
+```cmd
+git clone https://github.com/Dao-AILab/flash-attention.git
+cd flash-attention
+pip install ninja
+pip install packaging
+set FLASH_ATTENTION_TRITON_AMD_ENABLE=TRUE && python setup.py install
+```
+
+- **SDPA Flash**: Available by default in PyTorch on post-RDNA2 GPUs via AOTriton.
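+
+As a quick sanity check of the attention stack, you can verify that Triton imports and SDPA runs end to end (a minimal sketch; it assumes the wheels from Step 2 are installed and a supported GPU is present, and the tensor sizes are arbitrary):
+
+```cmd
+python -c "import triton; print('Triton:', triton.__version__)"
+python -c "import torch; import torch.nn.functional as F; q = torch.randn(1, 8, 64, 32, device='cuda', dtype=torch.float16); print(F.scaled_dot_product_attention(q, q, q).shape)"
+```
+
+If the second command prints `torch.Size([1, 8, 64, 32])`, SDPA attention is working on your GPU.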
## Running Wan2GP

-In future, you will have to do this:
+For future sessions, activate the environment every time if it isn't already activated, then run `python wgp.py`:

```cmd
-cd \path-to\wan2gp
-wan2gp\Scripts\activate.bat
+cd \path-to\Wan2GP
+wan2gp-env\Scripts\activate
+:: Add the AMD-specific environment variables mentioned below here
python wgp.py
```

-For now, you should just be able to type `python wgp.py` (because you're already in the virtual environment)
+It is advised to set the following environment variables at the start of every new session (you can create a `.bat` file that activates your venv, sets these, then launches `wgp.py`):
+
+```cmd
+set ROCM_HOME=%ROCM_ROOT%
+set PATH=%ROCM_ROOT%\lib\llvm\bin;%ROCM_BIN%;%PATH%
+set CC=clang-cl
+set CXX=clang-cl
+set DISTUTILS_USE_SDK=1
+set FLASH_ATTENTION_TRITON_AMD_ENABLE=TRUE
+set TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
+```
+
+MIOpen (AMD’s equivalent of NVIDIA’s cuDNN) is not yet fully stable on several architectures; it can cause out-of-memory errors (OOMs), crash the display driver, or significantly increase generation times. Currently, it is recommended to use MIOpen's fast find mode by setting:
+
+```cmd
+set MIOPEN_FIND_MODE=FAST
+```
+
+Alternatively, you can disable MIOpen entirely by editing `wgp.py` and adding the following line below `import torch` (around line 51):
+
+```python
+# ...existing lines in wgp.py...
+import torch
+torch.backends.cudnn.enabled = False  # <-- Add this line
+import gc
+# ...
+```
+
+To verify that it is disabled, or to enable verbose logging, you can set:
+
+```cmd
+set MIOPEN_ENABLE_LOGGING=1
+set MIOPEN_ENABLE_LOGGING_CMD=1
+set MIOPEN_LOG_LEVEL=5
+```

## Troubleshooting

-- If you use a HIGH VRAM mode, don't be a fool. Make sure you use VAE Tiled Decoding.
+### GPU Not Detected
+
+If `torch.cuda.is_available()` returns `False`:
+
+1. **Verify your GPU is supported** - Check the [Supported GPUs](#supported-gpus) list above
+2. 
**Check AMD drivers** - Ensure you have the latest AMD Adrenalin drivers installed
+3. **Verify correct index URL** - Make sure you used the right GPU family index URL
+
+### Installation Errors
+
+**"Could not find a version that satisfies the requirement":**
+- Double-check that you're using the correct `--index-url` for your GPU family. You can also try adding the `--pre` flag (if not already present) or replacing `/v2/` in the URL with `/v2-staging/`
+- Ensure you're using Python 3.11, not 3.10
+
+**"No matching distribution found":**
+- Your GPU architecture may not be supported
+- Check that you've activated your virtual environment
+
+### Performance Issues
+
+- **Monitor VRAM usage** - Reduce batch size or resolution if running out of memory
+- **Close GPU-intensive apps** - Apps with hardware acceleration enabled (browsers, Discord, etc.)
+
+### Known Issues
+
+Windows packages are new and may be unstable.
+
+Known issues are tracked at: https://github.com/ROCm/TheRock/issues/808

-### Memory Issues
+## Additional Resources

-- Use lower resolution or shorter videos
-- Enable quantization (default)
-- Use Profile 4 for lower VRAM usage
-- Consider using 1.3B models instead of 14B models
+- [TheRock GitHub Repository](https://github.com/ROCm/TheRock/)
+- [Releases Documentation](https://github.com/ROCm/TheRock/blob/main/RELEASES.md)
+- [Supported GPU Architectures](https://github.com/ROCm/TheRock/blob/main/SUPPORTED_GPUS.md)
+- [Roadmap](https://github.com/ROCm/TheRock/blob/main/ROADMAP.md)
+- [ROCm Documentation](https://rocm.docs.amd.com/)

-For more troubleshooting, see [TROUBLESHOOTING.md](TROUBLESHOOTING.md)
+For additional troubleshooting guidance for Wan2GP, see [TROUBLESHOOTING.md](https://github.com/deepbeepmeep/Wan2GP/blob/main/docs/TROUBLESHOOTING.md).
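+
+The per-session setup above (activate the venv, set the AMD-specific environment variables, launch `wgp.py`) can be collected into a single launcher script. A minimal sketch, assuming the paths used in this guide (the file name `launch-wan2gp.bat` is arbitrary):
+
+```cmd
+:: launch-wan2gp.bat - activate the venv, set AMD-specific variables, start WanGP
+@echo off
+cd /d \path-to\Wan2GP
+call wan2gp-env\Scripts\activate
+set FLASH_ATTENTION_TRITON_AMD_ENABLE=TRUE
+set TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
+set MIOPEN_FIND_MODE=FAST
+python wgp.py
+```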