AI Voice Cloning

Note: I do not plan to actively work on improvements or enhancements for this project. It is mainly meant to keep the repo in a working state in case the original git.ecker goes down or necessary package changes need to be made.

That being said, some enhancements added compared to the original repo:

✔️ Possible to train in other languages

✔️ Hifigan added, allowing for faster inference at the cost of quality.

✔️ whisper-v3 added as a selectable option for whisperx

✔️ Output conversion using RVC

This is a fork of the repo originally located here: https://git.ecker.tech/mrq/ai-voice-cloning. All of the work that was put into it to incorporate training with DLAS and inference with Tortoise belongs to mrq, the author of the original ai-voice-cloning repo.

Setup

Check my YouTube video for instructions: https://youtu.be/wuB8GLdS7-4?si=z_Y5KxGpPcLycnB6
These instructions are for setup on Runpod.io. If you don't have a Runpod account, create one and add at least 10 USD of credit.

Alternative Manual Installation

In Runpod, choose a template

The pod setup looks like this

Clone the repository

git clone https://github.com/gordon123/ai-voice-cloning.git

** In this repo I only updated main.py so it runs with a public Gradio link. All credit for the work goes to https://github.com/JarodMica/ai-voice-cloning and his YouTube tutorials below.

Create a venv by typing:
```
python -m venv venv
```
Activate it:
```
source /workspace/venv/bin/activate
```
You should now see (venv) at the beginning of the command line.
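If the prompt does not make it obvious, a quick check like the one below can confirm that the venv is active. This is only a sketch: `venv_status` is a hypothetical helper name, and it relies on the `VIRTUAL_ENV` variable that the standard venv `activate` script sets.

```shell
# Hypothetical helper (not part of this repo): report whether a venv is active.
# The standard venv activate script exports VIRTUAL_ENV with the venv path.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "inactive"
    fi
}

# Usage:
#   venv_status
```

If it prints "inactive", run the activate command again before continuing.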
Then run:
```
cd ai-voice-cloning

bash setup-cuda.sh
```
This starts installing all of the Python packages needed. WAIT WAIT WAIT!!!
After some time it will show this error. Hit Ctrl+C to exit:
```
import tkinter as tk
ModuleNotFoundError: No module named 'tkinter'
```
After pressing Ctrl+C to exit, type:
```
apt-get update
apt-get install python3.11-tk
```
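To confirm that the fix worked, you can check whether tkinter now imports cleanly. The sketch below assumes `python3` is the interpreter in use on the pod; `has_py_module` is a hypothetical helper name, not something shipped with this repo.

```shell
# Hypothetical helper (not part of this repo): succeed if a Python module imports.
# Assumes python3 is on PATH; set PYTHON to use a different interpreter.
has_py_module() {
    "${PYTHON:-python3}" -c "import $1" >/dev/null 2>&1
}

# Usage:
#   if has_py_module tkinter; then
#       echo "tkinter is available"
#   else
#       echo "tkinter is still missing"
#   fi
```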
After the installation finishes, run:
```
bash start.sh
```
The first time you run this, it will start downloading most of the models you'll need. WAIT WAIT WAIT!!!!!
- Some models are downloaded when you first use them. You'll incur additional downloads during generation and when training (for whisper). However, once they are finished, you won't ever have to download them again as long as you don't delete them. They are located in the models folder at the root of the repo.
When you see the message "Removing weight norm... Loaded vocoder model... Loaded TTS, ready for generation.", scroll up to find the Gradio link.

You should see something like this: Running on public URL: https://7f6e62958285392788.gradio.live. Copy this link into your browser and try a test generation. Have fun!!
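If the link is hard to spot in a long log, something like the following can pull it out. This is a sketch under assumptions: `extract_gradio_url` is a hypothetical helper, `start.log` is a placeholder file name, and the pattern only matches URLs shaped like the example above.

```shell
# Hypothetical helper (not part of this repo): print the first gradio.live URL
# found on standard input.
extract_gradio_url() {
    grep -oE 'https://[A-Za-z0-9-]+\.gradio\.live' | head -n 1
}

# Usage (start.log is a placeholder name):
#   bash start.sh 2>&1 | tee start.log      # in one terminal
#   extract_gradio_url < start.log          # in another terminal
```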

Lastly, delete your pod after you download your trained model or any generated files. Otherwise Runpod will keep charging you over time!

CLICK THE BIN ICON, IT WILL PERMANENTLY DELETE EVERYTHING, BACK UP FILES YOU NEED FIRST
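One way to back up before deleting is to bundle the folders you care about into a single archive and download that. The sketch below is only an illustration: `backup_dirs` is a hypothetical helper, and the `training` and `voices` folder names and the archive path are assumptions, so swap in whichever folders hold your files.

```shell
# Hypothetical helper (not part of this repo): pack the given files/folders
# into one .tar.gz archive. Fails without creating anything if a path is missing.
backup_dirs() {
    archive="$1"
    shift
    for item in "$@"; do
        if [ ! -e "$item" ]; then
            echo "missing: $item" >&2
            return 1
        fi
    done
    tar -czf "$archive" "$@"
}

# Usage (folder names and archive path are assumptions):
#   cd /workspace/ai-voice-cloning
#   backup_dirs /workspace/backup.tar.gz training voices
```

Download the archive and check that it opens before you click the bin icon.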

(Optional) You can opt to install whisperx for training by running setup-whisperx.bat
- Check out the whisperx github page for more details, but it's much faster for longer audio files. If you're processing one-by-one with an already split dataset, it doesn't improve speeds that much.

Instructions

Check out the YouTube videos:

Watch First: https://youtu.be/WWhNqJEmF9M?si=RhUZhYersAvSZ4wf

Watch Second (RVC update): https://www.youtube.com/watch?v=7tpWH8_S8es&t=504s

Everything is pretty much the same as before if you've used this repository in the past. However, there is a new option to convert the generated output using RVC. Before you can use it, you will need a trained RVC .pth file, which you can get from RVC or online, and you will need to place it in models/rvc_models/. Both .index and .pth files can be placed in here and they'll show up correctly in their respective dropdown menus.
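Placing the files can be done by hand, or with a small copy step like the sketch below. `install_rvc_files` is a hypothetical helper and the source folder in the usage line is a placeholder; only the models/rvc_models destination comes from the instructions above.

```shell
# Hypothetical helper (not part of this repo): copy every .pth and .index file
# from a source folder into the RVC models folder (default: models/rvc_models).
install_rvc_files() {
    src_dir="$1"
    dest_dir="${2:-models/rvc_models}"
    mkdir -p "$dest_dir"
    find "$src_dir" -maxdepth 1 -type f \( -name '*.pth' -o -name '*.index' \) \
        -exec cp {} "$dest_dir"/ \;
}

# Usage, run from the repo root (the source folder is a placeholder):
#   install_rvc_files /workspace/my_rvc_voice
```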

To enable RVC:

- Check and enable Show Experimental Settings to reveal more options.
- Check and enable Run the outputted audio through RVC. You will now have access to parameters you can adjust in RVC for the RVC voice model you're using.
