In this repository, we will demonstrate how to build your own local voice chatbot using open-source resources. Specifically, we use NVIDIA Riva to convert between voice and text, and the Meta Llama large language model to generate answers to questions. The implementation is deliberately concise, allowing readers to quickly grasp the underlying principles.
We will install two services on Jetson, Riva and text-generation-inference (the inference server that loads the large language model), and use the code in this repository as a client for functional testing.
- Jetson device with more than 16GB of memory.
- The device needs to be pre-flashed with the JetPack 5.1.1 operating system.
(I completed all experiments using Jetson AGX Orin 32GB H01 Kit, but you can try loading smaller models with a device that has less memory.)
- Step1. Install Nvidia Riva Server on Jetson.
- Step2. Deploy text-generation-inference on Jetson.
- Step3. Clone local chatbot demo.
git clone https://github.com/yuyoujiang/Deploy-Riva-LLama-on-Jetson.git
Note: Each of the following steps requires opening a new terminal window.
- Step1. Start the Riva Server.
cd <path to riva quickstart directory>
bash riva_start.sh
- Step2. Use text-generation-inference to load the Llama large language model.
text-generation-launcher --model-id meta-llama/Llama-2-7b-chat-hf --port 8899
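Once the launcher is up, you can verify the endpoint works before wiring up the voice client. text-generation-inference exposes a `/generate` HTTP route that accepts a JSON body with `inputs` and `parameters` fields. A minimal sketch using only the Python standard library (the `max_new_tokens` value and the `ask_llm` helper name are illustrative choices, not part of this repository):

```python
import json
import urllib.request

# Port matches the --port 8899 passed to text-generation-launcher above.
TGI_URL = "http://localhost:8899/generate"

def build_generate_request(prompt, max_new_tokens=128):
    """Build the JSON payload expected by TGI's /generate route."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

def ask_llm(prompt):
    """Send a prompt to the local TGI server and return the generated text."""
    payload = json.dumps(build_generate_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        TGI_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]

if __name__ == "__main__":
    print(ask_llm("What is NVIDIA Riva?"))
```

If this prints a completion, the model is loaded and the chatbot client can reach it.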
- Step3. Run local chatbot demo.
- Enter the working directory.
cd <path to Deploy-Riva-LLama-on-Jetson>
- Query audio input devices.
python3 local_chatbot.py --list-input-devices
- Query audio output devices.
python3 local_chatbot.py --list-output-devices
- Run the script.
python3 local_chatbot.py --input-device <your device id> --output-device <your device id> # For example: python3 local_chatbot.py --input-device 25 --output-device 30
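At a high level, each turn of the chatbot follows a listen, transcribe, generate, speak loop. A minimal sketch of that data flow, where `asr`, `llm`, and `tts` are hypothetical callables standing in for the Riva ASR, text-generation-inference, and Riva TTS calls the real script makes:

```python
def chat_turn(audio_frames, asr, llm, tts):
    """One turn of the voice chatbot: speech in, speech out.

    asr, llm, and tts are placeholders for the Riva ASR client, the LLM
    endpoint, and the Riva TTS client; each takes one input and returns
    its result.
    """
    question = asr(audio_frames)   # speech -> text (Riva ASR)
    answer = llm(question)         # text -> text (Llama via TGI)
    return tts(answer)             # text -> speech (Riva TTS)

# Stubbed example showing only the data flow, no real services involved:
reply_audio = chat_turn(
    b"<pcm frames>",
    asr=lambda audio: "What is a Jetson?",
    llm=lambda q: "Jetson is NVIDIA's embedded AI platform.",
    tts=lambda text: f"<audio for: {text}>",
)
```

The real script additionally handles microphone capture and speaker playback on the devices selected with `--input-device` and `--output-device`.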