Overview: An Easy Button for Agentic RAG

This RAG application uses an agentic approach to combine web search, hallucination control and accuracy checks with RAG. It's easy to modify because its a simple Gradio app.

Note This app runs in NVIDIA AI Workbench. It's a free, lightweight developer platform that you can run on your own systems to get up and running with complex AI applications and workloads in a short amount of time.

You may want to fork this repository into your own account before proceeding. Otherwise you won't be able to save your local changes to GitHub because this NVIDIA owned repository is read-only.

Navigating the README: Application Overview | Get Started | Deep Dive | License

Other Resources: ⬇️ Download AI Workbench | 📖 User Guide |📂 Other Projects | 🚨 User Forum

Need, Don't Need and Nice to Have

Need: internet access because the chat app uses Tavily for web-searches, as well as endpoints on build.nvidia.com
Don't Need: Local GPU
Nice to Have: Remote GPU system where you self-host an endpoint

The Agentic RAG Application

Using the Application

You embed your documents (pdfs or webpages) to the vector database.
You configure each of the separate components for the pipeline. For each component you can:
- Select from a drop down of endpoints or use a self-hosted endpoint.
- Modify the prompt.
You submit your query.
An LLM evaluates its relevance to the index and then routes it to the DB or to search by Tavily.
Answers are checked for hallucination and relevance. "Failing"" answers are run through the process again.

The diagram below shows this agentic flow.

Modifying the Application

Directly within the app you can:
- Change the prompts for the different components, e.g. the hallucination grader.
- Change the webpages and pdfs you want to use for the context in the RAG.
- Select different endpoints from build.nvidia.com for the inference components.
- Configure it to use self-hosted endpoints with NVIDIA Inference Microservices (NIMs) or Ollama.
You can also modify the application code to:
- Add new endpoints and endpoint providers
- Change the Gradio interface or the application structure and logic.

Note Setting up self-hosted endpoints is relatively advanced because you will need to do it manually.

Get Started

The quickest path is with the pre-configured build.nvidia.com endpoints.

Prerequisites for Using Pre-configured Endpoints

Install AI Workbench.
Get an NVIDIA Developer Account and an API key.
- Go to build.nvidia.com and click Login.
- Create account, verify email.
- Make a Cloud Account.
- Click your initial > API Keys.
- Create and save your key.
Get a Tavily account and an API key.
- Go to Tavily and create an account.
- Create an API key on the overview page.
Have some pdfs or web pages to put in the RAG.
NVIDIA Employees: Configure INTERNAL_API API key to use internal endpoints instead of public ones.

Opening the Chat

Open NVIDIA AI Workbench. Select a location to work in.
Use the repository URL to clone this project with AI Workbench and wait for it to build.
Add your NVIDIA API key and the Tavily API key when prompted.
Open the Chat from Workbench. It should automatically open in a new browser tab.
Upload your documents and change the Router prompt to focus on your uploaded documents.
Start chatting.

Deep Dive on Self-Hosted Endpoints

Note This assumes you've done the Get Started steps.

Using Self-Hosted Endpoints

You can configure any or all pipeline components (Router, Generator, Retrieval, Hallucination Check, Answer Check) to use self-hosted endpoints independently. This means you can mix and match between hosted and self-hosted components based on your needs. The application includes built-in GPU compatibility checking to help you select appropriate models for your hardware configuration.

Prerequisites:

NVIDIA GPU(s) with appropriate VRAM
Ubuntu 22.04 or later with latest NVIDIA drivers
Docker and NVIDIA Container Toolkit

To set up NIM endpoints for your components:

Check the NIM documentation for detailed setup instructions
For each component you want to self-host:
- Select "NIM Endpoints" in the component's configuration
- Choose your GPU type and count - the UI will automatically show only compatible models
- Enter your endpoint details (host, port)
Components not set to self-hosted will continue using their configured cloud endpoints

The application will validate your GPU configuration for each component and prevent incompatible model selections. You can use different GPU configurations for different components based on their computational needs.

License

This NVIDIA AI Workbench example project is under the Apache 2.0 License

This project may utilize additional third-party open source software projects. Review the license terms of these open source projects before use. Third party components used as part of this project are subject to their separate legal notices or terms that accompany the components. You are responsible for confirming compliance with third-party component license terms and requirements.

❓ Have Questions?
Please direct any issues, fixes, suggestions, and discussion on this project to the DevZone Members Only Forum thread here

Name		Name	Last commit message	Last commit date
Latest commit History 59 Commits
.project		.project
code		code
data		data
models		models
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE.txt		LICENSE.txt
README.md		README.md
apt.txt		apt.txt
postBuild.bash		postBuild.bash
preBuild.bash		preBuild.bash
requirements.txt		requirements.txt
variables.env		variables.env

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview: An Easy Button for Agentic RAG

Need, Don't Need and Nice to Have

The Agentic RAG Application

Using the Application

Modifying the Application

Get Started

Prerequisites for Using Pre-configured Endpoints

Opening the Chat

Deep Dive on Self-Hosted Endpoints

Using Self-Hosted Endpoints

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Overview: An Easy Button for Agentic RAG

Need, Don't Need and Nice to Have

The Agentic RAG Application

Using the Application

Modifying the Application

Get Started

Prerequisites for Using Pre-configured Endpoints

Opening the Chat

Deep Dive on Self-Hosted Endpoints

Using Self-Hosted Endpoints

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages