Commit fc80791

Revert "Updated 5_minutes_RAG_no_GPU (#239)" (#247)
This reverts commit 169abdd.
1 parent c52f37f commit fc80791

5 files changed: +47 -197 lines changed

community/5_mins_rag_no_gpu/.streamlit/config.toml

Lines changed: 0 additions & 9 deletions
This file was deleted.
Lines changed: 18 additions & 12 deletions
@@ -1,13 +1,17 @@
-# Tutorial for a Generic RAG-Based Chatbot
+# RAG in 5 Minutes
 
-This is a tutorial for how to build your own generic RAG chatbot. It is intended as a foundation for building more complex, domain-specific RAG bots. Note that no GPU is needed to run this as it is using NIMs from the NVIDIA catalog.
+This implementation is tied to the [YouTube video on NVIDIA Developer](https://youtu.be/N_OOfkEWcOk).
 
-## Acknowledgements
+This is a simple standalone implementation showing a minimal RAG pipeline that uses models available from [NVIDIA API Catalog](https://catalog.ngc.nvidia.com/ai-foundation-models).
+The catalog enables you to experience state-of-the-art LLMs accelerated by NVIDIA.
+Developers get free credits for 10K requests to any of the models.
 
-- This implementation is based on [Rag in 5 Minutes](https://github.com/NVIDIA/GenerativeAIExamples/tree/4e86d75c813bcc41d4e92e430019053920d08c94/community/5_mins_rag_no_gpu), with changes primarily made to the UI.
-- Alyssa Sawyer also contributed to updating and further developing this repo during her intern project, [Resume RAG Bot](https://github.com/alysawyer/resume-rag-nv), at NVIDIA.
+The example uses an [integration package to LangChain](https://python.langchain.com/docs/integrations/providers/nvidia) to access the models.
+NVIDIA engineers develop, test, and maintain the open source integration.
+This example uses a simple [Streamlit](https://streamlit.io/) based user interface and has a one-file implementation.
+Because the example uses the models from the NVIDIA API Catalog, you do not need a GPU to run the example.
 
-## Steps
+### Steps
 
 1. Create a python virtual environment and activate it:
 
@@ -16,10 +20,10 @@ This is a tutorial for how to build your own generic RAG chatbot. It is intended
 source genai/bin/activate
 ```
 
-1. From the root of this repository, install the requirements:
+1. From the root of this repository, `GenerativeAIExamples`, install the requirements:
 
 ```console
-pip install -r requirements.txt
+pip install -r community/5_mins_rag_no_gpu/requirements.txt
 ```
 
 1. Add your NVIDIA API key as an environment variable:
@@ -28,15 +32,17 @@ This is a tutorial for how to build your own generic RAG chatbot. It is intended
 export NVIDIA_API_KEY="nvapi-*"
 ```
 
-If you don't already have an API key, visit the [NVIDIA API Catalog](https://build.ngc.nvidia.com/explore/), select on any model, then click on `Get API Key`.
+If you don't already have an API key, visit the [NVIDIA API Catalog](https://build.ngc.nvidia.com/explore/), select on any model, then click on `Get API Key`.
 
 1. Run the example using Streamlit:
 
 ```console
-streamlit run main.py
+streamlit run community/5_mins_rag_no_gpu/main.py
 ```
 
 1. Test the deployed example by going to `http://<host_ip>:8501` in a web browser.
 
-Click **Browse Files** and select the documents for your knowledge base.
-After selecting, click **Upload!** to complete the ingestion process.
+Click **Browse Files** and select your knowledge source.
+After selecting, click **Upload!** to complete the ingestion process.
+
+You are all set now! Try out queries related to the knowledge base using text from the user interface.
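
The README step that exports `NVIDIA_API_KEY` can be checked before launching Streamlit. The following is a minimal sketch, not part of this commit: `check_nvidia_api_key` is a hypothetical helper, and the `nvapi-` prefix test is an assumption based only on the placeholder value shown in the README.

```python
import os


def check_nvidia_api_key(env=None):
    """Return the key if it is set and starts with "nvapi-", else raise.

    Hypothetical helper: the prefix check is an assumption taken from the
    README placeholder `nvapi-*`, not a documented validation rule.
    """
    env = os.environ if env is None else env
    key = env.get("NVIDIA_API_KEY", "")
    if not key.startswith("nvapi-"):
        raise RuntimeError(
            "NVIDIA_API_KEY is missing or does not start with 'nvapi-'; "
            "export it before running the example."
        )
    return key


if __name__ == "__main__":
    try:
        check_nvidia_api_key()
        print("NVIDIA_API_KEY looks set.")
    except RuntimeError as err:
        print(err)
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.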

community/5_mins_rag_no_gpu/main.py

Lines changed: 27 additions & 93 deletions
@@ -13,176 +13,110 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-# This is a simple standalone implementation showing rag pipeline using Nvidia AI Foundational Models.
+# This is a simple standalone implementation showing rag pipeline using Nvidia AI Foundational models.
 # It uses a simple Streamlit UI and one file implementation of a minimalistic RAG pipeline.
 
-
-############################################
-# Component #0.5 - UI / Header
-############################################
-
 import streamlit as st
 import os
+from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
+from langchain.text_splitter import CharacterTextSplitter
+from langchain_community.document_loaders import DirectoryLoader
+from langchain_community.vectorstores import FAISS
+import pickle
+from langchain_core.output_parsers import StrOutputParser
+from langchain_core.prompts import ChatPromptTemplate
 
-# Page settings
-st.set_page_config(
-    layout="wide",
-    page_title="RAG Chatbot",
-    page_icon = "🤖",
-    initial_sidebar_state="expanded")
-
-# Page title
-st.header('Generic RAG Chatbot Demo 🤖📝', divider='rainbow')
-
-# Custom CSS
-def local_css(file_name):
-    with open(file_name, "r") as f:
-        st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)
-local_css("style.css")
-
-# Page description
-st.markdown('''Manually looking through vast amounts of data can be tedious and time-consuming. This chatbot can expedite that process by providing a platform to query your documents.''')
-st.warning("This is a proof of concept, and any output from the AI agent should be used in conjunction with the original data.", icon="⚠️")
-
-############################################
-# Component #1 - Document Loader
-############################################
+st.set_page_config(layout="wide")
 
+# Component #1 - Document Upload
 with st.sidebar:
-    st.subheader("Upload Your Documents")
-
     DOCS_DIR = os.path.abspath("./uploaded_docs")
-
-    # Make dir to store uploaded documents
     if not os.path.exists(DOCS_DIR):
         os.makedirs(DOCS_DIR)
-
-    # Define form on Streamlit page for uploading files to KB
     st.subheader("Add to the Knowledge Base")
     with st.form("my-form", clear_on_submit=True):
         uploaded_files = st.file_uploader("Upload a file to the Knowledge Base:", accept_multiple_files=True)
         submitted = st.form_submit_button("Upload!")
 
-    # Acknowledge successful file uploads
     if uploaded_files and submitted:
         for uploaded_file in uploaded_files:
             st.success(f"File {uploaded_file.name} uploaded successfully!")
             with open(os.path.join(DOCS_DIR, uploaded_file.name), "wb") as f:
                 f.write(uploaded_file.read())
 
-############################################
-# Component #2 - Initalizing Embedding Model and LLM
-############################################
+# Component #2 - Embedding Model and LLM
+llm = ChatNVIDIA(model="meta/llama3-70b-instruct")
+document_embedder = NVIDIAEmbeddings(model="nvidia/nv-embedqa-e5-v5", model_type="passage")
 
-from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
-
-#Make sure to export your NGC NV-Developer API key as NVIDIA_API_KEY!
-API_KEY = os.environ['NVIDIA_API_KEY']
-
-# Select embedding model and LLM
-document_embedder = NVIDIAEmbeddings(model="NV-Embed-QA", api_key=API_KEY, model_type="passage", truncate="END")
-llm = ChatNVIDIA(model="meta/llama3-70b-instruct", api_key=API_KEY, temperature=0)
-
-############################################
 # Component #3 - Vector Database Store
-############################################
-
-import pickle
-from langchain.text_splitter import RecursiveCharacterTextSplitter
-from langchain_community.document_loaders import DirectoryLoader
-from langchain_community.vectorstores import FAISS
-from langchain_core.output_parsers import StrOutputParser
-from langchain_core.prompts import ChatPromptTemplate
-from langchain_core.retrievers import BaseRetriever
-
-# Option for using an existing vector store
 with st.sidebar:
     use_existing_vector_store = st.radio("Use existing vector store if available", ["Yes", "No"], horizontal=True)
 
-# Load raw documents from the directory
-DOCS_DIR = os.path.abspath("./uploaded_docs")
+vector_store_path = "vectorstore.pkl"
 raw_documents = DirectoryLoader(DOCS_DIR).load()
 
-# Check for existing vector store file
-vector_store_path = "vectorstore.pkl"
 vector_store_exists = os.path.exists(vector_store_path)
 vectorstore = None
-
 if use_existing_vector_store == "Yes" and vector_store_exists:
-    # Load existing vector store
     with open(vector_store_path, "rb") as f:
         vectorstore = pickle.load(f)
     with st.sidebar:
-        st.info("Existing vector store loaded successfully.")
+        st.success("Existing vector store loaded successfully.")
 else:
     with st.sidebar:
         if raw_documents and use_existing_vector_store == "Yes":
-            # Chunk documents
             with st.spinner("Splitting documents into chunks..."):
-                text_splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=100)
+                text_splitter = CharacterTextSplitter(chunk_size=512, chunk_overlap=200)
                 documents = text_splitter.split_documents(raw_documents)
 
-            # Convert document chunks to embeddings, and save in a vector store
             with st.spinner("Adding document chunks to vector database..."):
                 vectorstore = FAISS.from_documents(documents, document_embedder)
 
-            # Save vector store
             with st.spinner("Saving vector store"):
                 with open(vector_store_path, "wb") as f:
                     pickle.dump(vectorstore, f)
             st.success("Vector store created and saved.")
         else:
             st.warning("No documents available to process!", icon="⚠️")
 
-############################################
 # Component #4 - LLM Response Generation and Chat
-############################################
-
-st.subheader("Query your data")
+st.subheader("Chat with your AI Assistant, Envie!")
 
-# Save chat history for this user session
 if "messages" not in st.session_state:
     st.session_state.messages = []
 
 for message in st.session_state.messages:
     with st.chat_message(message["role"]):
         st.markdown(message["content"])
 
-# Define prompt for LLM
 prompt_template = ChatPromptTemplate.from_messages([
-    ("system", "You are a helpful AI assistant. Use the provided context to inform your responses. If no context is available, please state that."),
+    ("system", "You are a helpful AI assistant named Envie. If provided with context, use it to inform your responses. If no context is available, use your general knowledge to provide a helpful response."),
     ("human", "{input}")
 ])
 
-# Define simple prompt chain
 chain = prompt_template | llm | StrOutputParser()
 
-# Display an example query for user
-user_query = st.chat_input("Please summarize these documents.")
+user_input = st.chat_input("Can you tell me what NVIDIA is known for?")
 
-if user_query:
-    st.session_state.messages.append({"role": "user", "content": user_query})
+if user_input:
+    st.session_state.messages.append({"role": "user", "content": user_input})
     with st.chat_message("user"):
-        st.markdown(user_query)
+        st.markdown(user_input)
 
     with st.chat_message("assistant"):
         message_placeholder = st.empty()
         full_response = ""
 
         if vectorstore is not None and use_existing_vector_store == "Yes":
-            # Retrieve relevant chunks for the given user query from the vector store
             retriever = vectorstore.as_retriever()
-            retrieved_docs = retriever.invoke(user_query)
-
-            # Concatenate retrieved chunks together as context for LLM
-            context = "\n\n".join([doc.page_content for doc in retrieved_docs])
-            augmented_user_input = f"Context: {context}\n\nQuestion: {user_query}\n"
+            docs = retriever.invoke(user_input)
+            context = "\n\n".join([doc.page_content for doc in docs])
+            augmented_user_input = f"Context: {context}\n\nQuestion: {user_input}\n"
         else:
-            augmented_user_input = f"Question: {user_query}\n"
+            augmented_user_input = f"Question: {user_input}\n"
 
-        # Get output from LLM
         for response in chain.stream({"input": augmented_user_input}):
             full_response += response
             message_placeholder.markdown(full_response + "▌")
         message_placeholder.markdown(full_response)
-    st.session_state.messages.append({"role": "assistant", "content": full_response})
+    st.session_state.messages.append({"role": "assistant", "content": full_response})
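
The retrieval branch restored by this revert joins the retrieved chunks into a context string and falls back to a bare question when no vector store is in use. The following is a minimal sketch of that prompt assembly in isolation, mirroring the f-strings in `main.py`. `Doc` and `build_augmented_input` are hypothetical names introduced for illustration; `Doc` stands in for a retrieved document and only models its `page_content` attribute.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document; only page_content is used."""
    page_content: str


def build_augmented_input(user_input: str, docs: Optional[List[Doc]] = None) -> str:
    """Build the string passed as {input} to the prompt template.

    docs=None models the branch where no vector store is available,
    so the question is sent without any context.
    """
    if docs is not None:
        # Same joining rule as main.py: chunks separated by a blank line.
        context = "\n\n".join(doc.page_content for doc in docs)
        return f"Context: {context}\n\nQuestion: {user_input}\n"
    return f"Question: {user_input}\n"


if __name__ == "__main__":
    chunks = [Doc("First chunk."), Doc("Second chunk.")]
    print(build_augmented_input("What do the documents say?", chunks))
    print(build_augmented_input("What do the documents say?"))
```

Keeping this step as a pure function makes the context formatting testable without calling the embedding model or the LLM.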
Lines changed: 2 additions & 10 deletions
@@ -1,13 +1,5 @@
-streamlit
+streamlit==1.30.0
 faiss-cpu==1.7.4
+langchain==0.1.20
 unstructured[all-docs]==0.11.2
-langchain
-langchain-community
-langchain-core
 langchain-nvidia-ai-endpoints
-langchain-text-splitters
-nltk==3.8.1
-numpy==1.23.5
-onnx==1.16.1
-onnxruntime==1.15.1
-python-magic

community/5_mins_rag_no_gpu/style.css

Lines changed: 0 additions & 73 deletions
This file was deleted.
