Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
root authored and root committed Jan 14, 2025
0 parents, commit ae3545d
Showing 23 changed files with 1,026 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,166 @@ | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
share/python-wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
*.py,cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
cover/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
db.sqlite3 | ||
db.sqlite3-journal | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
.pybuilder/ | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# IPython | ||
profile_default/ | ||
ipython_config.py | ||
|
||
# pyenv | ||
# For a library or package, you might want to ignore these files since the code is | ||
# intended to run in multiple environments; otherwise, check them in: | ||
# .python-version | ||
|
||
# pipenv | ||
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. | ||
# However, in case of collaboration, if having platform-specific dependencies or dependencies | ||
# having no cross-platform support, pipenv may install dependencies that don't work, or not | ||
# install all needed dependencies. | ||
#Pipfile.lock | ||
|
||
# poetry | ||
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. | ||
# This is especially recommended for binary packages to ensure reproducibility, and is more | ||
# commonly ignored for libraries. | ||
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control | ||
#poetry.lock | ||
|
||
# pdm | ||
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. | ||
#pdm.lock | ||
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it | ||
# in version control. | ||
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control | ||
.pdm.toml | ||
.pdm-python | ||
.pdm-build/ | ||
|
||
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm | ||
__pypackages__/ | ||
|
||
# Celery stuff | ||
celerybeat-schedule | ||
celerybeat.pid | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
.venv/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
|
||
# Pyre type checker | ||
.pyre/ | ||
|
||
# pytype static type analyzer | ||
.pytype/ | ||
|
||
# Cython debug symbols | ||
cython_debug/ | ||
|
||
# PyCharm | ||
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can | ||
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore | ||
# and can be added to the global gitignore or merged into this file. For a more nuclear | ||
# option (not recommended) you can uncomment the following to ignore the entire idea folder. | ||
#.idea/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
[theme] | ||
base="dark" | ||
primaryColor="#0084d5" | ||
backgroundColor="#0e1117" | ||
secondaryBackgroundColor="#001f3f" | ||
|
||
[server] | ||
port=8501 # Port the app listens on; 8501 is Streamlit's default | ||
headless=true # Do not open a browser window automatically on startup | ||
# enableCORS=false | ||
# enableXsrfProtection=false | ||
# enableWebsocketCompression=false | ||
|
||
# Uncomment the following block to deploy behind an nginx server: | ||
|
||
#[browser] # This address and port are shown in the command prompt | ||
#serverAddress = "10.10.0.218" # Your local IP or domain name | ||
#serverPort = 9000 # nginx port | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
# Welcome to the Arrow RAG Assistant Demo! | ||
|
||
This demo shows how to easily deploy and customize a **Retrieval Augmented Generation (RAG)** model, either through **NVIDIA NIMs** or locally on **NVIDIA DGX** or other local hardware, enabling high-performance inference for enterprise use cases. | ||
|
||
|
||
RAG, or Retrieval Augmented Generation, is a framework that combines the strengths of large language models (LLMs) and external knowledge retrieval systems. It allows the model to fetch relevant information from external sources, such as databases or knowledge bases, during the generation process, improving accuracy and relevance while reducing hallucinations. | ||
|
||
 | ||
|
||
## Features | ||
|
||
This demo offers flexibility and customization in the following areas: | ||
|
||
**Deployment Options**: Choose between hosting the LLM locally or using NVIDIA NIMs. | ||
|
||
 | ||
|
||
**Retrieval Sources**: Select from different retrieval sources: upload a PDF, use the preselected PDFs about Arrow Electronics in the "docs" folder, or specify URLs in the rag_engine.py file. | ||
|
||
 | ||
|
||
**LLM Selection**: The demo currently features LLaMA 3.1, Phi 3.5, and Gemma 2. However, you can swap out these models for any compatible LLM. | ||
|
||
**LLM Parameters**: Use the interactive slider to adjust the model's temperature. The code can also be customized to expose additional parameters such as top-k or top-p, giving finer control over the model's output. | ||
|
||
 | ||
|
||
|
||
**Application & UI Customization**: | ||
|
||
This demo is built using LangChain for the RAG process and Streamlit for the frontend, providing a seamless, interactive experience. You can personalize the theme, font, and branding to suit your preferences. | ||
|
||
|
||
## Running the Demo | ||
|
||
Follow these steps to set up and run the demo: | ||
|
||
**1. Install Ollama and LLM Models** | ||
|
||
Install [Ollama](https://ollama.com) and ensure the LLM models you want to use are installed. Run the following command for each model: | ||
|
||
```ollama pull <modelname>``` | ||
|
||
Example: | ||
|
||
```ollama pull llama3.1``` | ||
|
||
```ollama pull phi3.5``` | ||
|
||
```ollama pull gemma2``` | ||
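If you prefer to script the pulls, the same commands can be driven from Python. This is a hedged sketch rather than part of the demo: `pull_command` and `pull_models` are hypothetical helpers, and by default the script only prints the commands (`dry_run=True`) so it is safe to run on a machine without Ollama.

```python
import shutil
import subprocess

MODELS = ["llama3.1", "phi3.5", "gemma2"]  # the models named in this README


def pull_command(model: str) -> list[str]:
    """Build the `ollama pull <modelname>` command for one model."""
    return ["ollama", "pull", model]


def pull_models(models: list[str], dry_run: bool = True) -> list[list[str]]:
    """Pull each model; with dry_run=True only return the commands."""
    commands = [pull_command(m) for m in models]
    if not dry_run:
        if shutil.which("ollama") is None:
            raise FileNotFoundError("ollama CLI not found on PATH")
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stops at the first failed pull
    return commands


for cmd in pull_models(MODELS):
    print(" ".join(cmd))
```

Call `pull_models(MODELS, dry_run=False)` once Ollama is installed to download the models.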
|
||
|
||
**2. Install Required Packages** | ||
|
||
Install the necessary Python packages listed in requirements.txt: | ||
|
||
```pip install -r requirements.txt``` | ||
|
||
**3. Set NVIDIA_API_KEY** | ||
|
||
Generate your API key from the [NVIDIA NIMs API Catalog](https://build.nvidia.com/explore/discover) and run the following command in your environment's terminal, so that the variable is exported to the processes you start from it: | ||
|
||
||
```export NVIDIA_API_KEY='nvapi-???'``` | ||
|
||
This step is required to run the demo properly. | ||
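A quick way to confirm the key is visible to Python before launching the app is to read it from the environment. The helpers below are hypothetical and not part of the demo; the sketch only assumes the variable is named `NVIDIA_API_KEY`, as stated above, and masks the value so it is safe to print.

```python
import os


def get_nvidia_api_key(env=os.environ) -> str:
    """Return NVIDIA_API_KEY or raise a clear error if it is missing."""
    key = env.get("NVIDIA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("NVIDIA_API_KEY is not set in this environment.")
    return key


def mask(key: str) -> str:
    """Show only the first few characters so the key is safe to log."""
    return key[:6] + "..." if len(key) > 6 else "..."


try:
    print("Using key:", mask(get_nvidia_api_key()))
except RuntimeError as err:
    print(err)
```

If the error message appears, the variable was not set in the terminal session that runs Python.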
|
||
**4. Run the Frontend** | ||
Start the Streamlit app: | ||
|
||
```streamlit run rag.py``` | ||
|
||
**5. Open Port 8501** | ||
|
||
||
Make sure port 8501 is open on your machine so you can access the demo in your browser. You can change the port in the config.toml file in the .streamlit folder. | ||
|
||
||
**6. Experiment with the Demo** | ||
|
||
Explore the demo and experiment with different deployment, retrieval, and model options to see how RAG can improve inference for your use cases. | ||
|
||
 | ||
|
||
 | ||
|
||
## Additional Resources | ||
|
||
* [NVIDIA NIM API Catalog](https://build.nvidia.com/explore/discover) | ||
|
||
* [LangChain Documentation](https://python.langchain.com/docs/introduction/) | ||
|
||
* [Streamlit Documentation](https://docs.streamlit.io) | ||
|
||
* [Ollama Installation](https://ollama.com) |
Binary file not shown.