Add RAG Example using FAISS and Harmony Prompts #207
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Overview
This PR introduces a minimal Retrieval-Augmented Generation (RAG) example that integrates FAISS-based retrieval with gpt-oss models using Harmony-style prompts.
It is completely self-contained, non-invasive, and designed as an educational reference for ML engineers who want to ground open LLMs in local or private data sources.
🧠 What’s Included
New files only (no core modifications):
examples/rag_gpt_oss.py
— main example script implementing FAISS indexing, retrieval, and Harmony promptingexamples/utils/harmony_helpers.py
— helper functions for constructing and validating Harmony-formatted messagesexamples/requirements-rag.txt
— isolated dependencies for RAG exampleexamples/data/
— small local documents for FAISS indexing and retrievaldocs/examples/rag_gpt_oss.md
— setup and usage guide⚙️ Key Features
examples/data/.faiss/
)all-MiniLM-L6-v2
) for lightweight retrievalOPENAI_BASE_URL
OPENAI_API_KEY
GPT_OSS_MODEL
--no-stream
inference modesexamples/data/runs/
) with metadata and latency🧩 Example Usage
✅ Validation Checklist
Before submitting the PR, the following items have been verified:
examples/
,examples/utils/
,examples/data/
, anddocs/examples/
pyproject.toml
, core libraries, or CI configurationexamples/requirements-rag.txt
OPENAI_BASE_URL
OPENAI_API_KEY
GPT_OSS_MODEL
harmony_helpers.py
--no-stream
work as expectedexamples/data/runs/
with latency and metadatadocs/examples/rag_gpt_oss.md
black
and checked withruff
(if available)transformers
andvLLM
backends