Commit 4606f12

Readme update (#9)
Update to readme with tables and new examples
1 parent 54b29a8 commit 4606f12

1 file changed: +31 -22 lines changed

README.md

Lines changed: 31 additions & 22 deletions
@@ -1,39 +1,48 @@
 # NVIDIA Generative AI Examples
 
 ## Introduction
+State-of-the-art Generative AI examples that are easy to deploy, test, and extend. All examples run on the high performance NVIDIA CUDA-X software stack and NVIDIA GPUs.
 
-This repository provides Generative AI examples targetted for different usecases. Modern enterprise applications are becoming more cloud-native and based on a microservices architecture. Microservices, by definition, consist of a collection of small independent services that communicate over well-defined APIs. AI applications, in most instances, adhere well to this same architectural design, as there are typically many different components that all need to work together in both training and inferencing workflows.
+## NVIDIA NGC
+Generative AI Examples uses resources from the [NVIDIA NGC AI Development Catalog](https://ngc.nvidia.com).
 
-To deploy an application in a production environment, the application must also meet the following criteria:
+Sign up for a [free NGC developer account](https://ngc.nvidia.com/signin) to access:
 
-- Reliability
-- Security
-- Performance
-- Scalability
-- Interoperability
+- The GPU-optimized NVIDIA containers, models, scripts, and tools used in these examples
+- The latest NVIDIA upstream contributions to the respective programming frameworks
+- The latest NVIDIA Deep Learning and LLM software libraries
+- Release notes for each of the NVIDIA optimized containers
+- Links to developer documentation
 
-## What are NVIDIA AI Workflows?
------------------------------
-NVIDIA AI Workflows are intended to provide reference solutions of how to leverage NVIDIA frameworks to build AI solutions for solving common use cases. These workflows provide guidance like fine tuning and AI model creation to build upon NVIDIA frameworks. The pipelines to create applications are highlighted, as well as opinions on how to deploy customized applications and integrate them with various components typically found in enterprise environments, such as components for orchestration and management, storage, security, networking, etc.
+## Retrieval Augmented Generation (RAG)
 
-By leveraging an AI workflow for your specific use case, you can streamline development of AI solutions following the example provided by the workflow to:
+A RAG pipeline embeds multimodal data -- such as documents, images, and video -- into a database connected to a Large Language Model. RAG lets users use an LLM to chat with their own data.
 
-- Reduce development time, at lower cost
-- Improve accuracy and performance
-- Gain confidence in outcome, by leveraging NVIDIA AI expertise
+| Name | Description | LLM | Framework | Multi-GPU | Multi-node | Embedding | TRT-LLM | Triton | VectorDB | K8s |
+|---------------|-----------------------|------------|-------------------------|-----------|------------|-------------|---------|--------|----------|-----|
+| [Linux developer RAG](https://github.com/NVIDIA/GenerativeAIExamples/tree/main/RetrievalAugmentedGeneration) | Single VM, single GPU | llama2-13b | Langchain + Llama Index | No | No | e5-large-v2 | Yes | Yes | Milvus | No |
+| [Windows developer RAG](https://github.com/NVIDIA/trt-llm-rag-windows) | RAG on Windows | llama2-13b | Llama Index | No | No | NA | Yes | No | FAISS | NA |
 
-Using the example workflow provided in this repository, you know exactly what AI framework to use, how to bring data into the pipeline, and what to do with the data output. AI Workflows are designed as microservices, which means they can be deployed on Kubernetes alone or with other microservices to create a production-ready application for seamless scaling. The workflow cloud deployable package can be used across different cloud instances and is automatable and interoperable.
 
-NVIDIA AI Workflows are available on NVIDIA NGC for [NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/) software customers.
+## Large Language Models
+NVIDIA LLMs are optimized for building enterprise generative AI applications.
 
-## Examples
---------------------------
+| Name | Description | Type | Context Length | Example | License |
+|---------------|-----------------------|------------|----------------|---------|---------|
+| [nemotron-3-8b-qa-4k](https://huggingface.co/nvidia/nemotron-3-8b-qa-4k) | Q&A LLM customized on knowledge bases | Text Generation | 4096 | No | [NVIDIA AI Foundation Models Community License Agreement](https://developer.nvidia.com/downloads/nv-ai-foundation-models-license) |
+| [nemotron-3-8b-chat-4k-steerlm](https://huggingface.co/nvidia/nemotron-3-8b-chat-4k-steerlm) | Best out-of-the-box chat model with flexible alignment at inference | Text Generation | 4096 | No | [NVIDIA AI Foundation Models Community License Agreement](https://developer.nvidia.com/downloads/nv-ai-foundation-models-license) |
+| [nemotron-3-8b-chat-4k-rlhf](https://huggingface.co/nvidia/nemotron-3-8b-chat-4k-rlhf) | Best out-of-the-box chat model performance| Text Generation | 4096 | No | [NVIDIA AI Foundation Models Community License Agreement](https://developer.nvidia.com/downloads/nv-ai-foundation-models-license) |
 
-This AI Workflow includes different examples illustrating generative AI workflow. While all should be relatively easy to follow, they are targeted towards different intended audiences. For more information about the detailed components and software stacks, please refer to the guides for each workflow.
 
-- [Retrieval Augmented Generation](./RetrievalAugmentedGeneration/README.md): A reference RAG workflow to a chatbot which can answer questions off public press releases & tech blogs.
+## Integration Examples
 
-*Note::*
+## NVIDIA support
+In each of the READMEs, we indicate the level of support provided.
+
+## Feedback / Contributions
+We're posting these examples on GitHub to better support the community, facilitate feedback, as well as collect and implement contributions using GitHub Issues and pull requests. We welcome all contributions!
+
+## Known issues
+- In each of the READMEs, we indicate any known issues and encourage the community to provide feedback.
 - The datasets provided as part of this project is under a different license for research and evaluation purposes.
 - This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
-- The components and instructions used in the workflow are intended to be used as examples for integration, and may not be sufficiently production-ready or enterprise ready on their own as stated. The workflow should be customized and integrated into one’s own infrastructure, using the workflow as reference. For example, all of the instructions in these workflows assume a single node infrastructure, whereas production deployments should be performed in a high availability (HA) environment.
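
The RAG section added to the README in this commit describes embedding data into a database connected to an LLM. As a rough illustration of that idea only, the following is a minimal, dependency-free Python sketch. It is not the repository's implementation: `embed`, `ToyVectorStore`, and `build_prompt` are hypothetical names, the bag-of-words "embedding" is a stand-in for a real model such as e5-large-v2, the in-memory list is a stand-in for a vector database such as Milvus or FAISS, and no LLM is actually called.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts.

    A real pipeline would call an embedding model instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, Counter]] = []

    def add(self, doc: str) -> None:
        # Store the document next to its embedding.
        self._rows.append((doc, embed(doc)))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Rank stored documents by similarity to the query embedding.
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(question: str, store: ToyVectorStore, k: int = 1) -> str:
    """Prepend the retrieved context to the question; an LLM would receive this prompt."""
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


store = ToyVectorStore()
store.add("Milvus is the vector database used by the Linux developer RAG example.")
store.add("FAISS is the vector database used by the Windows developer RAG example.")

question = "Which vector database does the Windows example use?"
prompt = build_prompt(question, store)
print(prompt)
```

The sketch keeps retrieval and prompt construction separate so that either stand-in could be replaced by a real embedding model or vector database without changing the other.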