Update jupyter notebook references (#227)
Signed-off-by: Emmanuel Ferdman <[email protected]>
emmanuel-ferdman authored Nov 6, 2024
1 parent 1054225 commit 2cfd448
Show file tree
Hide file tree
Showing 3 changed files with 3 additions and 3 deletions.
finetuning/Codegemma/lora.ipynb (2 changes: 1 addition, 1 deletion)
@@ -9,7 +9,7 @@
"\n",
"CodeGemma is a groundbreaking new open model in the Gemini family of models from Google. CodeGemma is just as powerful as previous models but compact enough to run locally on NVIDIA RTX GPUs. CodeGemma is available in 2 sizes: 2B and 7B parameters. With NVIDIA NeMo, you can customize CodeGemma to fit your usecase and deploy an optimized model on your NVIDIA GPU.\n",
"\n",
"In this tutorial, we'll go over a specific kind of customization -- Low-rank adapter tuning to follow a specific output format (also known as LoRA). To learn how to perform full parameter supervised fine-tuning for instruction following (also known as SFT), see the [SFT notebook on Gemma Base Model](https://github.com/NVIDIA/GenerativeAIExamples/blob/main/models/Gemma/sft.ipynb). For LoRA, we'll perform all operations within the notebook on a single GPU. The compute resources needed for training depend on which CodeGemma model you use. For the 7 billion parameter variant, you'll need a GPU with 80GB of memory. For the 2 billion parameter model, 40GB will do.\n",
"In this tutorial, we'll go over a specific kind of customization -- Low-rank adapter tuning to follow a specific output format (also known as LoRA). To learn how to perform full parameter supervised fine-tuning for instruction following (also known as SFT), see the [SFT notebook on Gemma Base Model](https://github.com/NVIDIA/GenerativeAIExamples/blob/main/finetuning/Gemma/sft.ipynb). For LoRA, we'll perform all operations within the notebook on a single GPU. The compute resources needed for training depend on which CodeGemma model you use. For the 7 billion parameter variant, you'll need a GPU with 80GB of memory. For the 2 billion parameter model, 40GB will do.\n",
"\n",
"We'll also learn how to export your custom model to TensorRT-LLM, an open-source library that accelerates and optimizes inference performance of the latest LLMs on the NVIDIA AI platform."
]
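For context on the LoRA customization referenced in the cell above, here is a minimal, framework-free sketch of the low-rank update that LoRA applies to a frozen weight matrix. The shapes, scaling convention, and all numbers below are illustrative assumptions and are not taken from the notebook or from NeMo.

```python
import numpy as np

# Toy illustration of LoRA: the pretrained weight W stays frozen and only two
# small factors A and B are trained, giving r * (d_in + d_out) trainable
# parameters instead of d_in * d_out. Dimensions here are made up for the example.
d_out, d_in, r, alpha = 64, 128, 8, 16

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                # trainable, initialized to zero

# Effective weight used in the forward pass after adaptation.
W_eff = W + (alpha / r) * (B @ A)

x = rng.normal(size=(d_in,))
y = W_eff @ x                           # forward pass with the adapted weight
print(W_eff.shape, y.shape)
```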
finetuning/StarCoder2/inference.ipynb (2 changes: 1 addition, 1 deletion)
@@ -11,7 +11,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In the previous [notebook](https://github.com/NVIDIA/GenerativeAIExamples/blob/main/models/StarCoder2/lora.ipynb), we show how to parameter efficiently finetune StarCoder2 model with a custom code (instruction, completion) pair dataset. We choose LoRA as our PEFT algorithnm and finetune for 50 interations. In this notebook, the goal is to demonstrate how to compile fintuned .nemo model into optimized TensorRT-LLM engines. The converted model engine can perform accelerated inference locally or be deployed to Triton Inference Server."
"In the previous [notebook](https://github.com/NVIDIA/GenerativeAIExamples/blob/main/finetuning/StarCoder2/lora.ipynb), we show how to parameter efficiently finetune StarCoder2 model with a custom code (instruction, completion) pair dataset. We choose LoRA as our PEFT algorithnm and finetune for 50 interations. In this notebook, the goal is to demonstrate how to compile fintuned .nemo model into optimized TensorRT-LLM engines. The converted model engine can perform accelerated inference locally or be deployed to Triton Inference Server."
]
},
{
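As a rough sketch of the compile step described above, NeMo provides a TensorRT-LLM exporter that builds engines from a .nemo checkpoint. The import path, class name, argument names, model_type value, and file paths below are assumptions that vary across NeMo releases; check the export documentation for your version before relying on them.

```python
# Hedged sketch only: API names and arguments are assumptions and differ across NeMo releases.
from nemo.export import TensorRTLLM  # newer releases may expose this under nemo.export.tensorrt_llm

# Directory that will hold the generated TensorRT-LLM engine files (placeholder path).
exporter = TensorRTLLM(model_dir="/workspace/starcoder2_trt_llm")

# Compile the finetuned checkpoint into an engine. The model_type value and the
# GPU-count argument name (n_gpus vs. tensor_parallelism_size) depend on the NeMo version.
exporter.export(
    nemo_checkpoint_path="/workspace/starcoder2_finetuned.nemo",  # placeholder path
    model_type="starcoder",
    n_gpus=1,
)
```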
finetuning/StarCoder2/lora.ipynb (2 changes: 1 addition, 1 deletion)
@@ -17,7 +17,7 @@
"\n",
"In this tutorial, we'll go over a popular Parameter-Efficient Fine-Tuning (PEFT) customization technique -- i.e. Low-Rank Adaptation (also known as LoRA) which enables the already upgraded StarCoder2 model to learn a new coding language or coding style.\n",
"\n",
"Note that the subject 15B StarCoder2 model takes 30GB disk space and requires more than 80GB CUDA memory while performing PEFT on a single GPU. Therefore, the verified hardware configuration for this notebook and the subsequent [inference notebook](https://github.com/NVIDIA/GenerativeAIExamples/blob/main/models/StarCoder2/inference.ipynb) employ a single node machine with 8 80GB NVIDIA GPUs."
"Note that the subject 15B StarCoder2 model takes 30GB disk space and requires more than 80GB CUDA memory while performing PEFT on a single GPU. Therefore, the verified hardware configuration for this notebook and the subsequent [inference notebook](https://github.com/NVIDIA/GenerativeAIExamples/blob/main/finetuning/StarCoder2/inference.ipynb) employ a single node machine with 8 80GB NVIDIA GPUs."
]
},
{
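The 30GB checkpoint size and the more-than-80GB CUDA memory requirement quoted above line up with a quick back-of-the-envelope calculation. The sketch below is illustrative arithmetic only, assuming 2-byte (bf16) weights; it is not taken from the notebook.

```python
# Back-of-the-envelope memory estimate for a 15B-parameter model (illustrative only).
params = 15e9
bytes_per_param_bf16 = 2
weights_gb = params * bytes_per_param_bf16 / 1e9   # ~30 GB, matching the stated checkpoint size
print(f"weights alone: ~{weights_gb:.0f} GB")
# Training adds gradients, optimizer state, and activations on top of the weights,
# which is why even PEFT on a single 80 GB GPU exceeds the available memory here.
```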
