
Commit f7616fc

Merge branch 'feature/nlp_gpt3mix' of https://github.com/ml6team/quick-tips into feature/nlp_gpt3mix

2 parents: 3b3ef7e + bc5045e

File tree

1 file changed (+1 −1 lines)

nlp/2021_11_25_augmentation_lm/README.md

Lines changed: 1 addition & 1 deletion
@@ -5,4 +5,4 @@ Typically, the more data we have, the better performance we can achieve 🤙. Ho
 Large-scale language models (LMs) are excellent few-shot learners, allowing them to be controlled via natural text prompts. In this tip, we leverage three large-scale LMs (GPT-3, GPT-J and GPT-Neo) and prompt engineering to generate very realistic samples from a very small dataset. The model takes two real samples from our dataset as input, embeds them in a carefully designed prompt, and generates an augmented, mixed sample influenced by the seed sentences. We use the [Emotion](https://huggingface.co/datasets/emotion) dataset and a distilled BERT pre-trained model, and show that this augmentation method boosts model performance while generating very realistic samples. For more information on text augmentation using large-scale LMs, check [GPT3Mix](https://arxiv.org/pdf/2104.08826.pdf).
 
 We recommend opening the notebook in Colab for an interactive, explainable experience and optimal rendering of the visuals 👇:
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ml6team/quick-tips/blob/feature%2Fnlp_gpt3mix/nlp/2021_11_25_gpt3mix/nlp_gpt3mix.ipynb)
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ml6team/quick-tips/blob/main/nlp/2021_11_25_augmentation_lm/nlp_augmentation_lm.ipynb)
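
As a rough, illustrative sketch of the prompt-based mixing described in the README excerpt above: the snippet below builds a GPT3Mix-style prompt from two labelled seed samples and asks a small GPT-Neo model (as a stand-in for the larger GPT-3/GPT-J models mentioned in the tip) to continue it with a new, mixed sample. The prompt template, label names, and example texts are assumptions made here for illustration and are not the exact ones used in the notebook.

```python
# Minimal GPT3Mix-style augmentation sketch (assumed prompt template,
# not the exact one from the notebook).
from transformers import pipeline

# Two real labelled samples from a small emotion-classification dataset
# (hypothetical example texts).
seed_examples = [
    ("i feel like my heart could burst with happiness", "joy"),
    ("i am so furious that nobody listened to me", "anger"),
]

# Embed the seed samples in a prompt that asks the LM for a new, mixed sample.
prompt = "Each item is a short text and its emotion label.\n"
for text, label in seed_examples:
    prompt += f"Text: {text}\nEmotion: {label}\n"
prompt += "Text:"  # the LM continues with an augmented sample

# Small GPT-Neo checkpoint as a stand-in for the large-scale LMs in the tip.
generator = pipeline("text-generation", model="EleutherAI/gpt-neo-125M")
output = generator(prompt, max_new_tokens=40, do_sample=True, temperature=0.9)

# Keep only the newly generated continuation (the augmented sample).
print(output[0]["generated_text"][len(prompt):])
```

Samples generated this way can then be added to the small training set used to fine-tune the distilled BERT classifier mentioned above.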
