nlp/2021_06_18_data_augmentation/README.md (1 addition & 1 deletion)
@@ -1,7 +1,7 @@
 # Transformer-based Data Augmentation
 
 Ever struggled with having a limited non-English NLP dataset for a project? 🤯 Fear not, data augmentation to the rescue ⛑️
 
-In this week's tip, we look at backtranslation 🔀 and contextual word embedding insertions as data augmentation techniques for multilingual NLP. We'll be using the MariaMT and distilled BERT pre-trained models, available on huggingface.
+In this week's tip, we look at backtranslation 🔀 and contextual word embedding insertions as data augmentation techniques for multilingual NLP. We'll be using the MarianMT and distilled BERT pre-trained models, available on huggingface.
 
 The training size heavily impacts a model's performance, so this notebook looks into the possibilities of performing data augmentation on an NLP dataset. Data augmentation techniques are used to generate additional samples. Data augmentation is already standard practice in computer vision projects 👌, but it can also be leveraged in multilingual NLP problems.
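
For reference, here is a minimal backtranslation sketch using MarianMT checkpoints from huggingface. The Helsinki-NLP model names and the Dutch↔English language pair are illustrative assumptions, not the notebook's exact setup; swap in the pair that matches your dataset.

```python
# A minimal backtranslation sketch, assuming the Helsinki-NLP MarianMT
# checkpoints on huggingface. The nl<->en language pair is illustrative.
from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model_name):
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return [tokenizer.decode(t, skip_special_tokens=True) for t in generated]

def backtranslate(texts, src="nl", pivot="en"):
    # Translate to a pivot language and back: the round trip yields
    # paraphrases of the originals that can serve as extra samples.
    forward = translate(texts, f"Helsinki-NLP/opus-mt-{src}-{pivot}")
    return translate(forward, f"Helsinki-NLP/opus-mt-{pivot}-{src}")

augmented = backtranslate(["Dit is een voorbeeldzin voor data augmentatie."])
print(augmented)
```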
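And a sketch of contextual word embedding insertions. This version uses the nlpaug library with a multilingual distilled BERT checkpoint, which is one common way to implement the technique; the library choice, model name, and parameter values are assumptions rather than the notebook's exact configuration.

```python
# A minimal sketch of contextual word embedding insertions, assuming the
# nlpaug library (pip install nlpaug) and a multilingual DistilBERT
# checkpoint -- the notebook may use a different model or settings.
import nlpaug.augmenter.word as naw

# action="insert" places mask tokens at random positions and lets the
# model predict contextually plausible words to fill them in.
aug = naw.ContextualWordEmbsAug(
    model_path="distilbert-base-multilingual-cased",
    action="insert",
    aug_p=0.1,  # fraction of tokens to augment (assumed value)
)

text = "Dit is een voorbeeldzin voor data augmentatie."
print(aug.augment(text))
```

Because the insertions come from a masked language model rather than a fixed synonym list, they tend to stay grammatical in whatever language the input is in, which is what makes this approach useful for multilingual datasets.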