Commit 298713f

Merge pull request #18 from ml6team/feature/nlp_data_aug
Fix typo
2 parents fa55813 + 8d36bb6 commit 298713f

1 file changed: +1 -1 lines changed

nlp/2021_06_18_data_augmentation/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Transformer-based Data Augmentation
 
 Ever struggled with having a limited non-English NLP dataset for a project? 🤯 Fear not, data augmentation to the rescue ⛑️
-In this week's tip, we look at backtranslation 🔀 and contextual word embedding insertions as data augmentation techniques for multilingual NLP. We'll be using the MariaMT and distilled BERT pre-trained models, available on huggingface.
+In this week's tip, we look at backtranslation 🔀 and contextual word embedding insertions as data augmentation techniques for multilingual NLP. We'll be using the MarianMT and distilled BERT pre-trained models, available on huggingface.
 
 The training size will impact the performace of a model heavily, this notebook looks into the possibilities of performing data augmentation on an NLP dataset. Data augmentation techniques are used to generate additional samples. Data augmentation is already standard practice in computer vision projects 👌, but can also be leveraged in multilingual NLP problems.
 