@@ -36,7 +36,7 @@ pip install tensor2tensor && t2t-trainer \
--generate_data \
--data_dir=~/t2t_data \
--output_dir=~/t2t_train/mnist \
- --problems=image_mnist \
+ --problem=image_mnist \
--model=shake_shake \
--hparams_set=shake_shake_quick \
--train_steps=1000 \
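For reference, the corrected quick-start one-liner reads roughly as below; the trailing `--eval_steps` value is an assumption about the line the hunk cuts off before, not part of this diff.

```bash
# Quick-start sketch: train shake-shake on MNIST using the renamed --problem flag.
pip install tensor2tensor && t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/mnist \
  --problem=image_mnist \
  --model=shake_shake \
  --hparams_set=shake_shake_quick \
  --train_steps=1000 \
  --eval_steps=100  # assumed continuation; not shown in the hunk
```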
@@ -78,13 +78,13 @@ to modify the hyperparameters if you run on a different setup.
### Image Classification

For image classification, we have a number of standard data-sets:
- * ImageNet (a large data-set): `--problems=image_imagenet`, or one
+ * ImageNet (a large data-set): `--problem=image_imagenet`, or one
of the re-scaled versions (`image_imagenet224`, `image_imagenet64`,
`image_imagenet32`)
- * CIFAR-10: `--problems=image_cifar10` (or
- `--problems=image_cifar10_plain` to turn off data augmentation)
- * CIFAR-100: `--problems=image_cifar100`
- * MNIST: `--problems=image_mnist`
+ * CIFAR-10: `--problem=image_cifar10` (or
+ `--problem=image_cifar10_plain` to turn off data augmentation)
+ * CIFAR-100: `--problem=image_cifar100`
+ * MNIST: `--problem=image_mnist`

For ImageNet, we suggest to use the ResNet or Xception, i.e.,
use `--model=resnet --hparams_set=resnet_50` or
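A sketch of an ImageNet run with the renamed flag, combining the problem and model names listed above (the output directory is a placeholder, not from this diff):

```bash
# Illustrative only: ResNet-50 on ImageNet with the singular --problem flag.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/imagenet_resnet \
  --problem=image_imagenet \
  --model=resnet \
  --hparams_set=resnet_50
```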
@@ -99,11 +99,11 @@ close to 97% accuracy on CIFAR-10.
### Language Modeling

For language modeling, we have these data-sets in T2T:
- * PTB (a small data-set): `--problems=languagemodel_ptb10k` for
- word-level modeling and `--problems=languagemodel_ptb_characters`
+ * PTB (a small data-set): `--problem=languagemodel_ptb10k` for
+ word-level modeling and `--problem=languagemodel_ptb_characters`
for character-level modeling.
- * LM1B (a billion-word corpus): `--problems=languagemodel_lm1b32k` for
- subword-level modeling and `--problems=languagemodel_lm1b_characters`
+ * LM1B (a billion-word corpus): `--problem=languagemodel_lm1b32k` for
+ subword-level modeling and `--problem=languagemodel_lm1b_characters`
for character-level modeling.

We suggest to start with `--model=transformer` on this task and use
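For example, a word-level PTB run using the renamed flag might look like this; the hparams set name and output path are illustrative assumptions, not taken from this diff:

```bash
# Illustrative: word-level PTB language modeling with the Transformer.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/languagemodel_ptb10k \
  --problem=languagemodel_ptb10k \
  --model=transformer \
  --hparams_set=transformer_small  # assumed hparams set name
```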
@@ -113,7 +113,7 @@ We suggest to start with `--model=transformer` on this task and use
### Sentiment Analysis

For the task of recognizing the sentiment of a sentence, use
- * the IMDB data-set: `--problems=sentiment_imdb`
+ * the IMDB data-set: `--problem=sentiment_imdb`

We suggest to use `--model=transformer_encoder` here and since it is
a small data-set, try `--hparams_set=transformer_tiny` and train for
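Putting the pieces of this hunk together, a minimal IMDB sentiment run could be sketched as (output path is a placeholder):

```bash
# Illustrative: IMDB sentiment classification with a tiny Transformer encoder.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/sentiment_imdb \
  --problem=sentiment_imdb \
  --model=transformer_encoder \
  --hparams_set=transformer_tiny \
  --train_steps=2000
```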
@@ -122,15 +122,15 @@ few steps (e.g., `--train_steps=2000`).
### Speech Recognition

For speech-to-text, we have these data-sets in T2T:
- * Librispeech (English speech to text): `--problems=librispeech` for
- the whole set and `--problems=librispeech_clean` for a smaller
+ * Librispeech (English speech to text): `--problem=librispeech` for
+ the whole set and `--problem=librispeech_clean` for a smaller
but nicely filtered part.

### Summarization

For summarizing longer text into shorter one we have these data-sets:
* CNN/DailyMail articles summarized into a few sentences:
- `--problems=summarize_cnn_dailymail32k`
+ `--problem=summarize_cnn_dailymail32k`

We suggest to use `--model=transformer` and
`--hparams_set=transformer_prepend` for this task.
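The summarization setup described above could be sketched as follows (output directory is a placeholder):

```bash
# Illustrative: CNN/DailyMail summarization with the prepend Transformer setting.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/summarize_cnn_dailymail \
  --problem=summarize_cnn_dailymail32k \
  --model=transformer \
  --hparams_set=transformer_prepend
```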
@@ -139,15 +139,15 @@ This yields good ROUGE scores.
### Translation

There are a number of translation data-sets in T2T:
- * English-German: `--problems=translate_ende_wmt32k`
- * English-French: `--problems=translate_enfr_wmt32k`
- * English-Czech: `--problems=translate_encs_wmt32k`
- * English-Chinese: `--problems=translate_enzh_wmt32k`
- * English-Vietnamese: `--problems=translate_envi_iwslt32k`
+ * English-German: `--problem=translate_ende_wmt32k`
+ * English-French: `--problem=translate_enfr_wmt32k`
+ * English-Czech: `--problem=translate_encs_wmt32k`
+ * English-Chinese: `--problem=translate_enzh_wmt32k`
+ * English-Vietnamese: `--problem=translate_envi_iwslt32k`

You can get translations in the other direction by appending `_rev` to
the problem name, e.g., for German-English use
- `--problems=translate_ende_wmt32k_rev`.
+ `--problem=translate_ende_wmt32k_rev`.

For all translation problems, we suggest to try the Transformer model:
`--model=transformer`. At first it is best to try the base setting,
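An English-German run with the renamed flag might be sketched as below; the hparams set name is an assumption based on the "base setting" the text mentions:

```bash
# Illustrative: English-German WMT translation; append _rev to the problem name for German-English.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/translate_ende_wmt32k \
  --problem=translate_ende_wmt32k \
  --model=transformer \
  --hparams_set=transformer_base  # assumed name of the base setting
```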
@@ -193,7 +193,7 @@ t2t-datagen \
# * If you run out of memory, add --hparams='batch_size=1024'.
t2t-trainer \
--data_dir=$DATA_DIR \
- --problems=$PROBLEM \
+ --problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR
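As the comment in this hunk notes, the batch size can be reduced when memory runs out; a sketch of the same invocation with that override (variable values are whatever was exported earlier in the walkthrough):

```bash
# Illustrative: identical training command with a smaller batch size to fit in memory.
t2t-trainer \
  --data_dir=$DATA_DIR \
  --problem=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --hparams='batch_size=1024'
```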
@@ -210,7 +210,7 @@ ALPHA=0.6
t2t-decoder \
--data_dir=$DATA_DIR \
- --problems=$PROBLEM \
+ --problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR \
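The hunk ends before the decode-specific flags; for context, the invocation typically continues roughly as below. The `--decode_*` flags and the `$BEAM_SIZE`/`$DECODE_FILE` variables are assumptions, not shown in this diff:

```bash
# Illustrative sketch of a full decode command; the decode flags below are assumed, not from this diff.
t2t-decoder \
  --data_dir=$DATA_DIR \
  --problem=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
  --decode_from_file=$DECODE_FILE
```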
@@ -325,7 +325,7 @@ and hyperparameter set functions can compose other hyperparameter set functions.
The **trainer** binary is the main entrypoint for training, evaluation, and
inference. Users can easily switch between problems, models, and hyperparameter
- sets by using the `--model`, `--problems`, and `--hparams_set` flags. Specific
+ sets by using the `--model`, `--problem`, and `--hparams_set` flags. Specific
hyperparameters can be overridden with the `--hparams` flag. `--schedule` and
related flags control local and distributed training/evaluation
([distributed training documentation](https://github.com/tensorflow/tensor2tensor/tree/master/docs/distributed_training.md)).
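As a concrete illustration of how these flags compose after the rename (all values are placeholders chosen for this example):

```bash
# Illustrative: switch problem/model/hparams set via flags and override one hyperparameter.
# --schedule and related flags (not shown here) select local vs. distributed training/evaluation.
t2t-trainer \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/example_run \
  --problem=translate_ende_wmt32k \
  --model=transformer \
  --hparams_set=transformer_base \
  --hparams='batch_size=1024'
```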