
Commit 71dd0ca

Update README.md

1 parent 6e61fc6

1 file changed: README.md (+7, -7 lines)
@@ -360,15 +360,15 @@ An important distinction to make here is that I'm still supplying the ground-tru
 
 Since I'm teacher-forcing during validation, the BLEU score measured above on the resulting captions _does not_ reflect real performance. In fact, the BLEU score is a metric designed for comparing naturally generated captions to ground-truth captions of differing length. Once batched inference is implemented, i.e. no Teacher Forcing, early stopping with the BLEU score will be truly 'proper'.
 
-With this in mind, I used [`eval.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning/blob/master/eval.py) to compute the correct BLEU-4 scores of this model checkpoint on the validation set _without_ Teacher Forcing, at different beam sizes –
+With this in mind, I used [`eval.py`](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning/blob/master/eval.py) to compute the correct BLEU-4 scores of this model checkpoint on the validation and test sets _without_ Teacher Forcing, at different beam sizes –
 
-Beam Size | Validation BLEU-4
-:---: | :---:
-1 | 29.98
-3 | 32.95
-5 | 33.17
+Beam Size | Validation BLEU-4 | Test BLEU-4
+:---: | :---: | :---:
+1 | 29.98 | 30.28
+3 | 32.95 | 33.06
+5 | 33.17 | 33.29
 
-This is higher than the result in the paper, and could be because of how our BLEU calculators are parameterized, the fact that I used a ResNet encoder, and actually fine-tuned the encoder – even if just a little.
+The test score is higher than the result in the paper, which could be because of how our BLEU calculators are parameterized, the fact that I used a ResNet encoder, and the fact that I actually fine-tuned the encoder – even if just a little.
 
 Also, remember – when fine-tuning during Transfer Learning, it's always better to use a learning rate considerably smaller than what was originally used to train the borrowed model. This is because the model is already quite optimized, and we don't want to change anything too quickly. I used `Adam()` for the Encoder as well, but with a learning rate of `1e-4`, which is a tenth of the default value for this optimizer.
 
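As context for the evaluation described in the diff above: `eval.py` decodes each image with beam search (so no Teacher Forcing) and scores the generated captions against _all_ of the image's ground-truth captions with a corpus-level BLEU-4. Below is a minimal sketch of that loop, assuming a hypothetical `model.beam_search()` helper and a loader that yields one image with its reference captions; the repo's actual script differs in detail.

```python
from nltk.translate.bleu_score import corpus_bleu

def evaluate_bleu4(model, loader, beam_size, word_map):
    """Generate captions by beam search and score them with corpus BLEU-4."""
    special = {word_map['<start>'], word_map['<end>'], word_map['<pad>']}
    references, hypotheses = [], []
    for image, all_caps in loader:
        # One generated caption per image, decoded without Teacher Forcing.
        seq = model.beam_search(image, beam_size)  # hypothetical helper
        hypotheses.append([w for w in seq if w not in special])
        # Every ground-truth caption for this image serves as a reference.
        references.append([[w for w in c if w not in special] for c in all_caps])
    # corpus_bleu's default weights are uniform 4-gram weights, i.e. BLEU-4.
    return corpus_bleu(references, hypotheses)
```

As the table above shows, a larger beam helps, but with diminishing returns between beam sizes 3 and 5.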

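For the fine-tuning note above, here is a minimal PyTorch sketch of the two-optimizer setup. The `1e-4` encoder rate is stated in the text; the ResNet variant and the decoder's `4e-4` rate are illustrative assumptions.

```python
import torch
import torchvision

# Borrowed, pretrained encoder (Transfer Learning); the decoder here is a
# stand-in for the attention LSTM and is illustrative only.
encoder = torchvision.models.resnet101(pretrained=True)
decoder = torch.nn.LSTMCell(input_size=2048, hidden_size=512)

# Fine-tune the encoder gently: 1e-4 is a tenth of Adam's 1e-3 default,
# because the pretrained weights are already close to a good optimum.
encoder_optimizer = torch.optim.Adam(
    filter(lambda p: p.requires_grad, encoder.parameters()), lr=1e-4)

# The decoder is trained from scratch, so it can take a larger step size.
decoder_optimizer = torch.optim.Adam(decoder.parameters(), lr=4e-4)
```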
Comments (0)