This dataset contains Direct Assessment (DA) scores for English-Maltese and Spanish-Basque machine translation, collected in a crowd-based translation evaluation campaign.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
The source sentences in these datasets were compiled from the parallel corpora listed below for English-Maltese and Spanish-Basque. Refer to each corpus's source and license for details.
| Corpus | Source | Licensing |
|---|---|---|
| FLORES-200 | GitHub | CC-BY-SA 4.0 |
| CrowS-Pairs | GitHub | CC-BY-SA 4.0 |
| EUbookshop | OPUS | N/A |
| ELITR-ECA | OPUS | CC-BY-4.0 |
| TED2020 | OPUS | See TED Talks Usage Policy |
| Elhuyar | OPUS | Creative Commons |
| OpenSubtitles | OPUS | N/A |
| EhuHac | OPUS | N/A |
| QED | OPUS | Public for research purposes only |
| WikiMatrix | OPUS | CC-BY-SA 4.0 |
| NeuLab-TedTalks | OPUS | See TED Talks Usage Policy |
If you use this data in your work, please cite the following paper:
Júlia Falcão, Claudia Borg, Nora Aranberri, and Kurt Abela. 2024. COMET for Low-Resource Machine Translation Evaluation: A Case Study of English-Maltese and Spanish-Basque. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 3553–3565, Torino, Italia. ELRA and ICCL.