Project for the Human Languages Technologies course @ University of Pisa
Authors: Alberto Marinelli, Valerio Mariani
Pretrained language models have become the standard approach for tackling a wide range of Natural Language Processing tasks. For this reason we decided to experiment with Transformers inside a Siamese network, both to solve the task at hand and to better understand how these models work. Specifically, given the extensive pre-training of the available language models and the small size of our dataset, we chose to place a pre-trained BERT model and its variant RoBERTa inside a Siamese network.
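As a rough illustration of the idea (a minimal sketch, not the exact code from the notebook), a Siamese bi-encoder passes both inputs through the same pretrained encoder and compares the pooled sentence embeddings. The checkpoint name, mean pooling, and cosine similarity below are assumptions for the sketch:

```python
import torch
from transformers import AutoModel, AutoTokenizer

class SiameseEncoder(torch.nn.Module):
    """Both inputs go through the *same* pretrained encoder (shared weights)."""

    def __init__(self, model_name="bert-base-uncased"):  # assumed checkpoint
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)

    @staticmethod
    def mean_pool(hidden_states, attention_mask):
        # Average the token embeddings, ignoring padding tokens.
        mask = attention_mask.unsqueeze(-1).float()
        return (hidden_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

    def forward(self, batch_a, batch_b):
        emb_a = self.mean_pool(self.encoder(**batch_a).last_hidden_state,
                               batch_a["attention_mask"])
        emb_b = self.mean_pool(self.encoder(**batch_b).last_hidden_state,
                               batch_b["attention_mask"])
        # Cosine similarity between the two sentence embeddings.
        return torch.nn.functional.cosine_similarity(emb_a, emb_b)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = SiameseEncoder()
args = tokenizer(["School uniforms restrict self-expression."],
                 return_tensors="pt", padding=True, truncation=True)
kps = tokenizer(["Uniforms limit personal freedom."],
                return_tensors="pt", padding=True, truncation=True)
print(model(args, kps))  # similarity score in [-1, 1]
```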
In particular, this architecture was used to tackle Task 1 of IBM's 2021 shared task on Key Point Analysis. In this task we are given a set of debatable topics, a set of key points, and a set of arguments supporting or contesting each topic; we are then asked to match each argument to the key point(s) it refers to within its topic.
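At inference time the matching reduces to ranking the candidate key points of an argument by the similarity of their embeddings. The snippet below sketches this with an off-the-shelf bi-encoder ("all-MiniLM-L6-v2" is a placeholder, not the fine-tuned model) and made-up sentences in the spirit of the task data:

```python
from sentence_transformers import SentenceTransformer, util

# Off-the-shelf bi-encoder used only for illustration (not the fine-tuned model).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Made-up argument and candidate key points, not taken from the actual dataset.
argument = "School uniforms limit students' freedom of expression."
key_points = [
    "School uniforms reduce bullying.",
    "School uniforms restrict self-expression.",
    "Uniforms are an extra cost for families.",
]

arg_emb = model.encode(argument, convert_to_tensor=True)
kp_embs = model.encode(key_points, convert_to_tensor=True)
scores = util.cos_sim(arg_emb, kp_embs)[0]   # one score per candidate key point
best = int(scores.argmax())
print(f"Best match: {key_points[best]!r} (score={scores[best]:.3f})")
```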
All the code for the experiments is provided in a dedicated notebook that can be run on Google Colab. To speed up the computation required for training and inference, it is suggested to switch the runtime type to GPU.
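Before launching the notebook cells, a quick check that the GPU runtime is actually active can be done with plain PyTorch (nothing project-specific):

```python
import torch

# If the Colab runtime type is set to GPU, a CUDA device should be visible here.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
if device.type == "cuda":
    print("GPU:", torch.cuda.get_device_name(0))
```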
Despite the relatively small size of the dataset we were given - especially when compared to the corpora these models were pre-trained on - both SBERT and Siamese-RoBERTa generalized well from the training set to the development (validation) set in their ability to select the correct key point for an argument among a set of candidates. Using the metrics provided for the task on the validation set, we obtained ~75% Mean Average Precision with SBERT and ~80% Mean Average Precision with Siamese-RoBERTa.
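For reference, Mean Average Precision can be computed roughly as sketched below, ranking each argument's candidate key points by predicted score; this is a simplified illustration, not the official evaluation script of the shared task:

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one argument: rank candidate key points by score and average the
    precision at each rank where a true key point appears."""
    order = np.argsort(scores)[::-1]
    labels = np.asarray(labels)[order]
    if labels.sum() == 0:
        return 0.0
    precisions = np.cumsum(labels) / (np.arange(len(labels)) + 1)
    return float((precisions * labels).sum() / labels.sum())

def mean_average_precision(per_argument):
    """per_argument: list of (scores, labels) pairs, one entry per argument."""
    return float(np.mean([average_precision(s, l) for s, l in per_argument]))

# Toy example with made-up scores and gold labels -> 0.75
print(mean_average_precision([([0.9, 0.2, 0.4], [1, 0, 0]),
                              ([0.1, 0.8, 0.3], [0, 0, 1])]))
```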