In this repository you can run experiments with all of the methods described in the paper.
This repository requires Python 3.9.
You can create a virtual environment and install the dependencies listed in requirements.txt.
To use RuBert, you need to install torch and torchvision in versions that match your GPU and CUDA setup.
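Before running the RuBert experiments, it can help to confirm that the interpreter and the torch installation match the requirements above. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and it only reports what it finds without changing anything.

```python
import sys


def check_environment():
    """Report the Python version and, if installed, the torch/CUDA setup."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "python_is_3_9": sys.version_info[:2] == (3, 9),
        "torch": None,            # stays None when torch is not installed
        "cuda_available": False,  # stays False when torch is not installed
    }
    try:
        import torch  # only needed for the RuBert experiments
    except ImportError:
        return info
    info["torch"] = str(torch.__version__)
    info["cuda_available"] = bool(torch.cuda.is_available())
    return info


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If cuda_available is reported as False on a machine with a GPU, the installed torch build probably does not match the local CUDA version.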
The synthetic training dataset and the benchmark dataset are downloaded automatically when you run main.py.
All data is stored in the ./data folder, which is also created automatically.
You can run experiments with XGBoost, Random Forest, Logistic Regression, N-Gram, and RuBert with the following command:
python main.py
By default, it runs experiments with all methods except RuBert, using the TF-IDF feature extractor.
- You can select the models for the experiments by changing the models list in main.py.
- You can also select the feature extractor for the experiments by changing the value of final_feature_extractor in main.py.
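As an illustration of the two settings above, the edited lines in main.py might look roughly like the sketch below. Only the variable names models and final_feature_extractor come from this README. The string values are assumptions made for the example, so check main.py for the identifiers it actually accepts.

```python
# Hypothetical model identifiers: the exact strings main.py expects are not
# documented here, so verify them in the file before copying these values.
models = [
    "xgboost",
    "random_forest",
    "logistic_regression",
    "ngram",
    # "rubert",  # enable only after installing torch and torchvision
]

# Hypothetical value standing in for the default TF-IDF feature extractor.
final_feature_extractor = "tfidf"
```

Leaving RuBert commented out mirrors the default behaviour described above, where every method except RuBert is run.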