Cyl200215/DSC180A-FinalProject

DSC180A-FinalProject

Multiclass-Classification.ipynb contains the code to preprocess the datasets and train the multiclass classifier. The notebook includes all the classification models I built based on my factuality factors.

To run Multiclass-Classification.ipynb, first download the following NLTK resources:

import nltk
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('averaged_perceptron_tagger')

DSC180-LiarHackthon

Files

  1. Liar Classification.ipynb: code to preprocess the datasets and train the multiclass classifier for the Liar Hackathon.
  2. politifact_plus_data.csv: data newly scraped from Politifact.
  3. train2.tsv, test2.tsv, val2.tsv: data originally from the Liar-Plus dataset.

Dataset

The politifact_plus_data dataset contains around 890 newly scraped records from Politifact.com. Each record has the label, statement, and justification from the TRUTH-O-METER on Politifact. The justification is scraped from the summary of the linked fact-check article.

The files test2.tsv, train2.tsv, and val2.tsv come from the Liar-Plus dataset.

A total of 11129 records were used to train the 6-label classification model.
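Because the model predicts one of 6 labels, the text labels have to be turned into integer class ids before training. The snippet below is a minimal sketch of that step, not the notebook's actual code. It assumes the six classes are the standard LIAR labels and that the scraped Politifact ratings use display names such as "Pants on Fire" and "Mostly False"; the names LIAR_LABELS, ALIASES, and encode_label are hypothetical.

```python
# Assumption: the six classes are the standard LIAR labels, ordered from least to most true.
LIAR_LABELS = ["pants-fire", "false", "barely-true", "half-true", "mostly-true", "true"]
LABEL_TO_ID = {label: i for i, label in enumerate(LIAR_LABELS)}

# Assumption: scraped ratings use Politifact display names, which differ
# from the LIAR spellings for two of the classes.
ALIASES = {"pants-on-fire": "pants-fire", "mostly-false": "barely-true"}


def encode_label(raw: str) -> int:
    """Normalize a raw rating string and return its integer class id."""
    key = raw.strip().rstrip("!").lower().replace(" ", "-")
    key = ALIASES.get(key, key)
    return LABEL_TO_ID[key]  # raises KeyError on an unknown label
```

Raising KeyError on an unknown rating is deliberate: it surfaces records that fall outside the six classes (for example, flip-flop ratings) instead of silently mislabeling them.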

Model

The model is based on the Triple Branch BERT Siamese Network.

Instead of using three BERT branches, I connected only two to speed up processing. One branch takes the statement column and the other takes the justification column; each input is tokenized, and the two branches together predict the label.
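The two-branch layout can be sketched in PyTorch as follows. This is an illustration under assumptions, not the notebook's implementation: MeanPoolEncoder is a hypothetical stand-in for a BERT branch so the example runs without downloading weights, and combining the branches by concatenation followed by a single linear layer is an assumed design. Whether the two branches share weights is not stated in this README; passing the same encoder object for both arguments would give a weight-sharing variant.

```python
import torch
from torch import nn


class MeanPoolEncoder(nn.Module):
    """Hypothetical stand-in for a BERT branch: embeds token ids and mean-pools them."""

    def __init__(self, vocab_size: int, hidden_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size, padding_idx=0)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        mask = (input_ids != 0).unsqueeze(-1).float()      # (batch, seq, 1); id 0 is padding
        summed = (self.embed(input_ids) * mask).sum(dim=1)  # (batch, hidden)
        return summed / mask.sum(dim=1).clamp(min=1.0)      # average over real tokens only


class TwoBranchClassifier(nn.Module):
    """Encodes statement and justification separately, then classifies the pair."""

    def __init__(self, statement_encoder: nn.Module, justification_encoder: nn.Module,
                 hidden_size: int, num_labels: int = 6):
        super().__init__()
        self.statement_encoder = statement_encoder
        self.justification_encoder = justification_encoder
        # Assumed fusion: concatenate both branch outputs and apply one linear layer.
        self.classifier = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, statement_ids: torch.Tensor,
                justification_ids: torch.Tensor) -> torch.Tensor:
        s = self.statement_encoder(statement_ids)
        j = self.justification_encoder(justification_ids)
        return self.classifier(torch.cat([s, j], dim=-1))   # (batch, num_labels) logits


model = TwoBranchClassifier(MeanPoolEncoder(50, 8), MeanPoolEncoder(50, 8), hidden_size=8)
logits = model(torch.randint(1, 50, (4, 12)), torch.randint(1, 50, (4, 20)))
```

Here `logits` has one row per example and one column per label, so it can be passed directly to `nn.CrossEntropyLoss` with integer class ids as targets.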
