Information technology is increasingly used to spread false or misleading information. Because readers often cannot tell real articles from fake ones at first glance, our Natural Language Processing (NLP) model does the job for them!
When a user suspects that an online article is fake, they can pass the article URL to the program. The program uses web scraping to gather the article's headline and contents, then passes the text to a BERT-based NLP model, which predicts whether the article is real or fake and reports the percentage likelihood of that prediction.
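The two ends of that pipeline can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the names `ArticleParser`, `extract_article`, and `describe_prediction` are hypothetical, the headline is assumed to be the first `<h1>` and the body the `<p>` elements, and the model is assumed to emit two raw scores in the order (real, fake). Fetching the page from the URL and running the BERT model are left out.

```python
from html.parser import HTMLParser
import math


class ArticleParser(HTMLParser):
    """Collects the first <h1> as the headline and every <p> as body text.

    Assumption: real news pages vary widely; this simple rule is only a sketch.
    """

    def __init__(self):
        super().__init__()
        self._current = None   # tag currently being captured ("h1" or "p")
        self._buffer = []      # text fragments seen inside that tag
        self.headline = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        # Start capturing only when we are not already inside an <h1> or <p>.
        if tag in ("h1", "p") and self._current is None:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return  # closing an inline tag such as </em>; keep capturing
        # Join fragments and collapse runs of whitespace.
        text = " ".join("".join(self._buffer).split())
        if tag == "h1" and not self.headline:
            self.headline = text
        elif tag == "p" and text:
            self.paragraphs.append(text)
        self._current = None


def extract_article(html):
    """Return (headline, body) extracted from an HTML string."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return parser.headline, " ".join(parser.paragraphs)


def describe_prediction(scores, labels=("real", "fake")):
    """Turn raw class scores into (label, percent) using a softmax.

    Assumption: scores are ordered to match `labels`.
    """
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], round(100 * probs[best], 1)
```

In use, the headline and body returned by `extract_article` would be concatenated and tokenized for the model, and the model's scores passed to `describe_prediction` to produce the label and percentage shown to the user.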
The BERT model is based on this Kaggle Notebook, with slight variations. We train the model in Google Colab and download the trained weights file. A local Python script then reconstructs the model and loads the weights file so that it can make predictions on the input news article. Finally, the UI is updated with the model's outputs.
Our main challenges were team members' lack of sleep and a Flask backend that did not interface properly with the model. With encouragement and support, we finished the project on time despite the sleep deprivation, and we were able to diagnose and fix the Flask error after some Googling.
We are looking at building a mobile version or Chrome extension to make Fakeout more convenient and easier to use. We also want to expand our model to analyze audio and video content, extending fake news detection beyond textual media. Lastly, the model could be retrained on a larger and more diverse dataset to increase accuracy.