CS project for Zhuo Leng, Weijia Li, Xinzhu Sun, and Xingyun Wu
- module.py is our main script
- review1.py is one of our modules that contains functions to crawl and analyse reviews
- crawling.py is another module to crawl all product information and generate csvs
Before running:
- install pip (in case you don't have it)
- command: [sudo] pip install textblob
- command: python -m textblob.download_corpora
To run: python freq.py folder/
Please note: if there are more words that you want the analysis to ignore, simply add them to ignorelist at line 40, in uppercase like the existing entries.
If you only want the frequency of a word in the current document (that is, the value of TF instead of TF * IDF), go to line 21 and replace it with the statement "return tf(word, doc) * 1.0"
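For readers unfamiliar with the two scores, here is a minimal sketch of how TF and TF * IDF differ. It is not the code in freq.py: the names IGNORE, tokenize, idf and tfidf are hypothetical, documents are plain word lists rather than textblob objects, and the logarithmic IDF with a +1 in the denominator is one common variant that we assume here for illustration. It is written for Python 3, where "/" is already float division.

```python
import math

# Hypothetical stand-in for the ignore list; entries are uppercase.
IGNORE = {"THE", "A", "IS"}

def tokenize(text):
    """Split on whitespace and drop words whose uppercase form is ignored."""
    return [w for w in text.lower().split() if w.upper() not in IGNORE]

def tf(word, doc):
    """Term frequency: the share of the words in doc that equal word."""
    return doc.count(word) / len(doc) if doc else 0.0

def idf(word, docs):
    """Inverse document frequency (assumed variant: log with +1 smoothing)."""
    containing = sum(1 for d in docs if word in d)
    return math.log(len(docs) / (1 + containing))

def tfidf(word, doc, docs):
    """TF weighted by how rare the word is across all documents."""
    return tf(word, doc) * idf(word, docs)

texts = ["the cat is a cat", "the dog", "a bird is a bird", "the fish"]
docs = [tokenize(t) for t in texts]

tf_only = tf("cat", docs[0])          # frequency within the first document
weighted = tfidf("cat", docs[0], docs)  # same frequency, scaled by IDF
```

Returning only tf(word, doc) gives the raw share of a word within one document, while the TF * IDF product lowers the score of words that appear in many documents.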
To run: python similarity.py file1.txt file2.txt
To run: python simliarity_2.py
Before running: go to the folder named "user_interface"
To run: python3 manage.py runserver
Open your web browser, then go to http://127.0.0.1:8000/