This repository was archived on 16/12/2021 and has not been maintained since May 2020.
A programme written by Sailesh Patel (160034811) to scrape information from course programme specification PDFs, as part of the final-year project (FYP) "A Chatbot for Assisting University Admission Process", supervised by Dr Sylvia Wong at Aston University.
- Clone the repository
 - Install the required technologies listed above (the links are to their respective installation instructions)
 
Note: pip is not required, but it makes installing Tabula-Py, BeautifulSoup, and Requests easier.
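
Because pip is optional, it can help to confirm that the three libraries are importable before running the scraper. The snippet below is a minimal sketch and is not part of the repository: the helper name `missing_modules` is hypothetical, and it assumes the usual import names `tabula` (Tabula-Py), `bs4` (BeautifulSoup) and `requests`.

```python
import importlib.util
from typing import List


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for Tabula-Py, BeautifulSoup and Requests.
required = ["tabula", "bs4", "requests"]
missing = missing_modules(required)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```

Tabula-Py also relies on a Java runtime, which this check does not cover.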
- Please ensure that all the software requirements have been met before executing the program
- To execute the program, run the command `python3 programme-scraper.py`
- To run the PDF scraper
  - Type `P` and press `Enter`
  - Type the PDF file in without the `.pdf` extension and press `Enter`
    - `BScComputerScience` shows the PDF scraper working
    - `BScDigitalDegreeApprenticeship` shows the PDF scraper not working
- To run the web scraper
  - Type `W` and press `Enter`
  - Type `EAS` for the school and press `Enter`
  - Type the website you would like to scrape
    - Type `https://www2.aston.ac.uk/study/courses/computer-science-bsc` to show the web scraper working
    - Type `https://www2.aston.ac.uk/study/courses/chemistry-bsc` to show the web scraper fail to format the text inside the Entry Requirements & Fees for 2020
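
The prompt sequence above can be sketched as a small function that takes the typed answers in order and reports which scraper would run. This is an illustration only, not the code in `programme-scraper.py`: the function name `describe_run` and its messages are hypothetical, and it assumes the options are matched exactly as `P` and `W`.

```python
from typing import Iterator, List


def describe_run(answers: List[str]) -> str:
    """Summarise what a session would do, given the answers typed at each prompt.

    Hypothetical helper: it imitates the prompt order described above and
    does not call the real scrapers.
    """
    replies: Iterator[str] = iter(answers)
    mode = next(replies).strip()
    if mode == "P":
        # The PDF name is typed without its extension, so add it back here.
        name = next(replies).strip()
        return f"PDF scraper on {name}.pdf"
    if mode == "W":
        school = next(replies).strip()  # e.g. EAS
        url = next(replies).strip()
        return f"Web scraper on {url} (school {school})"
    return f"Unrecognised option: {mode}"


print(describe_run(["P", "BScComputerScience"]))
print(describe_run(["W", "EAS", "https://www2.aston.ac.uk/study/courses/computer-science-bsc"]))
```

Passing the answers as a list keeps the sketch free of `input()` calls, so the flow can be checked without an interactive terminal.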
All Rights Reserved