The purpose of this application is to provide a platform for students to find potential advisors and for faculty members to explore research trends.
It targets students and faculty at universities, aiming to help students identify the most suitable professors based on research interests and publications. Additionally, faculty can track research trends and collaboration opportunities.
The objective is to provide an easy-to-use interface for querying academic data, viewing research trends, and managing favorite professors.
- Clone the repository from https://github.com/NICKLIN13/Academic-Research-Explorer.git
- Create the `favorite_faculty` table in MySQL
  - After cloning, in your local MySQL database, add a new table named `favorite_faculty`, which is used by widget 5 and widget 6.
  - This table is not included in the original dataset. Run the following SQL query to create it:
    `CREATE TABLE favorite_faculty ( faculty_name VARCHAR(255) NOT NULL, university_name VARCHAR(255), related_keyword TEXT, PRIMARY KEY (faculty_name, university_name) );`
- Create the dataset in MongoDB
  - The MongoDB `academicworld` database did not include `publication_keyword` or `keywords`, so we took the spreadsheet data we already had, wrote scripts to convert the CSV into JSON, and loaded the JSON into the `academicworld` database.
- Create a virtual environment
- Install dependencies
  - Run `pip install -r requirements.txt` to install the necessary libraries listed in the requirements.txt file.
- Run the application
  - On macOS, run `python3 backend.py` first, then run `python3 frontend.py`.
  - On Windows, run `python backend.py` first, then run `python frontend.py`.
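The CSV-to-JSON conversion mentioned in the MongoDB step could look roughly like the sketch below. It is not the project's actual script: the function name `csv_to_json_records` and the sample columns (`id`, `name`) are assumptions made for illustration, and the real column names in `publication_keyword.csv` and `keywords.csv` may differ.

```python
import csv
import io
import json


def csv_to_json_records(csv_text):
    """Parse CSV text with a header row into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample data; the real CSV columns are not shown in this README.
sample_csv = "id,name\n1,machine learning\n2,databases\n"

records = csv_to_json_records(sample_csv)

# Serialize as a JSON array, a shape MongoDB import tools can read.
json_text = json.dumps(records, indent=2)
print(json_text)
```

Note that `csv.DictReader` returns every value as a string, so numeric columns would need an explicit conversion before loading. A JSON array file produced this way can be imported with `mongoimport` using its `--jsonArray` option.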
- Users can start by viewing the "Top 10 U.S. universities with the most research publications" widget, shown in a bar chart to highlight research-oriented institutions.
- Then, in the "Top 10 Publications" widget, users can search for a university to see its 10 most cited publications, showing which works are most valued in the academic community.
- In the "Popular Research Keywords" widget, users can enter a year range to see the most cited keywords in that period, helping identify trending research areas.
- In the "Professors and Publications" widget, users select a keyword to see related professors and their publications, revealing connections in that field.
- In the "Search Faculty" widget, users can search for a professor by name and view their university and their three most related research keywords. Clicking "Add to Favorites" adds the professor to the "favorite_faculty" table in the MySQL database.
- The "Favorite Faculty" widget shows a list of favorite professors. Users can click "Clear" to remove all entries from the database. The data will remain until the "Clear" button is pressed, even if the page is accidentally refreshed.
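The "Add to Favorites" and "Clear" behavior described above can be sketched as follows. This is a minimal illustration, not the project's backend code: it uses an in-memory SQLite database as a stand-in for MySQL so it runs without a server, and the helper names `add_favorite`, `clear_favorites`, and `count_favorites` as well as the sample professor are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the MySQL database used by the app.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE favorite_faculty (
        faculty_name VARCHAR(255) NOT NULL,
        university_name VARCHAR(255),
        related_keyword TEXT,
        PRIMARY KEY (faculty_name, university_name)
    )
    """
)


def add_favorite(conn, faculty_name, university_name, related_keyword):
    """Insert a favorite; return False if the composite key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO favorite_faculty VALUES (?, ?, ?)",
                (faculty_name, university_name, related_keyword),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def clear_favorites(conn):
    """Remove every row, mirroring the "Clear" button."""
    with conn:
        conn.execute("DELETE FROM favorite_faculty")


def count_favorites(conn):
    return conn.execute("SELECT COUNT(*) FROM favorite_faculty").fetchone()[0]


# Hypothetical sample values.
first = add_favorite(conn, "Jane Doe", "Example University", "databases")
duplicate = add_favorite(conn, "Jane Doe", "Example University", "databases")
count_after_adds = count_favorites(conn)
clear_favorites(conn)
count_after_clear = count_favorites(conn)
```

Because the rows live in the database rather than in page state, they survive a browser refresh until they are explicitly deleted. With a MySQL driver the parameter placeholder is typically `%s` rather than `?`.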
- The application consists of a Dash frontend and three separate backend databases: MySQL, MongoDB, and Neo4j. These databases are accessed through REST APIs.
- User interacts with a widget → Dash frontend sends a request → Backend queries the appropriate database → Response is returned → Data is displayed on the frontend.
- Each widget is connected to a specific backend depending on the type of query or update it performs.
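The widget-to-backend routing described above can be sketched as a simple lookup. The mapping itself follows the task breakdown in this README (widgets 1, 2, 5, and 6 use MySQL, widget 3 uses MongoDB, widget 4 uses Neo4j); the names `WIDGET_BACKEND`, `backend_for`, and `handle_widget_request` are hypothetical, and the real app routes requests through REST endpoints rather than a dictionary.

```python
# Which database each widget talks to, per the task breakdown in this README.
WIDGET_BACKEND = {
    1: "MySQL",    # Top 10 universities by publications
    2: "MySQL",    # Top 10 publications for a university
    3: "MongoDB",  # Popular research keywords
    4: "Neo4j",    # Professors and publications by keyword
    5: "MySQL",    # Search faculty / add to favorites
    6: "MySQL",    # Favorite faculty list
}


def backend_for(widget_id):
    """Return the database name that serves the given widget."""
    if widget_id not in WIDGET_BACKEND:
        raise ValueError(f"unknown widget: {widget_id}")
    return WIDGET_BACKEND[widget_id]


def handle_widget_request(widget_id, params):
    """Simulate the flow: pick a backend, then return a response payload.

    A real handler would run the query against the chosen database here.
    """
    return {"widget": widget_id, "backend": backend_for(widget_id), "params": params}


response = handle_widget_request(3, {"start_year": 2010, "end_year": 2020})
print(response)
```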
- Six different interactive widgets for users to query and update the database.
- Present the data clearly using bar charts, search bars, dropdown menus, buttons, and tables.
- The overall color scheme uses UIUC's ILI-orange and ILI-blue for a clear and distinct presentation.
We implemented the app using Dash and Flask. The frontend was built with Dash components such as `dcc`, `html`, and `dash_table`, and the bar chart was rendered with Plotly Express.
For HTTP communication, we used Flask along with Flask-CORS to handle cross-origin requests, and the `requests` library to call the REST APIs. We used Pandas to process tabular data and integrate results from different databases. The backend connects to MySQL, MongoDB, and Neo4j and exposes them through RESTful APIs.
- MySQL: Use constraints such as PRIMARY KEY to manage data integrity. For example, in `favorite_faculty`, a composite primary key `(faculty_name, university_name)` ensures no duplicate favorites.
- MongoDB: Use the aggregation pipeline to process data in stages. For example, in `get_top_keywords`, we use stages like `$match`, `$unwind`, `$group`, `$sort`, and `$limit` to filter by year, expand arrays, group by keyword, calculate metrics, and return the most cited keywords.
- Neo4j: Use indexes to improve query performance on frequently searched properties. For example, `CREATE INDEX IF NOT EXISTS FOR (k:KEYWORD) ON (k.name)` speeds up keyword-based searches across the graph database's nodes and relationships.
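The MongoDB pipeline described above might be assembled along the following lines. This is a sketch under assumptions, not the code of `get_top_keywords`: the field names `year`, `keywords`, `keywords.name`, and `numCitations`, the output fields, and the helper name `build_top_keywords_pipeline` are all hypothetical, since the collection schema is not shown in this README.

```python
def build_top_keywords_pipeline(start_year, end_year, limit=10):
    """Build an aggregation pipeline for the most cited keywords in a year range.

    All field names below are assumptions about the document schema.
    """
    return [
        # Keep only publications inside the requested year range.
        {"$match": {"year": {"$gte": start_year, "$lte": end_year}}},
        # Emit one document per element of the keywords array.
        {"$unwind": "$keywords"},
        # Group by keyword name and accumulate citation metrics.
        {
            "$group": {
                "_id": "$keywords.name",
                "total_citations": {"$sum": "$numCitations"},
                "publication_count": {"$sum": 1},
            }
        },
        # Most cited keywords first.
        {"$sort": {"total_citations": -1}},
        # Return only the top results.
        {"$limit": limit},
    ]


pipeline = build_top_keywords_pipeline(2010, 2020, limit=5)
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

With PyMongo, a list like this is passed to a collection's `aggregate()` method. Placing `$match` first keeps the later stages from processing publications outside the year range.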
The app uses RESTful endpoints to interact with all three databases in a unified way.
- Tasks done:
  - Implemented widget 3 and completed MongoDB related features. Wrote Python scripts to convert `publication_keyword.csv` and `keywords.csv` into JSON files, then loaded the JSON files into the `academicworld` database.
  - Implemented widget 4 and completed Neo4j related features.
  - Recorded the video demo and worked on README.
- Time spent: 20 hours
- Tasks done:
  - Designed the overall app architecture and defined widget functionalities.
  - Implemented widgets 1, 2, 5, and 6, and completed MySQL related features.
  - Integrated project code and worked on README.
- Time spent: 20 hours