wuzzuf_scraping

Python project with Jupyter notebook

The site's search result pages will be scraped first. The information found there is simple and can be extracted using Beautiful Soup.
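As a rough illustration of this first step, the sketch below parses one search result card with Beautiful Soup. The HTML snippet, the CSS class names (`job-card`, `job-title`, `company`, `location`, `contract`, `skill`), the `parse_search_results` helper, and the `https://wuzzuf.net` base URL are all assumptions made for the example; the real page markup is not described here and will differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup: the real class names on the site are not known here.
SAMPLE_HTML = """
<div class="job-card">
  <h2 class="job-title"><a href="/jobs/p/123">Data Analyst</a></h2>
  <a class="company">Acme</a>
  <span class="location">Cairo, Egypt</span>
  <span class="contract">Full Time</span>
  <a class="skill">SQL</a><a class="skill">Python</a>
</div>
"""


def parse_search_results(html, base_url="https://wuzzuf.net"):
    """Return one dict per job card found in a search result page."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.job-card"):
        title_link = card.select_one("h2.job-title a")
        rows.append({
            "title": title_link.get_text(strip=True),
            "company": card.select_one(".company").get_text(strip=True),
            # Turn the relative href into an absolute post link.
            "link": urljoin(base_url, title_link["href"]),
            "location": card.select_one(".location").get_text(strip=True),
            "contract_type": card.select_one(".contract").get_text(strip=True),
            "skills": [s.get_text(strip=True) for s in card.select(".skill")],
        })
    return rows


rows = parse_search_results(SAMPLE_HTML)
print(rows[0]["title"], rows[0]["link"])
```

In practice the HTML would come from an HTTP response rather than a string literal, and the selectors would need to be adjusted after inspecting the live page.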

Then, for each search result, the link extracted in the previous step will be used to scrape further data (such as salary) from inside the post page. This data can't be extracted using Beautiful Soup alone because it is filled in by JavaScript, so Selenium will be used to run an automated browser, load the page, and then scrape the data.

All the extracted data will be grouped in a table and saved in a local CSV file.
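A minimal sketch of the saving step is shown below, using only the standard library `csv` module. The column names, the `save_jobs` helper, and the default file name `wuzzuf_jobs.csv` are hypothetical; the notebook may well build the table with a different tool.

```python
import csv

# Assumed column names, one per extracted field.
FIELDS = ["title", "company", "link", "location",
          "contract_type", "skills", "salary"]


def save_jobs(rows, path="wuzzuf_jobs.csv"):
    """Write a list of job dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for row in rows:
            flat = dict(row)
            # A list of skills is joined into a single cell.
            flat["skills"] = "; ".join(row.get("skills", []))
            writer.writerow(flat)
```

Fields missing from a row are written as empty cells, which suits posts where no salary could be scraped.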

  • Extracted info from the search result pages:
    • Job title
    • Company
    • Post link
    • Job location
    • Contract type
    • Skills required
  • Extracted info from inside each post page:
    • Salary

Note: To run Selenium with Firefox, the Firefox driver executable (GeckoDriver) is needed.