Web Scraping Tool

This Python script scrapes product data from a specific website and stores it in a CSV file. It uses the requests and BeautifulSoup libraries to fetch and parse pages, and Python's built-in csv module to write the output. The script extracts product names, details, SKUs, and image URLs from the website's HTML pages.

Prerequisites

Python 3.x
Required Python libraries: requests, BeautifulSoup (installed as the beautifulsoup4 package)

Usage

Make sure you have Python installed on your system.
Install the required Python libraries using pip:

   pip install requests beautifulsoup4

Clone or download the script to your local machine.
Run the script by executing the following command in your terminal, replacing script_name.py with the script's actual file name:

   python script_name.py

Follow the prompts to input the necessary information.

Features

get_search_results(url): Retrieves search results from the specified URL.
extract_product_links(url): Extracts product links from the HTML code of the search results page.
extract_product_data(url): Extracts product data from the HTML code of the product page.
count_duplicates(img_urls, name, Handle): Counts duplicate image URLs.
write_to_csv(products): Writes product data to a CSV file.
main(): Main function to execute the script.
read_lines_from_file(filename): Reads lines from a file and returns a list of strings.
letsgo(SPU): Starts the scraping process for a given SPU (Standard Product Unit).
write_to_csv_NotFound(SPU): Writes SPUs for which product links are not found to a separate CSV file.
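
The two extraction steps listed above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the functions here take an HTML string rather than a URL so the example runs offline (the real functions presumably fetch the page with requests first), and every CSS selector and dictionary key is a hypothetical placeholder that must be adapted to the target website.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_product_links(html, base_url):
    """Return absolute product URLs found in a search results page."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # "a.product-link" is a hypothetical selector, not taken from the script.
    for anchor in soup.select("a.product-link[href]"):
        url = urljoin(base_url, anchor["href"])
        if url not in links:  # skip repeats while keeping page order
            links.append(url)
    return links


def extract_product_data(html):
    """Return a dict of product fields parsed from a product page."""
    soup = BeautifulSoup(html, "html.parser")

    def text_of(selector):
        # Return the stripped text of the first match, or "" if none exists.
        node = soup.select_one(selector)
        return node.get_text(strip=True) if node else ""

    # All selectors below are assumptions made for illustration only.
    return {
        "name": text_of("h1.product-name"),
        "details": text_of("div.product-details"),
        "sku": text_of("span.sku"),
        "image_urls": [img["src"] for img in soup.select("img.product-image[src]")],
    }
```

Returning an empty string for a missing field, instead of raising an error, lets a run continue when a product page lacks one of the expected elements.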

Example

Run the script with an input file that lists the SPUs to scrape. The input file must contain one SPU per line.
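
As a rough sketch of how such an input file might be processed, the example below reads one SPU per line, looks each one up, and writes found products and missing SPUs to separate CSV files. It uses only the standard library. The helper names write_rows and run, the lookup parameter (a stand-in for letsgo(SPU)), the column names, and the output file names are all assumptions made for illustration. Skipping blank lines is also an assumption about the input handling.

```python
import csv


def read_lines_from_file(filename):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    with open(filename, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def write_rows(filename, header, rows):
    """Write a header row followed by data rows (hypothetical helper)."""
    with open(filename, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def run(spu_file, lookup, found_csv="products.csv", missing_csv="not_found.csv"):
    """Look up every SPU in spu_file and split results into two CSV files."""
    found, missing = [], []
    for spu in read_lines_from_file(spu_file):
        product = lookup(spu)  # stand-in for the script's letsgo(SPU)
        if product is None:
            missing.append([spu])
        else:
            found.append([spu, product["name"]])
    write_rows(found_csv, ["SPU", "Name"], found)
    write_rows(missing_csv, ["SPU"], missing)
    return len(found), len(missing)
```

Calling run("spus.txt", my_lookup) with a function that returns a product dict or None would produce both files and return the number of found and missing SPUs.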

For detailed usage instructions and customization options, refer to the comments within the script.
