A web scraping project built with Scrapy that extracts property data from the Rightmove website. It collects detailed information on properties for sale in London, including addresses, property types, number of bedrooms, transaction types, and geographical coordinates.
The spider handles pagination, scraping up to the third page of search results. The collected data is structured and can be used for real estate market analysis, price forecasting, or other property-related applications.
- Data Extraction: Collects property data such as address, property type, number of bedrooms, transaction type, and geolocation.
- Pagination: Handles pagination, automatically navigating to the next page to scrape more data (up to the third page).
- JSON Parsing: Uses JSON parsing for dynamically loaded data on the page.
- Error Handling: Logs parsing errors and pages without properties.
Before running the project, ensure that you have the following dependencies installed:
- Python 3.6 or higher
- Scrapy (version 2.5.0 or higher)
Clone the project to your local machine using Git:
git clone https://github.com/marcellin-d/Rightmove-Property-Scraper.git
cd Rightmove-Property-Scraper
To manage dependencies, it is highly recommended to create a virtual environment:
python -m venv venv
Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On macOS/Linux:
source venv/bin/activate
Install the required dependencies via pip:
pip install -r requirements.txt
Ensure that your Scrapy project is set up correctly: the scrapy.cfg file and other necessary files should be present in the project directory.
To run the spider and collect property data, use the following command:
scrapy crawl rightmoveSpider
This will start the scraping process and save the results in CSV, JSON, or another format, depending on your configuration in settings.py.
- Pages: The spider collects data from the first to the third page of Rightmove search results.
- Output Format: By default, results are saved in JSON format.
You can modify these parameters as needed in the settings.py file.
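For reference, the output format can be controlled with Scrapy's standard `FEEDS` setting. The excerpt below is an illustrative sketch, not the project's actual settings.py; the bot name, user agent, and delay values are assumptions to adjust as needed.

```python
# settings.py (excerpt) -- illustrative values, not the project's real config
BOT_NAME = "rightmove_scraper"

# Identify the crawler and respect robots.txt.
USER_AGENT = "Mozilla/5.0 (compatible; rightmove-scraper)"
ROBOTSTXT_OBEY = True

# Throttle requests to avoid overloading the site.
DOWNLOAD_DELAY = 1.0

# Save results as JSON by default; change the key and "format"
# (e.g. "csv") to switch output formats.
FEEDS = {
    "results.json": {"format": "json", "encoding": "utf8", "overwrite": True},
}
```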
Here’s the basic structure of the Scrapy project:
rightmove-scraper/
├── rightmoveSpider.py # Main spider to scrape data
├── scrapy.cfg # Scrapy project configuration
├── items.py # Item definitions
├── pipelines.py # Pipelines for data processing
├── settings.py # Scrapy settings configuration
├── requirements.txt # List of dependencies
└── README.md # This README file
- rightmoveSpider.py: Contains the spider responsible for data extraction and pagination.
- items.py: Defines the fields for the extracted data (e.g., address, property type, etc.).
- settings.py: Contains the project configuration for Scrapy, including user-agent settings, pipelines, and other parameters.
For questions or suggestions, feel free to reach out:
- Name: Marcellin DJAMBO
- Email: [email protected]
- LinkedIn: My LinkedIn Profile