Web Crawler Project

Project Description

This web crawler is designed to navigate and retrieve data from the web efficiently. It starts from a specified seed URL and collects a list of all reachable URLs. Key features include asynchronous processing, a configurable crawl depth, and caching options.

Features

  • Seed URL Processing: Begins crawling from a user-specified seed URL.
  • Asynchronous Processing: Fetches pages concurrently to improve throughput.
  • Customization Options:
    • Depth Specification: Lets users set the maximum crawl depth.
    • Caching System: Stores crawled URLs so they can be retrieved later.
    • Crawl Mode: Choose between reusing cached data and starting a fresh crawl (see the sketch below).
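
The README does not include the implementation, but the features above suggest a small asyncio-based crawler. The sketch below is a minimal illustration only, assuming Python with the third-party aiohttp library; the names (`crawl`, `fetch`, `CACHE_FILE`) and the `max_depth` / `use_cache` parameters are hypothetical and not taken from this project.

```python
# Hypothetical sketch of the crawl loop; asyncio + aiohttp are assumed,
# not confirmed by the repository.
import asyncio
import json
import re
from pathlib import Path
from urllib.parse import urljoin, urldefrag

import aiohttp

LINK_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)
CACHE_FILE = Path("crawl_cache.json")  # assumed cache location


async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    """Download one page, returning an empty string on any error."""
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
            if resp.status == 200 and "text/html" in resp.headers.get("Content-Type", ""):
                return await resp.text()
    except (aiohttp.ClientError, asyncio.TimeoutError):
        pass
    return ""


async def crawl(seed: str, max_depth: int = 2, use_cache: bool = True) -> set[str]:
    """Breadth-first crawl from `seed`, bounded by `max_depth`."""
    # Crawl mode: reuse cached results instead of crawling again.
    if use_cache and CACHE_FILE.exists():
        return set(json.loads(CACHE_FILE.read_text()))

    seen: set[str] = {seed}
    frontier = [seed]
    async with aiohttp.ClientSession() as session:
        for _depth in range(max_depth):
            # Fetch the whole frontier concurrently (asynchronous processing).
            pages = await asyncio.gather(*(fetch(session, u) for u in frontier))
            next_frontier = []
            for base, html in zip(frontier, pages):
                for href in LINK_RE.findall(html):
                    url, _ = urldefrag(urljoin(base, href))
                    if url.startswith("http") and url not in seen:
                        seen.add(url)
                        next_frontier.append(url)
            frontier = next_frontier

    CACHE_FILE.write_text(json.dumps(sorted(seen)))
    return seen


if __name__ == "__main__":
    urls = asyncio.run(crawl("https://example.com", max_depth=2, use_cache=False))
    print(f"Discovered {len(urls)} URLs")
```

With `use_cache=True`, a previous run's cache file is loaded and the network is skipped entirely, which corresponds to the crawl-mode option described above.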

Installation

# Instructions for setting up the project environment and installing dependencies
