
ScrapeGraph MCP Server



A production-ready Model Context Protocol (MCP) server that provides seamless integration with the ScrapeGraph AI API. This server enables language models to leverage advanced AI-powered web scraping capabilities with enterprise-grade reliability.


Key Features

  • 8 Powerful Tools: From simple markdown conversion to complex multi-page crawling and agentic workflows
  • AI-Powered Extraction: Intelligently extract structured data using natural language prompts
  • Multi-Page Crawling: SmartCrawler supports asynchronous crawling with configurable depth and page limits
  • Infinite Scroll Support: Handle dynamic content loading with configurable scroll counts
  • JavaScript Rendering: Full support for JavaScript-heavy websites
  • Flexible Output Formats: Get results as markdown, structured JSON, or custom schemas
  • Easy Integration: Works seamlessly with Claude Desktop, Cursor, and any MCP-compatible client
  • Enterprise-Ready: Robust error handling, timeout management, and production-tested reliability
  • Simple Deployment: One-command installation via Smithery or manual setup
  • Comprehensive Documentation: Detailed developer docs in .agent/ folder

Quick Start

1. Get Your API Key

Sign up and get your API key from the ScrapeGraph Dashboard

2. Install with Smithery (Recommended)

npx -y @smithery/cli install @ScrapeGraphAI/scrapegraph-mcp --client claude

3. Start Using

Ask Claude or Cursor:

  • "Convert https://scrapegraphai.com to markdown"
  • "Extract all product prices from this e-commerce page"
  • "Research the latest AI developments and summarize findings"

That's it! The server is now available to your AI assistant.

Available Tools

The server provides 8 enterprise-ready tools for AI-powered web scraping:

Core Scraping Tools

1. markdownify

Transform any webpage into clean, structured markdown format.

markdownify(website_url: str)
  • Credits: 2 per request
  • Use case: Quick webpage content extraction in markdown

2. smartscraper

Leverage AI to extract structured data from any webpage with support for infinite scrolling.

smartscraper(
    user_prompt: str,
    website_url: str,
    number_of_scrolls: int | None = None,
    markdown_only: bool | None = None
)
  • Credits: 10 base, plus additional credits when scrolling is used
  • Use case: AI-powered data extraction with custom prompts
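To make the call shape concrete, here is a hypothetical invocation written as a plain Python call (in practice your MCP client issues this on the assistant's behalf; the URL and prompt are illustrative):

result = smartscraper(
    user_prompt="Extract all product names and prices",
    website_url="https://example.com/products",
    number_of_scrolls=3,    # scroll three times to load more items (extra credits)
    markdown_only=False,    # return structured data rather than raw markdown
)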

3. searchscraper

Execute AI-powered web searches with structured, actionable results.

searchscraper(
    user_prompt: str,
    num_results: int | None = None,
    number_of_scrolls: int | None = None
)
  • Credits: Variable (3-20 websites × 10 credits)
  • Use case: Multi-source research and data aggregation

Advanced Scraping Tools

4. scrape

Basic scraping endpoint to fetch page content with optional heavy JavaScript rendering.

scrape(website_url: str, render_heavy_js: bool | None = None)
  • Use case: Simple page content fetching with JS rendering support

5. sitemap

Extract sitemap URLs and structure for any website.

sitemap(website_url: str)
  • Use case: Website structure analysis and URL discovery

Multi-Page Crawling

6. smartcrawler_initiate

Initiate intelligent multi-page web crawling (asynchronous operation).

smartcrawler_initiate(
    url: str,
    prompt: str | None = None,
    extraction_mode: str = "ai",
    depth: int | None = None,
    max_pages: int | None = None,
    same_domain_only: bool | None = None
)
  • AI Extraction Mode: 10 credits per page - extracts structured data
  • Markdown Mode: 2 credits per page - converts to markdown
  • Returns: request_id for polling
  • Use case: Large-scale website crawling and data extraction

7. smartcrawler_fetch_results

Retrieve results from asynchronous crawling operations.

smartcrawler_fetch_results(request_id: str)
  • Returns: Status and results when crawling is complete
  • Use case: Poll for crawl completion and retrieve results
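The two crawler tools form an initiate-then-poll workflow. Below is a minimal sketch of that pattern, written as if the tools were ordinary Python callables (an illustration only; the assistant normally drives this loop over MCP, and the "request_id" key name is taken from the initiate tool's documented return value):

import time

# Kick off the crawl; only a request ID comes back immediately.
job = smartcrawler_initiate(
    url="https://example.com/docs",
    prompt="Extract every API endpoint with its description",
    extraction_mode="ai",    # 10 credits/page; use "markdown" for 2 credits/page
    depth=2,
    max_pages=50,
    same_domain_only=True,
)

# Poll until the asynchronous crawl reports completion.
result = smartcrawler_fetch_results(job["request_id"])
while result.get("status") != "completed":
    time.sleep(5)
    result = smartcrawler_fetch_results(job["request_id"])

print(result)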

Intelligent Agent-Based Scraping

8. agentic_scrapper

Run advanced agentic scraping workflows with customizable steps and structured output schemas.

agentic_scrapper(
    url: str,
    user_prompt: str | None = None,
    output_schema: dict | None = None,
    steps: list | None = None,
    ai_extraction: bool | None = None,
    persistent_session: bool | None = None,
    timeout_seconds: float | None = None
)
  • Use case: Complex multi-step workflows with custom schemas and persistent sessions
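Because this tool takes the richest argument set, a hypothetical call may help; the step wording and schema fields below are illustrative, not a fixed grammar:

result = agentic_scrapper(
    url="https://example.com/blog",
    user_prompt="Collect every article across all pages",
    steps=[    # natural-language steps the agent executes in order
        "Open the blog index",
        "Follow each 'Next page' link until the last page",
        "Extract every article found on each page",
    ],
    output_schema={    # desired shape of the structured output
        "articles": [{"title": "str", "author": "str", "date": "str"}]
    },
    ai_extraction=True,
    persistent_session=True,    # keep one browser session across steps
    timeout_seconds=300.0,
)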

Setup Instructions

To use this server, you'll need a ScrapeGraph API key. Follow these steps to obtain one:

  1. Navigate to the ScrapeGraph Dashboard
  2. Create an account and generate your API key

Automated Installation via Smithery

To install the ScrapeGraph MCP server automatically via Smithery:

npx -y @smithery/cli install @ScrapeGraphAI/scrapegraph-mcp --client claude

Claude Desktop Configuration

Update your Claude Desktop configuration file with the following settings (remember to add your API key inside the config):

{
    "mcpServers": {
        "@ScrapeGraphAI-scrapegraph-mcp": {
            "command": "npx",
            "args": [
                "-y",
                "@smithery/cli@latest",
                "run",
                "@ScrapeGraphAI/scrapegraph-mcp",
                "--config",
                "\"{\\\"scrapegraphApiKey\\\":\\\"YOUR-SGAI-API-KEY\\\"}\""
            ]
        }
    }
}

The configuration file is located at:

  • Windows: %APPDATA%/Claude/claude_desktop_config.json
  • macOS: ~/Library/Application\ Support/Claude/claude_desktop_config.json

Cursor Integration

Add the ScrapeGraphAI MCP server in Cursor's settings:

(screenshot: Cursor MCP Integration)

Example Use Cases

The server enables sophisticated queries across various scraping scenarios:

Single Page Scraping

  • Markdownify: "Convert the ScrapeGraph documentation page to markdown"
  • SmartScraper: "Extract all product names, prices, and ratings from this e-commerce page"
  • SmartScraper with scrolling: "Scrape this infinite scroll page with 5 scrolls and extract all items"
  • Basic Scrape: "Fetch the HTML content of this JavaScript-heavy page with full rendering"

Search and Research

  • SearchScraper: "Research and summarize recent developments in AI-powered web scraping"
  • SearchScraper: "Search for the top 5 articles about machine learning frameworks and extract key insights"
  • SearchScraper: "Find recent news about GPT-4 and provide a structured summary"

Website Analysis

  • Sitemap: "Extract the complete sitemap structure from the ScrapeGraph website"
  • Sitemap: "Discover all URLs on this blog site"

Multi-Page Crawling

  • SmartCrawler (AI mode): "Crawl the entire documentation site and extract all API endpoints with descriptions"
  • SmartCrawler (Markdown mode): "Convert all pages in the blog to markdown up to 2 levels deep"
  • SmartCrawler: "Extract all product information from an e-commerce site, maximum 100 pages, same domain only"

Advanced Agentic Scraping

  • Agentic Scraper: "Navigate through a multi-step authentication form and extract user dashboard data"
  • Agentic Scraper with schema: "Follow pagination links and compile a dataset with schema: {title, author, date, content}"
  • Agentic Scraper: "Execute a complex workflow: login, navigate to reports, download data, and extract summary statistics"

Error Handling

The server implements robust error handling with detailed, actionable error messages (see the example after this list) for:

  • API authentication issues
  • Malformed URL structures
  • Network connectivity failures
  • Rate limiting and quota management
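As the Code Style section notes, tools return error dictionaries instead of raising, so callers can branch on the payload. A minimal sketch of that convention (the tool call is shown as a plain Python call for illustration):

result = smartscraper(
    user_prompt="Extract the page title",
    website_url="https://example.com",
)
if "error" in result:
    # e.g. {"error": "Error 401: Unauthorized"} for a bad API key
    print(f"Scrape failed: {result['error']}")
else:
    print(result)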

Common Issues

Windows-Specific Connection

When running on Windows systems, you may need to use the following command to connect to the MCP server:

C:\Windows\System32\cmd.exe /c npx -y @smithery/cli@latest run @ScrapeGraphAI/scrapegraph-mcp --config "{\"scrapegraphApiKey\":\"YOUR-SGAI-API-KEY\"}"

This ensures proper execution in the Windows environment.

Other Common Issues

"ScrapeGraph client not initialized"

  • Cause: Missing API key
  • Solution: Set SGAI_API_KEY environment variable or provide via --config

"Error 401: Unauthorized"

"Error 402: Payment Required"

  • Cause: Insufficient credits
  • Solution: Add credits to your ScrapeGraph account

SmartCrawler not returning results

  • Cause: Still processing (asynchronous operation)
  • Solution: Keep polling smartcrawler_fetch_results() until status is "completed"

Tools not appearing in Claude Desktop

  • Cause: Server not starting or configuration error
  • Solution: Check Claude logs at ~/Library/Logs/Claude/ (macOS) or %APPDATA%\Claude\Logs\ (Windows)

For detailed troubleshooting, see the .agent documentation.

Development

Prerequisites

  • Python 3.10 or higher
  • pip or uv package manager
  • ScrapeGraph API key

Installation from Source

# Clone the repository
git clone https://github.com/ScrapeGraphAI/scrapegraph-mcp
cd scrapegraph-mcp

# Install dependencies
pip install -e ".[dev]"

# Set your API key
export SGAI_API_KEY=your-api-key

# Run the server
scrapegraph-mcp
# or
python -m scrapegraph_mcp.server

Testing with MCP Inspector

Test your server locally using the MCP Inspector tool:

npx @modelcontextprotocol/inspector scrapegraph-mcp

This provides a web interface to test all available tools.

Code Quality

Linting:

ruff check src/

Type Checking:

mypy src/

Format Checking:

ruff format --check src/

Project Structure

scrapegraph-mcp/
├── src/
│   └── scrapegraph_mcp/
│       ├── __init__.py      # Package initialization
│       └── server.py        # Main MCP server (all code in one file)
├── .agent/                  # Developer documentation
│   ├── README.md           # Documentation index
│   └── system/             # System architecture docs
├── assets/                  # Images and badges
├── pyproject.toml          # Project metadata & dependencies
├── smithery.yaml           # Smithery deployment config
└── README.md               # This file

Contributing

We welcome contributions! Here's how you can help:

Adding a New Tool

  1. Add a method to the ScapeGraphClient class in server.py:
def new_tool(self, param: str) -> Dict[str, Any]:
    """Tool description."""
    url = f"{self.BASE_URL}/new-endpoint"
    data = {"param": param}
    response = self.client.post(url, headers=self.headers, json=data)
    if response.status_code != 200:
        raise Exception(f"Error {response.status_code}: {response.text}")
    return response.json()
  2. Add the MCP tool decorator:
@mcp.tool()
def new_tool(param: str) -> Dict[str, Any]:
    """
    Tool description for AI assistants.

    Args:
        param: Parameter description

    Returns:
        Dictionary containing results
    """
    if scrapegraph_client is None:
        return {"error": "ScrapeGraph client not initialized. Please provide an API key."}

    try:
        return scrapegraph_client.new_tool(param)
    except Exception as e:
        return {"error": str(e)}
  3. Test with MCP Inspector:
npx @modelcontextprotocol/inspector scrapegraph-mcp
  4. Update documentation (README and the .agent/ docs)

  5. Submit a pull request

Development Workflow

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run linting and type checking
  5. Test with MCP Inspector and Claude Desktop
  6. Update documentation
  7. Commit your changes (git commit -m 'Add amazing feature')
  8. Push to the branch (git push origin feature/amazing-feature)
  9. Open a Pull Request

Code Style

  • Line length: 100 characters
  • Type hints: Required for all functions
  • Docstrings: Google-style docstrings
  • Error handling: Return error dicts, don't raise exceptions in tools
  • Python version: Target 3.10+

For detailed development guidelines, see the .agent documentation.

Documentation

For comprehensive developer documentation, see the .agent/ directory in this repository.

Technology Stack

Core Framework

  • Python 3.10+ - Modern Python with type hints
  • FastMCP - Lightweight MCP server framework
  • httpx 0.24.0+ - Modern async HTTP client

Development Tools

  • Ruff - Fast Python linter and formatter
  • mypy - Static type checker
  • Hatchling - Modern build backend

Deployment

  • Smithery - Automated MCP server deployment
  • Docker - Container support with Alpine Linux
  • stdio transport - Standard MCP communication

API Integration

  • ScrapeGraph AI API - Enterprise web scraping service
  • Base URL: https://api.scrapegraphai.com/v1
  • Authentication: API key-based
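Every tool ultimately reduces to an authenticated POST against this base URL, mirroring the client pattern shown in the Contributing section. Here is a rough sketch of that request shape; the endpoint path and the SGAI-APIKEY header name are assumptions for illustration, not confirmed by this README:

import os

import httpx

BASE_URL = "https://api.scrapegraphai.com/v1"
# Header name is an assumption; the key itself comes from the documented env var.
headers = {"SGAI-APIKEY": os.environ["SGAI_API_KEY"]}

response = httpx.post(
    f"{BASE_URL}/markdownify",    # hypothetical path mirroring the tool name
    headers=headers,
    json={"website_url": "https://scrapegraphai.com"},
    timeout=60.0,
)
response.raise_for_status()
print(response.json())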

License

This project is distributed under the MIT License. For detailed terms and conditions, please refer to the LICENSE file.

Acknowledgments

Special thanks to tomekkorbak for his implementation of oura-mcp-server, which served as a starting point for this repository.



Made with ❤️ by ScrapeGraphAI Team
