Skip to content

Latest commit

Β 

History

History
210 lines (147 loc) Β· 5.35 KB

README.md

File metadata and controls

210 lines (147 loc) Β· 5.35 KB

🌐 ScrapeGraph Python SDK

PyPI version Python Support License Code style: black Documentation Status

ScrapeGraph API Banner

Official Python SDK for the ScrapeGraph API - Smart web scraping powered by AI.

πŸ“¦ Installation

pip install scrapegraph-py

πŸš€ Features

  • πŸ€– AI-powered web scraping and search
  • πŸ”„ Both sync and async clients
  • πŸ“Š Structured output with Pydantic schemas
  • πŸ” Detailed logging
  • ⚑ Automatic retries
  • πŸ” Secure authentication

🎯 Quick Start

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

Note

You can set the SGAI_API_KEY environment variable and initialize the client without parameters: client = Client()

πŸ“š Available Endpoints

πŸ€– SmartScraper

Extract structured data from any webpage or HTML content using AI.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

# Using a URL
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)

# Or using HTML content
html_content = """
<html>
    <body>
        <h1>Company Name</h1>
        <p>We are a technology company focused on AI solutions.</p>
    </body>
</html>
"""

response = client.smartscraper(
    website_html=html_content,
    user_prompt="Extract the company description"
)

print(response)
Output Schema (Optional)
from pydantic import BaseModel, Field
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)

πŸ” SearchScraper

Perform AI-powered web searches with structured results and reference URLs.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

response = client.searchscraper(
    user_prompt="What is the latest version of Python and its main features?"
)

print(f"Answer: {response['result']}")
print(f"Sources: {response['reference_urls']}")
Output Schema (Optional)
from pydantic import BaseModel, Field
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

class PythonVersionInfo(BaseModel):
    version: str = Field(description="The latest Python version number")
    release_date: str = Field(description="When this version was released")
    major_features: list[str] = Field(description="List of main features")

response = client.searchscraper(
    user_prompt="What is the latest version of Python and its main features?",
    output_schema=PythonVersionInfo
)

πŸ“ Markdownify

Converts any webpage into clean, formatted markdown.

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

response = client.markdownify(
    website_url="https://example.com"
)

print(response)

⚑ Async Support

All endpoints support async operations:

import asyncio
from scrapegraph_py import AsyncClient

async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())

πŸ“– Documentation

For detailed documentation, visit docs.scrapegraphai.com

πŸ› οΈ Development

For information about setting up the development environment and contributing to the project, see our Contributing Guide.

πŸ’¬ Support & Feedback

  • πŸ“§ Email: [email protected]
  • πŸ’» GitHub Issues: Create an issue
  • 🌟 Feature Requests: Request a feature
  • ⭐ API Feedback: You can also submit feedback programmatically using the feedback endpoint:
    from scrapegraph_py import Client
    
    client = Client(api_key="your-api-key-here")
    
    client.submit_feedback(
        request_id="your-request-id",
        rating=5,
        feedback_text="Great results!"
    )

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ”— Links


Made with ❀️ by ScrapeGraph AI