Scrapy (Python)

Web scraping examples using Scrapy framework.

Run on Intuned

APIs

| API | Description |
| --- | --- |
| scrapy-crawler | Scrapes static websites using Scrapy's built-in HTTP request system and CSS/XPath selectors |
| scrapy-crawler-js | Renders JavaScript-heavy pages with Playwright, then parses the HTML output with Scrapy |

Getting started

Install dependencies

uv sync

If the intuned CLI is not installed, install it globally:

npm install -g @intuned/cli

After installation, the `intuned` command should be available in your environment.

Run an API

intuned dev run api scrapy-crawler .parameters/api/scrapy-crawler/default.json
intuned dev run api scrapy-crawler-js .parameters/api/scrapy-crawler-js/default.json

Save project

intuned dev provision

Deploy

intuned dev deploy

Project structure

/
├── api/
│   ├── scrapy-crawler.py     # Scrapy crawler using Scrapy's HTTP requests
│   └── scrapy-crawler-js.py  # Scrapy crawler using Playwright + Scrapy parsing
├── collector/
│   └── item_collector.py     # Collects scraped items via Scrapy signals
├── utils/
│   └── types_and_schemas.py  # Pydantic models for parameters and data
├── intuned-resources/
│   └── jobs/
│       ├── scrapy-crawler.job.jsonc    # Job for static site crawling
│       └── scrapy-crawler-js.job.jsonc # Job for JS-rendered crawling
├── .parameters/api/          # Test parameters
├── Intuned.jsonc             # Project config
├── pyproject.toml            # Python dependencies
└── README.md

Key features

  • scrapy-crawler: Best for static websites — uses Scrapy's CrawlerRunner for HTTP requests, CSS selectors, and pagination
  • scrapy-crawler-js: Best for JavaScript-heavy sites — uses Playwright to render pages before Scrapy parses the HTML

Related