Open
Labels
bug (Something isn't working)
Description
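I installed the standalone server via Docker and then queued https://jsonforms.io/docs/ for scraping.
It identified 24 pages, scraped 19 of them successfully, failed on the remaining 5 (each with an HTTP 403 after a single attempt), and then still claimed "success".
I feel there should be an option to actually complete the job, for example by retrying the pages that failed, or at least the job should not be reported as fully successful.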
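To make the request concrete, here is a minimal sketch of how a job result could distinguish "completed" from "completed with errors" and expose the failed pages for a retry. `ScrapeReport`, `JobStatus`, and `retry_queue` are hypothetical names used only for illustration; they are not taken from the server's code, and the real implementation may look quite different.

```python
from dataclasses import dataclass, field
from enum import Enum


class JobStatus(Enum):
    COMPLETED = "completed"
    PARTIAL = "completed_with_errors"
    FAILED = "failed"


@dataclass
class ScrapeReport:
    """Hypothetical per-job tally; not part of the server's actual API."""

    succeeded: list[str] = field(default_factory=list)
    failed: dict[str, str] = field(default_factory=dict)  # url -> error message

    def status(self) -> JobStatus:
        # No failures at all: the job really is complete.
        if not self.failed:
            return JobStatus.COMPLETED
        # Nothing was stored: the job failed outright.
        if not self.succeeded:
            return JobStatus.FAILED
        # Some pages stored, some not: report that honestly.
        return JobStatus.PARTIAL

    def retry_queue(self) -> list[str]:
        # URLs that a "complete the job" option could re-attempt.
        return sorted(self.failed)


# Example using a few URLs from the log below (not the full list of 24).
report = ScrapeReport(
    succeeded=[
        "https://jsonforms.io/docs/",
        "https://jsonforms.io/docs/uischema/",
    ],
    failed={
        "https://jsonforms.io/docs/api": "403",
        "https://jsonforms.io/docs/integrations/vue": "403",
    },
)
print(report.status().value)  # completed_with_errors
print(report.retry_queue())
```

With something like this, the final log line could say that the job finished with errors and list the pages that are still missing, instead of a plain "Job completed".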
Also, it would be nice to be able to specify a URL blacklist. For example, I am only interested in the "vue" documentation, but the website above also provides Angular and React documentation, and I don't want to confuse the AI with documentation that doesn't apply.
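As a rough illustration of the blacklist idea, the crawler could drop any discovered URL whose path matches a user-supplied glob pattern before queueing it. This is only a sketch under my own assumptions: `is_excluded`, `filter_urls`, and the pattern syntax are made up for the example and do not correspond to an existing option of the server.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_excluded(url: str, exclude_patterns: list[str]) -> bool:
    """Return True if the URL's path matches any glob pattern (hypothetical helper)."""
    path = urlsplit(url).path
    # fnmatchcase is case-sensitive on every platform; "*" also matches "/".
    return any(fnmatchcase(path, pattern) for pattern in exclude_patterns)


def filter_urls(urls: list[str], exclude_patterns: list[str]) -> list[str]:
    """Keep only the URLs that match none of the exclude patterns."""
    return [url for url in urls if not is_excluded(url, exclude_patterns)]


discovered = [
    "https://jsonforms.io/docs/uischema/",
    "https://jsonforms.io/docs/integrations/react",
    "https://jsonforms.io/docs/integrations/vue",
    "https://jsonforms.io/docs/integrations/angular",
]
exclude = ["*/integrations/react*", "*/integrations/angular*"]

for url in filter_urls(discovered, exclude):
    print(url)
# https://jsonforms.io/docs/uischema/
# https://jsonforms.io/docs/integrations/vue
```

Matching on the path rather than the full URL keeps the patterns short, and an exclude list composes naturally with whatever include rule (same host, same path prefix) the crawler already applies.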
This was the result:
📝 Job enqueued: dc350e83-79a8-4844-982d-3a9d7e02e5ab for [email protected]
🗑️ Removing all documents from [email protected] store
🗑️ Deleted 0 documents
💾 Cleared store for [email protected] before scraping.
🌐 Scraping page 1/1 (depth 0/3): https://jsonforms.io/docs/
📚 Adding document: What is JSON Forms? - JSON Forms
✂️ Split document into 4 chunks
🌐 Scraping page 2/24 (depth 1/3): https://jsonforms.io/docs/uischema/
📚 Adding document: UI Schema - JSON Forms
✂️ Split document into 1 chunks
🌐 Scraping page 3/24 (depth 1/3): https://jsonforms.io/docs/architecture
📚 Adding document: Architecture - JSON Forms
✂️ Split document into 2 chunks
🌐 Scraping page 4/24 (depth 1/3): https://jsonforms.io/docs/getting-started
📚 Adding document: Getting Started - JSON Forms
✂️ Split document into 1 chunks
🌐 Scraping page 5/24 (depth 1/3): https://jsonforms.io/docs/uischema/layouts
📚 Adding document: Layouts - JSON Forms
✂️ Split document into 4 chunks
🌐 Scraping page 6/24 (depth 1/3): https://jsonforms.io/docs/uischema/rules
📚 Adding document: Rules - JSON Forms
✂️ Split document into 3 chunks
🌐 Scraping page 7/24 (depth 1/3): https://jsonforms.io/docs/uischema/controls
📚 Adding document: Controls - JSON Forms
✂️ Split document into 9 chunks
🌐 Scraping page 8/24 (depth 1/3): https://jsonforms.io/docs/renderer-sets
📚 Adding document: Renderer sets - JSON Forms
✂️ Split document into 3 chunks
🌐 Scraping page 9/24 (depth 1/3): https://jsonforms.io/docs/labels
📚 Adding document: Labels - JSON Forms
✂️ Split document into 3 chunks
🌐 Scraping page 10/24 (depth 1/3): https://jsonforms.io/docs/i18n
📚 Adding document: i18n - JSON Forms
✂️ Split document into 8 chunks
🌐 Scraping page 11/24 (depth 1/3): https://jsonforms.io/docs/ref-resolving
📚 Adding document: Ref Resolving - JSON Forms
✂️ Split document into 2 chunks
🌐 Scraping page 12/24 (depth 1/3): https://jsonforms.io/docs/validation
📚 Adding document: Validation - JSON Forms
✂️ Split document into 3 chunks
🌐 Scraping page 13/24 (depth 1/3): https://jsonforms.io/docs/readonly
📚 Adding document: ReadOnly - JSON Forms
✂️ Split document into 4 chunks
🌐 Scraping page 14/24 (depth 1/3): https://jsonforms.io/docs/middleware
📚 Adding document: Middleware - JSON Forms
✂️ Split document into 5 chunks
🌐 Scraping page 15/24 (depth 1/3): https://jsonforms.io/docs/multiple-choice
📚 Adding document: Multiple Choice - JSON Forms
✂️ Split document into 5 chunks
🌐 Scraping page 16/24 (depth 1/3): https://jsonforms.io/docs/date-time-picker
📚 Adding document: Date and Time Picker - JSON Forms
✂️ Split document into 8 chunks
🌐 Scraping page 17/24 (depth 1/3): https://jsonforms.io/docs/tutorial
📚 Adding document: Create a JSON Forms App - JSON Forms
✂️ Split document into 5 chunks
🌐 Scraping page 18/24 (depth 1/3): https://jsonforms.io/docs/tutorial/custom-renderers
📚 Adding document: Custom Renderers - JSON Forms
✂️ Split document into 11 chunks
🌐 Scraping page 19/24 (depth 1/3): https://jsonforms.io/docs/tutorial/custom-layouts
📚 Adding document: Custom Layouts - JSON Forms
✂️ Split document into 7 chunks
❌ Failed processing page https://jsonforms.io/docs/tutorial/multiple-forms: ScraperError: Failed to fetch https://jsonforms.io/docs/tutorial/multiple-forms after 1 attempts: Request failed with status code 403
❌ Failed to process https://jsonforms.io/docs/tutorial/multiple-forms: ScraperError: Failed to fetch https://jsonforms.io/docs/tutorial/multiple-forms after 1 attempts: Request failed with status code 403
❌ Failed processing page https://jsonforms.io/docs/api: ScraperError: Failed to fetch https://jsonforms.io/docs/api after 1 attempts: Request failed with status code 403
❌ Failed to process https://jsonforms.io/docs/api: ScraperError: Failed to fetch https://jsonforms.io/docs/api after 1 attempts: Request failed with status code 403
❌ Failed processing page https://jsonforms.io/docs/integrations/react: ScraperError: Failed to fetch https://jsonforms.io/docs/integrations/react after 1 attempts: Request failed with status code 403
❌ Failed to process https://jsonforms.io/docs/integrations/react: ScraperError: Failed to fetch https://jsonforms.io/docs/integrations/react after 1 attempts: Request failed with status code 403
❌ Failed processing page https://jsonforms.io/docs/integrations/vue: ScraperError: Failed to fetch https://jsonforms.io/docs/integrations/vue after 1 attempts: Request failed with status code 403
❌ Failed to process https://jsonforms.io/docs/integrations/vue: ScraperError: Failed to fetch https://jsonforms.io/docs/integrations/vue after 1 attempts: Request failed with status code 403
❌ Failed processing page https://jsonforms.io/docs/integrations/angular: ScraperError: Failed to fetch https://jsonforms.io/docs/integrations/angular after 1 attempts: Request failed with status code 403
❌ Failed to process https://jsonforms.io/docs/integrations/angular: ScraperError: Failed to fetch https://jsonforms.io/docs/integrations/angular after 1 attempts: Request failed with status code 403
✅ Job completed: dc350e83-79a8-4844-982d-3a9d7e02e5ab
🔍 Searching jsonforms@latest for: vue
🔎 Validating existence of library: jsonforms
✅ Library 'jsonforms' confirmed to exist.
🔍 Finding best version for jsonforms
✅ Found best match version 3.6.0 for jsonforms
✅ Found 6 matching results
brolnickij