I was wondering whether it is possible to use this as a website-specific search, in place of the "powered by Google" search you often see. If so, what would the process of setting this up look like? I did try to look into it, but I'm not sure how to set up the crawler to crawl specific website(s).
Stract's crawler can't be limited to specific sites, but the index is built from plain .warc files, so other crawlers such as Nutch and Heritrix should also work. I don't have experience with them, so I don't know whether they can be limited to specific sites, but they might.
As far as I know, the "powered by Google" search actually just sends a `{query} site:{site}` search to Google, which would very much be possible to build on top of Stract's API as well.
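For illustration, here is a minimal sketch of that `{query} site:{site}` approach in Python. It only builds the query string and a URL; the `site_query` and `search_url` helpers, the base URL, and the `q` parameter name are all assumptions made for the example, not Stract's actual API, so check the API docs for the real endpoint and parameters.

```python
from urllib.parse import urlencode


def site_query(query: str, site: str) -> str:
    """Append a site: restriction to the user's query (hypothetical helper)."""
    # Accept "https://example.com/" as well as "example.com".
    site = site.strip().removeprefix("https://").removeprefix("http://").rstrip("/")
    if not site:
        raise ValueError("site must not be empty")
    return f"{query.strip()} site:{site}".strip()


def search_url(base_url: str, query: str, site: str) -> str:
    """Build a search URL; base_url and the 'q' parameter are assumptions."""
    return f"{base_url}?{urlencode({'q': site_query(query, site)})}"


if __name__ == "__main__":
    # search.example is a placeholder host, not a real endpoint.
    print(search_url("https://search.example/search", "crawler config", "example.com"))
```

The search box on your site would then pass whatever the user typed through `site_query` before sending it to the search backend.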
Is it possible to specify a website to start crawling from? I don't necessarily need to limit the index to just the site in question, but I would like to try to keep it relevant. I found the documentation a bit hard to understand. Also, is it possible to override the user agent the crawler uses?