Added Bing Search Engine, 10X Speedup, Cleaner HTML. Made architectural changes requested by JulesGM #8
@JulesGM I have implemented all the architectural and stylistic suggestions you requested. This new pull request adds Bing Search, since that is what was used in the ParlAI Blenderbot2 paper. It also lets you limit the text returned per URL, since Blenderbot currently uses only the first 512 characters, and it lets you strip out HTML menus. You can also return a clean summary of each web page 10X faster, since that mode does not need to fetch each URL. I have updated the README with examples so you can quickly test these options. Overall, these changes enable the search engine to return significantly higher quality text to Blenderbot2. I will send you a separate private email with the URLs of the test deployments, which run as Docker containers on Google Cloud, in case you do not have a Bing Search subscription key and want to test them. Thank you again for your time.
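For readers who want a concrete picture of the menu stripping and per-URL text limit described above, here is a minimal sketch using only the Python standard library. It is not the code in this pull request: the function name `clean_page_text`, the `max_chars` parameter, and the set of tags treated as menus are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical set of tags treated as menus or non-content; not taken from the PR.
SKIPPED_TAGS = {"nav", "header", "footer", "script", "style"}


class _TextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIPPED_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        stripped = data.strip()
        if self._skip_depth == 0 and stripped:
            self.chunks.append(stripped)


def clean_page_text(html, max_chars=512):
    """Return the page's visible text without menus, cut to max_chars characters.

    The default of 512 mirrors the limit mentioned in the description above.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return text[:max_chars]


if __name__ == "__main__":
    sample = (
        "<html><body><nav><a href='/'>Home</a></nav>"
        "<p>Hello world</p><footer>c</footer></body></html>"
    )
    print(clean_page_text(sample))
```

Truncating after the menus are removed means the character budget is spent on body text rather than on navigation links, which is the point of combining the two options.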