AutoExtractProvider now supports the new scrapy-poet cache interface #31
Additionally, the preferred pageType for HTML requests (``AutoExtractProductData``) is now always chosen if it is listed as a dependency, instead of simply using the first dependency's ``pageType`` to request the HTML.
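To illustrate the selection rule described above, here is a minimal sketch. It is not the code from this PR: the stand-in classes, the `page_type` attribute, and the `pick_html_page_type` helper are all hypothetical names chosen for the example.

```python
from typing import Sequence, Type


# Hypothetical stand-ins for the real data classes; only the shape matters here.
class AutoExtractData:
    page_type = ""


class AutoExtractArticleData(AutoExtractData):
    page_type = "article"


class AutoExtractProductData(AutoExtractData):
    page_type = "product"


# Assumption: the preferred class for HTML requests, as named in the description.
PREFERRED_FOR_HTML = AutoExtractProductData


def pick_html_page_type(dependencies: Sequence[Type[AutoExtractData]]) -> str:
    """Return the pageType to use when requesting the HTML.

    The preferred class wins whenever it is listed, regardless of position;
    otherwise the first dependency is used (the previous behaviour).
    """
    if not dependencies:
        raise ValueError("at least one dependency is required")
    if PREFERRED_FOR_HTML in dependencies:
        return PREFERRED_FOR_HTML.page_type
    return dependencies[0].page_type
```

With this sketch, `pick_html_page_type([AutoExtractArticleData, AutoExtractProductData])` picks the product pageType even though the article class comes first, while a list without the product class falls back to its first entry.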
Codecov Report
@@            Coverage Diff             @@
##           master      #31      +/-   ##
==========================================
+ Coverage   85.24%   85.82%   +0.57%
==========================================
  Files           9        9
  Lines         488      515      +27
==========================================
+ Hits          416      442      +26
- Misses         72       73       +1
Great job Ivan 👍
scrapy_autoextract/providers.py
Outdated
_TASK_MANAGER = "_autoextract_task_manager"

AEDataType = TypeVar('AEDataType', bound=AutoExtractData, covariant=True)
Could you elaborate a little bit, please, why you created a covariant TypeVar here instead of just using AutoExtractData in typing? Is it because it could be product data, article data, and so on, so it won't be an invariant?
Yep, that's precisely right. Using covariant=True here would cover any subtypes derived from AutoExtractData.
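For anyone following this thread, here is a small sketch of how the two `TypeVar` arguments behave. It assumes hypothetical stand-in classes and a hypothetical generic `Result` container rather than the provider code under review. As far as the `typing` module goes, `bound=` is what restricts the variable to `AutoExtractData` and its subclasses, while `covariant=True` lets a generic parameterised with a subtype be accepted where one parameterised with the base type is expected.

```python
from typing import Generic, TypeVar


# Hypothetical stand-ins for the real data classes.
class AutoExtractData:
    pass


class AutoExtractProductData(AutoExtractData):
    pass


# Same declaration as in the diff above.
AEDataType = TypeVar("AEDataType", bound=AutoExtractData, covariant=True)


class Result(Generic[AEDataType]):
    """Hypothetical read-only container, used only to show variance."""

    def __init__(self, data: AEDataType) -> None:
        self._data = data

    def get(self) -> AEDataType:
        # The covariant type variable appears only in return position.
        return self._data


def type_name(result: Result[AutoExtractData]) -> str:
    # Annotated with the base type on purpose.
    return type(result.get()).__name__


# A static checker accepts Result[AutoExtractProductData] here because
# AEDataType is covariant; with an invariant TypeVar it would be rejected.
product_result: Result[AutoExtractProductData] = Result(AutoExtractProductData())
name = type_name(product_result)
```

At runtime the annotations are not enforced, so the difference only shows up under a type checker such as mypy.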
setup.py
Outdated
'autoextract-poet>=0.3.0',
'zyte-autoextract>=0.7.0',
'scrapy-poet>=0.2.0',
'scrapy-poet @ git+https://[email protected]/scrapinghub/scrapy-poet@injector_record_replay_native#egg=scrapy-poet',
Reminder to remove this after a new release of scrapy-poet to PyPI is done.
This requires this change from scrapy-poet: scrapinghub/scrapy-poet#55

Additionally, the preferred pageType for HTML requests (``AutoExtractProductData``) is now always chosen if it is listed as a dependency, instead of simply using the first dependency's ``pageType`` to request the HTML.

Todo:
- ``scrapy-poet`` dependency to PyPI once released