Metadata crawler improvements #665

Open
wants to merge 7 commits into
base: main

Conversation

Andrei-Dolgolev
Contributor

Andrei-Dolgolev commented Sep 12, 2022

Changes

How to test these changes?

 metadata-crawler crawl -b polygon

Related issues

Andrei-Dolgolev changed the title from "State crawler improvements" to "Metadata crawler improvements" on Sep 14, 2022
Contributor

kompotkot left a comment

some comments

@@ -38,8 +44,14 @@ def crawl_uri(metadata_uri: str) -> Any:
result = None
while retry < 3:
I suggest rewriting this function so that the `retry` variable is not incremented only in the exception handlers. Instead, increment it by default at the end of each loop iteration, and if `response.status == 200`, store the result and break out of the while loop.
If the current version works well, though, changes are probably not needed.
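A minimal sketch of the loop shape suggested above, not the PR's actual code. The name `crawl_uri_sketch` and the injected `fetch` callable are hypothetical; `fetch` stands in for whatever HTTP call `crawl_uri` makes and is assumed to return an object with a `status` attribute and a `read()` method. If the counter were advanced only in an `except` branch, a non-200 response that raises nothing could keep the loop from ever ending.

```python
import json
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


def crawl_uri_sketch(
    metadata_uri: str,
    fetch: Callable[[str], Any],  # hypothetical: returns an object with .status and .read()
    max_retries: int = 3,
) -> Optional[Any]:
    """Try to fetch and parse JSON metadata, giving up after max_retries attempts."""
    retry = 0
    result = None
    while retry < max_retries:
        try:
            response = fetch(metadata_uri)
            if response.status == 200:
                # Success: keep the parsed result and leave the loop immediately.
                result = json.loads(response.read())
                break
        except Exception as err:
            logger.warning(
                "Attempt %d for %s failed: %s", retry + 1, metadata_uri, err
            )
        # Reached after every unsuccessful attempt, whether it raised or not.
        retry += 1
    return result
```

With this shape the number of attempts is bounded by `max_retries` regardless of how an attempt fails.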

already_parsed = get_current_metadata_for_address(
db_session=db_session, blockchain_type=blockchain_type, address=address
logger.info(
f"Start crawling {len(not_updated_tokens)} tokens of address {address}"
)

for requests_chunk in [
This is hard to read. It would probably be better to assign the list of chunks to a separate variable above this for loop.
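One way to do that, as a sketch only: the helper `split_into_chunks`, the `batch_size` value, and the stand-in `tokens` list are all hypothetical and are not taken from the PR.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_chunks(items: Sequence[T], chunk_size: int) -> List[Sequence[T]]:
    """Split items into consecutive slices of at most chunk_size elements."""
    return [items[i : i + chunk_size] for i in range(0, len(items), chunk_size)]


tokens = list(range(7))  # stand-in for the tokens that still need crawling
batch_size = 3  # hypothetical chunk size

# The chunk list gets its own name, so the loop header stays short.
requests_chunks = split_into_chunks(tokens, batch_size)

chunk_sizes = []
for requests_chunk in requests_chunks:
    chunk_sizes.append(len(requests_chunk))
```

Naming the intermediate list also makes it easy to log or inspect the number of chunks before the loop starts.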


if token_uri_data.token_id not in already_parsed:
metadata = crawl_uri(token_uri_data.token_uri)
with ThreadPoolExecutor(
That way, these threads, which run inside another loop over the range from the list comprehension, will be under better control))
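One possible reading of this comment, sketched under assumptions: each chunk gets its own `ThreadPoolExecutor` context, so all threads for a chunk finish before the next chunk starts. The function `crawl_in_chunks` and its parameters `crawl`, `chunk_size`, and `max_workers` are hypothetical names, not the PR's code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence


def crawl_in_chunks(
    uris: Sequence[str],
    crawl: Callable[[str], Any],  # hypothetical per-URI worker
    chunk_size: int = 2,
    max_workers: int = 2,
) -> List[Any]:
    """Process uris chunk by chunk, using a bounded thread pool per chunk."""
    chunks = [uris[i : i + chunk_size] for i in range(0, len(uris), chunk_size)]
    results: List[Any] = []
    for chunk in chunks:
        # Leaving the with-block waits for every worker of this chunk to finish.
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            # executor.map yields results in the same order as the inputs.
            results.extend(executor.map(crawl, chunk))
    return results
```

Because the pool is closed at the end of each chunk, at most `max_workers` requests are in flight at any moment and a failure is confined to the chunk that raised it.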

2 participants