Refresh the database #12

Open
jaraco opened this issue Jan 12, 2025 · 1 comment
Comments

jaraco commented Jan 12, 2025

When I created the original proof of concept (coherent-oss/coherent.build#3), I learned that the top 8k projects were inadequate (in particular because newly uploaded projects were missing), and I worked out how to download all 800k projects. I should run that again to make sure the database is up to date.

jaraco commented Jan 12, 2025

I once again downloaded the dataset using:

 🐚 pipx run --python 3.13 pypinfo --json --indent 0 --limit 800000 --days 30 "" project > ~/Downloads/top-pypi-packages-30-days.json

Then I processed it using the latest coherent.deps (63954d8):

 🐚 py -3.13 -m pip-run coherent.deps -- -m coherent.deps.distributions.load ~/Downloads/top-pypi-packages-30-days.json
100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 504039/504039 [2:53:00<00:00, 48.56it/s]
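
Since the load takes close to three hours, it may be worth sanity-checking the downloaded file before starting. The sketch below is not part of coherent.deps. It assumes the pypinfo JSON output is an object with a `rows` list whose entries carry `project` and `download_count` keys; the helper names `summarize` and `summarize_file` are made up for illustration.

```python
import json
from pathlib import Path


def summarize(data):
    """Return (project count, total downloads) for a pypinfo-style payload.

    Assumes a top-level "rows" list of {"project", "download_count"} entries.
    """
    rows = data["rows"]
    return len(rows), sum(row["download_count"] for row in rows)


def summarize_file(path):
    """Load a JSON file from disk (expanding ~) and summarize it."""
    with Path(path).expanduser().open(encoding="utf-8") as stream:
        return summarize(json.load(stream))


# Tiny inline sample in the assumed shape, so the sketch runs without the real file.
sample = {
    "rows": [
        {"project": "example-a", "download_count": 3},
        {"project": "example-b", "download_count": 2},
    ]
}
print(summarize(sample))
```

Calling `summarize_file("~/Downloads/top-pypi-packages-30-days.json")` on the real download would give a row count to compare against the total shown in the progress bar.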
