Users hoarding package names #724

Closed
nikochiko opened this issue Nov 7, 2020 · 6 comments
Labels
mass name squat Report a mass name squatting by a user of PyPI

Comments

@nikochiko

I came across this user (https://pypi.org/user/Collie/) who is hoarding package names without publishing any content under those packages. These include common words that are very likely candidates for new, real projects ("author", "maintainer", "apps", "filter", etc.). It's pretty clear they do not mean to publish any real code (there are 59 empty projects).
This should probably be against the code of conduct. Perhaps a "Report User" or "Report Package" feature (I see pypi/warehouse#3896), or some other way to discourage this, could be added.

@di
Member

di commented Nov 9, 2020

These would be considered "Invalid Projects" per PEP 541.

Transferring to pypa/pypi-support to be addressed.

@di di transferred this issue from pypi/warehouse Nov 9, 2020
@pradyunsg pradyunsg added the mass name squat Report a mass name squatting by a user of PyPI label Nov 15, 2021
@MartinThoma

For that user, see #791. It's still an issue, over 1 year after the user was reported.

@MartinThoma

Would it be possible / desirable to limit the number of packages a user can create on PyPI?

Current State on PyPI

There are about 125,000 authors of Python packages. Please note that 15,682 packages left the "author" field empty, meaning there could be a lot more.

  • 75% of authors have fewer than 2 packages.
  • 95% of authors have fewer than 5 packages.
  • 99% of authors have fewer than 14 packages.
  • There are 129 authors with 50 or more packages. This includes corporations like Microsoft / Google (1, 2, 3) / AWS, organizations like micropython, and single developers like Paul Sokolovsky (1 - many more by other users)

For maintainers it looks similar:

There are 6,072 maintainers of Python packages.

  • 75% of maintainers have fewer than 2 packages.
  • 95% of maintainers have fewer than 4 packages.
  • 99% of maintainers have fewer than 10 packages.
  • Many packages are maintained by nobody (224,381 blank and 46,540 "none"), Mohsen Banan, Mark Veltzer, and Vanessa Sochat
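
Percentile figures like the ones above can be derived from a mapping of package name to author. A minimal sketch, assuming a hypothetical `package_authors` dict and a nearest-rank percentile; the sample data is made up for illustration and is not PyPI data:

```python
from collections import Counter
import math


def packages_per_author(package_authors: dict[str, str]) -> Counter:
    """Count packages per author, skipping blank author fields."""
    return Counter(a.strip() for a in package_authors.values() if a and a.strip())


def nearest_rank_percentile(values: list[int], pct: float) -> int:
    """Smallest value v such that at least pct% of the values are <= v."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up sample data (assumption, for illustration only).
sample = {
    "pkg-a": "alice",
    "pkg-b": "alice",
    "pkg-c": "bob",
    "pkg-d": "",  # blank author field, ignored
    "pkg-e": "carol",
    "pkg-f": "alice",
}
counts = packages_per_author(sample)
per_author = list(counts.values())
print(nearest_rank_percentile(per_author, 75))
```

Blank author fields are dropped here, which is why the real number of authors could be higher than the count suggests.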

Suggestion

To prevent people from simply creating many user accounts, a proof-of-work mechanism could make account creation "expensive". For example, the pypi.org server could ask the client to solve a challenge before the user is created. Something like this:

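from hashlib import sha256


def int_to_bytes(x: int) -> bytes:
    # Minimal big-endian encoding; note that 0 encodes to b"".
    return x.to_bytes((x.bit_length() + 7) // 8, 'big')


def solve_challenge(prefix_challenge: bytes) -> bytes:
    """Find an input starting with the prefix whose SHA-256 digest also starts with the prefix."""
    # Each attempt matches an n-byte prefix with probability 2**(-8 * n),
    # so a 4-byte prefix needs about 2**32 attempts on average.
    for i in range(2**32):
        unhashed = prefix_challenge + int_to_bytes(i)
        h = sha256(unhashed).digest()
        if h.startswith(prefix_challenge):
            return unhashed
    raise ValueError("Could not find hash starting with prefix")


# Slow as written: a 4-byte prefix may not be found within 2**32 attempts.
print(solve_challenge(b"moos"))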
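
A 4-byte prefix such as b"moos" needs on the order of 2**32 hash attempts, so a practical variant would probably tune the cost in bits and let the server check a solution with a single hash. A minimal hashcash-style sketch of that idea, not the snippet above; `issue_challenge`, `solve`, `verify` and `difficulty_bits` are hypothetical names, and nothing here reflects how pypi.org actually works:

```python
import secrets
from hashlib import sha256


def issue_challenge() -> bytes:
    """Server side: a random challenge, meant to be used once."""
    return secrets.token_bytes(16)


def leading_zero_bits(digest: bytes) -> int:
    """Number of leading zero bits in the digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash is enough to check a solution."""
    digest = sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce (about 2**difficulty_bits hashes on average)."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


challenge = issue_challenge()
nonce = solve(challenge, difficulty_bits=12)  # low difficulty so the demo runs quickly
print(verify(challenge, nonce, difficulty_bits=12))
```

Raising `difficulty_bits` by one doubles the expected client work while the server's verification cost stays at one hash.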

Once that works, every user could initially be limited to 5 packages.

Users can reach a "power user" status with which they can upload more packages (e.g. up to 50). This power user status could be reached once they have created 5 packages, uploaded content to those, and been "endorsed" by two power users. The initial set of power users could be the authors of the top 50 packages.

To be able to upload more than 50 packages, they would need to be declared an "organization" account.
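
The tiers described above could be expressed as a small policy function. This is only a sketch of the proposal: the `Account` record and all function names are hypothetical (not Warehouse models), and treating organization accounts as having no fixed cap is an assumption, since the proposal only says they may exceed 50.

```python
from dataclasses import dataclass
from typing import Optional

NEW_USER_LIMIT = 5          # initial limit from the proposal
POWER_USER_LIMIT = 50       # "power user" limit from the proposal
REQUIRED_ENDORSEMENTS = 2   # endorsements needed from existing power users


@dataclass
class Account:
    """Hypothetical account record used only for this sketch."""
    packages_with_content: int = 0
    endorsements_from_power_users: int = 0
    is_organization: bool = False


def is_power_user(acct: Account) -> bool:
    """Power user: 5 packages with uploaded content plus 2 endorsements."""
    return (acct.packages_with_content >= NEW_USER_LIMIT
            and acct.endorsements_from_power_users >= REQUIRED_ENDORSEMENTS)


def package_limit(acct: Account) -> Optional[int]:
    """Maximum number of packages; None means no fixed limit (assumption)."""
    if acct.is_organization:
        return None
    if is_power_user(acct):
        return POWER_USER_LIMIT
    return NEW_USER_LIMIT


def may_create_package(acct: Account, current_total: int) -> bool:
    """True if the account may register one more package."""
    limit = package_limit(acct)
    return limit is None or current_total < limit
```

For example, `may_create_package(Account(), 5)` is false for a new user who already owns 5 packages, while the same call for an account with 5 published packages and 2 endorsements is true.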

@MartinThoma

MartinThoma commented Jan 27, 2022

I found a couple of other packages which look like name hoarding / name blocking:

deleted-user: 108 packages

  • 11 packages with the description "example package", including rapidjson, which could easily be confused with python-rapidjson
  • https://pypi.org/project/Dogecoin : Registered in 2018, one empty upload (563 bytes)
  • 12 packages with the description "snellejelle & preben", including pyethereum
  • ChainLink, ZCoin, MaidSafeCoin, MonaCoin, SysCoin, DentaCoin, KuCoin, Siacoin, Bytecoin, Litecoin
  • h3py
  • ios

@di
Member

di commented Jan 27, 2022

@MartinThoma FYI, deleted-user is a PyPI-owned account. We used to transfer projects to that user when their original owner deleted their account.

@di
Member

di commented Oct 12, 2022

Invalid namesquatting packages have been removed.

@di di closed this as completed Oct 12, 2022