Expansion Plan #4

mdhaber opened this issue Dec 18, 2024 · 5 comments

@mdhaber
Collaborator

mdhaber commented Dec 18, 2024

This issue is about how new distributions are to be added to the package.

Will contributors create a new file (e.g. src/skstats/normal.py), add their distribution to a standard test suite, then wait for review by a maintainer?

This would work, but I was hoping for there to be a way for contributors to act as maintainers for specific distributions and make improvements without our review as long as certain standards are met (e.g. generic tests pass).

For instance, an alternative model would be for each distribution to be its own little repo under the scikit-stats org with its own issue tracker, set of maintainers, etc. Perhaps the top level scikit-stats repo would include all the distributions that meet a certain set of standards as submodules or subprojects and be the package that gets released periodically.

What did you have in mind @tupui?

@tupui
Collaborator

tupui commented Dec 19, 2024

I think we should come up with a set of automated tests that simply go over all the single-distribution files and run the same battery. The CI can check that a PR adds just a single new distribution file, and everything can be automatic: if CI is green, the PR could even be merged without our intervention. That's the whole point of it right? Being able to propose a space outside of SciPy for any fantasy distribution people want without bearing the cost of reviewing, etc.
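A minimal sketch of what such an automated gate could look like, under stated assumptions: one module per distribution directly under `src/skstats/` (the path from the first comment), a changed-file list in the style of `git diff --name-status`, and a hypothetical interface where each distribution exposes `pdf`, `cdf`, and a finite `support`. None of these names are an agreed skstats API.

```python
import math
from pathlib import PurePosixPath

# Assumption: one module per distribution, directly under src/skstats/.
DIST_DIR = PurePosixPath("src/skstats")


def adds_single_distribution(changes):
    """Return True if the PR only adds exactly one new .py file in DIST_DIR.

    `changes` is a list of (status, path) pairs in the style of
    `git diff --name-status`, e.g. ("A", "src/skstats/normal.py").
    """
    if len(changes) != 1:
        return False
    status, path = changes[0]
    p = PurePosixPath(path)
    return status == "A" and p.parent == DIST_DIR and p.suffix == ".py"


class Uniform01:
    """Toy stand-in for a contributed distribution (hypothetical interface)."""

    support = (0.0, 1.0)

    def pdf(self, x):
        return 1.0 if 0.0 <= x <= 1.0 else 0.0

    def cdf(self, x):
        return min(max(x, 0.0), 1.0)


def run_battery(dist, n=200):
    """Run the same generic checks on any object with pdf, cdf, and support.

    Assumes a finite support; returns a list of failure messages.
    """
    a, b = dist.support
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    failures = []
    if any(dist.pdf(x) < 0 for x in xs):
        failures.append("pdf is negative somewhere on the support")
    cdfs = [dist.cdf(x) for x in xs]
    if any(c2 < c1 for c1, c2 in zip(cdfs, cdfs[1:])):
        failures.append("cdf is not monotone")
    starts_at_zero = math.isclose(cdfs[0], 0.0, abs_tol=1e-12)
    ends_at_one = math.isclose(cdfs[-1], 1.0, abs_tol=1e-12)
    if not (starts_at_zero and ends_at_one):
        failures.append("cdf does not go from 0 to 1 over the support")
    return failures
```

A CI job could call `adds_single_distribution` on the PR's changed files and `run_battery` on the new distribution, and treat an empty failure list as "green". Checks like these only cover internal consistency; they cannot tell whether the distribution is the one it claims to be.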

That could still work with people expanding the namespace with an external package if we provide the correct guidelines on how to structure things, e.g. a template built with something like Jinja (as many tools do; I'm thinking of Alembic and its migrations). But I would not go into the business of adding repos to the org: that would grow into a long list and need some maintenance and "supervision". We don't want to have to track what people are putting in there, or to be associated with what could happen in these repos if things go bad (and it's a certainty that they will).
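To make the template idea concrete, here is a rough sketch of rendering a distribution module skeleton. It uses the standard library's `string.Template` as a stand-in for Jinja so that it runs without extra dependencies; the class layout and the `render_distribution_module` helper are hypothetical, not an agreed structure.

```python
from string import Template

# Stand-in for a Jinja template: string.Template keeps this sketch stdlib-only.
# The generated class layout is hypothetical, not an agreed skstats interface.
DISTRIBUTION_TEMPLATE = Template('''\
"""$title distribution (maintained by @$maintainer)."""


class $class_name:
    """TODO: describe the parameterization of the $title distribution."""

    def pdf(self, x):
        raise NotImplementedError

    def cdf(self, x):
        raise NotImplementedError
''')


def render_distribution_module(class_name, title, maintainer):
    """Return the source code of a new distribution module skeleton."""
    return DISTRIBUTION_TEMPLATE.substitute(
        class_name=class_name, title=title, maintainer=maintainer
    )
```

For example, `render_distribution_module("Normal", "Normal", "some-user")` returns source text that a contributor would save as their single distribution file and then fill in.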

@mdhaber
Collaborator Author

mdhaber commented Feb 13, 2025

> That's the whole point of it right? Being able to propose a space outside of SciPy for any fantasy distribution people want without bearing the cost of reviewing, etc.

Mostly, but I did not envision a total free for all. The main amendments I'd make to that:

  • An existing maintainer needs to look into the initial PR of a distribution to make sure that the name/parameterization/documentation of a distribution is appropriate and that it's the distribution it's purported to be (e.g. tests of type 1 in Unit Tests #5, which can't be automated).
  • I'd also like to get confirmation from contributors that they agree to maintain their contribution. We can't enforce that, obviously, but there should be some sort of plan. I don't think we should accept distributions that people plan to abandon.
  • Once a distribution is accepted, the contributor can get commit rights and can adjust their distribution as they see fit. They can merge contributions from others, and as requested, we can add the others as maintainers. Of course, if others notice something strange, we can check in with the authors (and commit rights can be revoked if necessary).
  • Once someone has commit rights, we'd expect them to (initially) exercise those rights only on the distribution(s) for which they earned commit rights. Those who have had their contributions to many distributions reviewed and accepted can start to branch out more, though.

Also, sometimes automated tests won't pass and it's due to limitations of numerical integration, etc., rather than the distribution. Someone needs to decide what to do in such cases.
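As an illustration of that kind of false alarm, here is a small sketch in which a generic "the pdf integrates to 1" check passes for a smooth density but fails for a perfectly valid one with an integrable singularity. The midpoint rule, the panel count, and the tolerance are all assumptions made for this example; the actual test suite may integrate quite differently.

```python
import math


def midpoint_integral(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def smooth_pdf(x):
    """Density 2x on (0, 1); the midpoint rule is exact for linear functions."""
    return 2.0 * x


def singular_pdf(x):
    """Beta(1/2, 1) density 1 / (2 sqrt(x)) on (0, 1); unbounded near 0."""
    return 0.5 / math.sqrt(x)


def integrates_to_one(pdf, tol=1e-6):
    """Generic normalization check with a hypothetical tolerance."""
    return abs(midpoint_integral(pdf, 0.0, 1.0) - 1.0) < tol
```

Here `integrates_to_one(smooth_pdf)` is true while `integrates_to_one(singular_pdf)` is false, even though both densities integrate to exactly 1. The failure comes from the quadrature rule near the singularity at 0, not from the distribution, which is the situation where a person has to decide whether to relax the test, change the integrator, or mark an expected failure.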

Does that make sense? In that case, how should we structure this on GitHub?

@mdhaber
Collaborator Author

mdhaber commented May 17, 2025

@tupui We discussed a meta-repo design at the summit: scientific-python/summit-2025#40 (comment). @sanketverma1704 offered to help prototype.

@sanketverma1704
Member

Hi @mdhaber and @tupui.

We discussed the repository design for scikit-stats and various distributions. My idea was inspired by STAC Extensions, which has worked great for their community so far!

STAC Extensions offers a template repository that contains a set of minimal files for creating a new extension. A person would click on the 'Use this template' button, and a new repository would spawn, allowing them to create and share their desired distribution.

The extensions listed on the website contain necessary information. I suppose we could have something similar for the distributions.

Some extensions are hosted under the user's personal GitHub instead of the STAC Extensions GitHub org, but I think we do not want that currently.

From scientific-python/summit-2025#40 (comment):

> The biggest drawback of a meta-repo, though, is the potentially hundreds of separate packages on PyPI and conda-forge, distributed with prefixes like skstats.distributions.. We know conda-forge is OK with this, but not sure about PyPI.

This is a valid concern. We discussed different approaches, but I'd be interested in hearing what @jakirkham thinks.
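One detail that matters for a prefix scheme: PyPI normalizes project names (PEP 503), treating runs of `.`, `-`, and `_` as equivalent and ignoring case, so a dotted prefix and a hyphenated one refer to the same project. A small sketch of how a top-level package might list the per-distribution packages by prefix, assuming the prefix quoted above; the helper names are hypothetical.

```python
import re
from importlib import metadata

# Prefix quoted in the thread; any package name after it is hypothetical.
PREFIX = "skstats.distributions."


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def distribution_packages(names, prefix=PREFIX):
    """Return the names that fall under the prefix after normalization."""
    wanted = normalize(prefix)
    return sorted(n for n in names if normalize(n).startswith(wanted))


def installed_distribution_packages():
    """Apply the same filter to whatever is installed in this environment."""
    names = [n for d in metadata.distributions() if (n := d.metadata["Name"])]
    return distribution_packages(names)
```

With this, `distribution_packages(["numpy", "skstats.distributions.foo", "skstats-distributions-bar"])` keeps the last two names and drops `numpy`.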

@mdhaber
Copy link
Collaborator Author

mdhaber commented May 22, 2025

> This is a valid concern. We discussed different approaches, but I'd be interested in hearing...

@jakirkham mentioned in person that this was fine for conda-forge and, I think, gave examples where it is currently done on PyPI. @jarrodmillman offered to ask the PyPI folks in person.
