[BUG] Please make trove classifiers validation optional #4459
Comments
See also: urschrei/pyzotero#178 |
Making it into a warning rather than error would also work for us. |
After reviewing the code, I can add some detail. The escape hatch was created to avoid network access for the fallback behavior when the I'm a little reluctant to add yet another escape hatch for this one configuration item especially for this rare case that's likely to resolve itself over time. I'd be more inclined to just remove support for the The suggestion to make it a warning instead of an error seems reasonable, but my guess is that most users will miss the warning and only encounter it when attempting to upload to an index anyway, suggesting maybe the validation should be removed altogether. But, the validation logic isn't maintained here, it's maintained instead in abravalheri/validate-pyproject. @abravalheri What do you think of the proposal and the alternatives? |
The fact that it's stale in this case indicates we never intended trove_classifiers to be valid at all, but we build without virtualenv isolation.
It's not going to resolve itself over time, because due to dependency graph ordering there's no real reason:
for trove_classifiers to end up queued for installation first. Instead, updating fails, and continues to fail because every time a system update is attempted, it fails again. If setuptools removed its validation entirely, or made it only use the network, that would equally solve the gentoo use case though. :) We already disable networking anyway. Of course, one might ask, why not just build in an isolated environment without trove_classifiers? That's recommended by PyPA via virtualenvs. But distros generally don't do this, because they already have both network isolation (cannot install build-requires with pip) and clean build isolation via spinning up temporary containers with just distro-packaged dependencies installed. And distros want to use their own packaged dependencies for various reasons. ... but Gentoo (can, and usually does) build from source on every user's machine, and spinning up containers for every package is a hassle, slow, implies nontrivial setup, and generally is impractical. Distros like Debian and Fedora have complex distributed systems to do this and reset to a clean slate in between each build, so it doesn't matter if it takes 15 minutes to build each pure-python wheel because they have tens of buildbots building thousands of packages, and each one only has to get built once. And most other cases of additional packages installed in a build environment aren't a problem. ;) Build isolation primarily solves the problem of people forgetting to specify everything the build depends on because they already have it installed. So gentoo usually does what those other distros also do when individual users want to build a .deb or .rpm, and builds on the live system, and relies on And so we end up here, asking for, well, a |
Yeah, I've noticed that earlier and I was just about to open a pull request there. Good that I've checked mail first, let's see what resolution is preferred.
Do you mean one that fetches data straight from PyPI? As long as we can force-disable Internet access, that would work for us. |
I would love to take this approach, but the PyPI Web page for listing classifiers is not completely official. I know that My view is that this is a tricky problem, but also a minor one... Once the latest trove-classifiers package is adopted downstream, the problem would naturally go away, right? Also, the trove-classifiers package should be mostly backwards compatible, so the risk of adopting a new version is very low... A question: if the problem is the installation order is it possible to make trove-classifiers a build-only dependency of pyzotero? That would ensure that the latest classifiers are installed right? |
I'm pretty sure I was quite clear that the problem does not naturally go away ever, because the solver gets into a stuck state and doesn't get unstuck until someone manually kicks it -- because it will keep trying to install the failing package before trove_classifiers. That usually means patching the setuptools distro package to have a hard dependency on It would be much better if we could unstick things automatically by telling setuptools to stop performing validations we don't want. We aren't uploading to PyPI. It's confusing anyway because there's absolutely no reason people shouldn't be allowed to use whatever trove classifiers they like, even unofficial ones, on private self-run indexes. |
Just want to reiterate here that adding trove-classifiers as build-requires for arbitrary packages throughout the python ecosystem is not a very comfortable solution in the slightest. Especially because trove-classifiers may not need to be installed at all -- and in fact the only reason I have it on my system is because hatchling required it. It is terrible UX to have pyzotero require an additional build-requires, solely because if it doesn't have that build-requires then installing hatchling will result in setuptools erroring out. It's not even the kind of bug that anyone will notice for a while. |
Hi @eli-schwartz, thank you very much for the reply. I understand your concerns and appreciate your feedback. Please note that I am not suggesting to add Also note that when I suggested that the "problem eventually goes away" I was referring to "greenfield" deployments in the future when the I understand that you guys have your reasons to disable build isolation, but I would like to point out that build isolation was introduced to fix this kind of problem1... It is perfectly fine for OS-maintainers to attempt to optimise builds, but I think that in this case when the lack of build-isolation is directly involved in a bug, it would be important to explore other workarounds (like the inclusion in this particular package of I aim to keep the codebase as simple as possible, which is why I’m hesitant to add yet more knobs and levers. However, one potential solution could be introducing an environment variable like Regarding the idea of adding Footnotes
|
This is indeed a long-standing flaw in the python ecosystem, yes. :) Automagically doing the wrong thing based on unpredictable environment inputs is bad, and everyone agrees on that. Due to several decades of legacy baggage, it is difficult to solve this for setuptools because the automagic environment inputs serve multiple roles including extensibility by way of entrypoint-based plugins. Build isolation is not, and never has been, a solution to this engineering problem. It's a way to avoid the side effects in a scenario where the design constraints prevent a solution from being achieved. It is also, largely, a problem that only affects setuptools and doesn't affect other build backends which take a more structural approach to defining builds. You may wish to take note here of the wording for PEP 517. It describes build isolation and marks it as an RFC 2119 compliant "SHOULD" feature, with two PEP rationales:
It doesn't mention the possibility of additional environment packages modifying your own build requirements, indicating that this isn't a motivational concern behind the recommendation to use build isolation. |
Just for clarity, my mention of I suspect environment variables is ultimately easier for setuptools, if only because it doesn't require passing state around through multiple layers of "execute the setup.py script together with a phase to run". The thing that I care about is the Autoconf model where functionality is expected to have some form of on/off switch rather than unconditionally probe whether an import is available. |
I was not notified that you opened a PR eventually, and it appears that despite my attempting to explain why the entire thought process behind "skip the validations altogether" was based on false promises, that's the approach that ended up used anyway. This is unfortunate and in my opinion you haven't actually solved my bug report. I asked to make validating trove classifiers optional, I didn't ask to make validating the entire file optional. In particular, the fact that the fix you merged emits a SetuptoolsWarning feels mean-spirited. I understand why the warning exists, of course. Since it involves skipping ALL validation including basic consistency checks of load-bearing information that setuptools relies upon to successfully build a wheel (as opposed to trove classifiers which are anonymous display-only data strings where validation is only required when uploading to PyPI, and which PyPI already performs on upload), using that option is in fact dangerous in ways that far exceed the perfectly safe request which I had originally made. But that's really the point, isn't it. The reason the SetuptoolsWarning feels mean-spirited is because the whole approach to closing this ticket feels mean-spirited. I asked for a specific change, and provided specific reasons for that change. You disagreed with my reasons (which is your right) and engaged in dialogue about why you disagree (I'm happy to have a discussion about the topic -- maybe one of us will convince the other) and then a month later without notifying me (?) or asking me for feedback (???) created a PR implementing something I didn't ask for that is designed in a manner optimized to make sure that I wouldn't be willing to use it (??????????). Not only that, but you raised a question for feedback on that PR, and waited several months for practically nobody to submit feedback since nobody knew the PR existed. You plainly wanted the feedback, so why... hide its existence from the person who opened the ticket. 
If you dislike my request that much then please be straightforward about it and close it as "WONTFIX". |
@eli-schwartz I am sorry this did not fix your problem. Indeed, I was not keen to take the approach you suggested for the series of reasons I have already explained previously. But I was genuinely concerned that there was no escape hatch for scenarios involving problems with "accidental extra dependencies" for Please note that in #4459 (comment), I laid out the one potential solution I was willing to implement and the PR pretty much followed it. So, the choice of implementation should not come as a surprise. Personally, I did not see any other viable alternative, but in the same comment I welcomed PRs implementing alternative approaches, because I believe that different people will have different insights and different solutions that can be very valuable to solve the problem. But I needed to see a concrete implementation, not only debate general ideas on how things could ideally be, since feasibility was one of the concerns. Regarding the criticism about not pinging you: by no means was I trying to hide anything from you (well, I pretty much told you about the one implementation that I was willing to put together before doing so). I simply don't like to ping people if I can avoid it. Life is already too full of notifications. Besides, doesn’t Github already do the notifications? In general, I do write PRs and leave them open for a while, so the entire community has the chance to have a look and provide feedback (unless I feel it is urgent or I need it for something else). In the future, when you want to be pinged about a specific implementation, please indicate that, and I will be happy to do so. |
Anyway, now we have an escape hatch for "spooky action at distance" in |
Github does notifications, but I wasn't subscribed to the PR. It's okay to leave a comment on an issue saying "I've posted a PR at XXXX", no I'm subscribed to this ticket -- why would I not want to get an email notification about new developments such as a proposed solution? If I didn't want email notifications, I can always submit a ticket and then unsubscribe. New comments to a ticket I'm subscribed to, don't even do anything overly noisy such as create an android system notifications event (from the GitHub App) the way an overt
Well, if you're solving a problem that isn't the one reported here I'd personally say it's a bit clearer to implement that change separately, and then close this one as "Won't Implement", if it is indeed not going to be implemented... |
Thanks for the info on Github notifications. Because github does add a note to the issue thread when a related PR is implemented, my understanding was that some sort of notification is available. But I see now that I was wrong.
That is a fair comment. I will keep that in mind for the future. Note, however, that the 2 problems are not exactly distinct in my understanding and that the escape hatch provided is useful to work around the originally described problem. It might not be exactly how you wanted it, but well, it is what I could do. Finally, please let me know if you prefer the environment variable to be removed. I will also wait for |
The main issue is that it replaces one form of spooky action at a distance with a slightly less distant, but possibly even more spooky, action. As the warning indicates, ignoring all validation for all configuration keys can lead to deeply broken wheels that don't even function as usable wheels. I daresay it would have to be quite an emergency before I'd consider using that. But I don't actually know under what circumstances you think it would make sense to skip validation of the pyproject.toml format (because one does not wish to see validation errors for the format) but does want to attempt to build a wheel using that invalid configuration. It's entirely different, and significantly more reasonable, to want to skip "validate that this wheel doesn't violate PyPI's policy for uploading wheels to https://pypi.org which requires to limit yourself to search tags (i.e. trove classifiers) that have been agreed upon as common enough to be worth sorting on". Why should setuptools be enforcing an extremely arbitrary time-sensitive upload policy for a single hosting site, especially when you aren't using that hosting site yourself?
I offered a very specific targeted suggestion. I suggested that instead of setuptools/setuptools/config/_validate_pyproject/formats.py Lines 156 to 214 in 34f9518
could check for
and rather, the That's not a general idea. That's a pretty concrete suggestion. With specific regard to:
You did no such thing. I mentioned a tangent about --enable-foo / --with-foo, as an explanation for why Linux distros (all of them!) disagree with and reject the notion of build isolation. The usefulness of such things is a broadly applicable topic related to things such as You said you welcome concrete proposals for implementing --with-foo, a topic I don't have concrete proposals for as it would also require coordinating with the PEP 517 build backend interface among numerous other things, and might require redesigning how setuptools processes cmdclass phases as well.
I only saw you state a single reason, which was:
You are of course welcome to that viewpoint! It's your project (half yours?) and you're responsible for maintaining it. I respect your desire to have a simple codebase and avoid adding "too many knobs and levers". While, at the same time, also desiring for my part that communications around what you as a setuptools maintainer are comfortable with, are transparently done, including, sometimes, saying "sorry, I would rather not do this" and closing an issue as "Won't Implement". And also, that if you're going to use that as the reason to reject something, you not say "for the series of reasons" when there is not, in fact, any such (1 is not a series). I'm genuinely startled that you implemented Footnotes
|
@eli-schwartz I appreciate you commenting on the topic, but I think we are starting to go too far. There has been too much miscommunication, and the messages are being lost and misinterpreted on both sides. I do not wish to keep this going. Could you please indicate if you wish |
I would suggest removing it, yes. I wouldn't be using it myself, because it is dangerous. |
Also, since you mentioned "concrete suggestions". I was originally hesitant to write code for my concrete suggestion because:
My original concrete suggestion now exists in code form, so you can see what I mean by example: eli-schwartz/validate-pyproject@f568d59 I would also argue that with that, the existing support for $NO_NETWORK could just be dropped. But I'm not invested in that. Honestly, I would also argue that the entire validate_pyproject logic for doing anything other than checking that it is an array of strings should be outright deleted, because it is better served by pypa/twine#430 and pypa/twine#1166 and has no role to serve in validate_pyproject. |
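For readers following along, here is a minimal sketch of what "only checking that it is an array of strings" could look like. It is purely illustrative: the function name `classifiers_are_well_formed` is hypothetical and is not part of validate-pyproject or setuptools.

```python
def classifiers_are_well_formed(value):
    """Check only the shape of the `classifiers` field.

    Hypothetical helper, not an existing validate-pyproject API: it
    accepts any list whose items are all strings and deliberately does
    not compare the strings against any list of official classifiers.
    """
    return isinstance(value, list) and all(isinstance(item, str) for item in value)
```

Under this approach, whether a given classifier is accepted by an index would be decided at upload time rather than at build time.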
I have two concerns with it:
So yeah, I would very much prefer that it was possible to disable the part of validation that actually causes problems, rather than all of it. |
|
In #4746 I propose to revert the change that introduced
Thank you very much @mgorny, just note that at this stage the PR is only for removing |
Thank you very much @eli-schwartz for having a look on this and providing a contribution. I appreciate the effort you’ve put into proposing a solution and raising important discussion points. To ensure we use our efforts effectively, I’d like to clarify the approaches I would consider moving forward. If I understood the implementation in eli-schwartz/validate-pyproject@f568d59 correctly, it introduces branching, adds one level of nesting and adds another environment variable that affects how classifiers are validated. This is an approach that I have considered before as a result of the earlier discussion in the thread, but decided to not follow. My resistance to follow this approach is what I tried to convey when I previously said Regarding
I understand your perspective, but I believe that ensuring we do not produce wheels that will be rejected later is an important role for The reason why I initially suggested At this point, I don’t think there is a good way to reconcile the request with my personal requirements for the source code. That is why I previously said that something like Footnotes
|
The way I would look at it instead is that both of those requests for disabling bits and pieces are happening because people feel that trove classifiers, specifically, and nothing else, is something that is "not suitable for setuptools to check". It's not increasing requests for disabling bits and pieces at all. It's just people who don't want to validate the approved list of allowed classifiers.
It's all just requests by people who don't want to check trove-classifiers. Some of those requests were by people who did not care to solve the unpredictability issue except for their own specific scenario where the unpredictability rears its head. But at the end of the day, it's still specific to trove-classifiers. The logical conclusion here is that instead of having various different ways to disable trove-classifiers checking via bits and pieces here and there under specific scenarios, there should just be a way to disable trove-classifiers. Anyone using NO_NETWORK is doing it because... they don't want to validate trove-classifiers. So if they just used VALIDATE_PYPROJECT_NO_TROVE_CLASSIFIERS, they... get exactly the behavior they want, right? That's why I said the next logical step is to delete the "old" environment variable. If people worry about it being a regression, just... document it and bump the major version? |
Thank you very much @eli-schwartz. In the interest of complying with your request to be as forthcoming as possible, I maintain that I will not be pursuing such routes, as I commented in #4459 (comment). |
Thanks for clarifying. Minor nit: can you use the "close issue as not planned" toggle for searchability reasons? |
I will leave the PR #4746 open for one week (max) to receive feedback strictly on if it should be merged or not. The aim of that particular PR is to remove I am taking this approach because my understanding is that In the case no feedback is provided by next week, I will use Github search to identify if |
As requested in pypa/setuptools#4459, add a VALIDATE_PYPROJECT_NO_TROVE_CLASSIFIERS environment variable that can be used to disable using trove_classifiers package even if it is available. This can be used when the system features an outdated trove_classifiers, and therefore incorrectly triggers validation error. The change is designed to be absolutely minimal and non-intrusive.
… strings

Trove classifiers, and their officialness, have no effect on a wheel other than determining whether they are allowed to be uploaded to a non-Gentoo website, and enabling the search index of that other site. We don't need this, and we don't need to validate it.

Setuptools will disable validation if both of:
- network downloads failed
- cannot successfully import the `trove_classifiers` module

occurs. If trove-classifiers is installed by coincidence, this breaks builds when it doesn't get updated on an extremely rapid basis and some random package in dev-python/* uses a classifier that was made official just the other day.

We could solve this another way, by making dev-python/setuptools PDEPEND on trove-classifiers, and constantly bump the >= dependency. But this is a pointless hassle. In fact, we're actually doing it, and it's been a pointless hassle. We need to maintain up-to-the-minute minimum bounds on the very latest version, and bump setuptools to a new -rX just to update the minimum version of a package it doesn't even depend on. We need to package new versions of trove-classifiers before *other* Gentoo Devs outside of the python project, can successfully revbump their own packages. We need to coordinate stabilization of trove-classifiers in combination with those other packages. We force people to install a pointless package. We overuse PDEPEND.

Instead, apply a *rejected* upstream patch to add an environment variable that skips this specific validation code block entirely. Upstream doesn't want to maintain code that contains branches, so we will maintain it locally. Since it is Gentoo-specific, the variable is also prefixed with GENTOO_ and is expected to be used solely inside of distribution packaging while not affecting manual usage of setuptools outside of portage.

Bug: pypa/setuptools#4459
Signed-off-by: Eli Schwartz <[email protected]>
…d strings

In the previous commit, a change was patched into setuptools to enable skipping pypi.org specific validations we do not want. Export the environment variable which activates this, whenever the build backend is setuptools.

Bug: pypa/setuptools#4459
Signed-off-by: Eli Schwartz <[email protected]>
setuptools version
from git, also 70.0.0
Python version
Python 3.12
OS
Gentoo Linux
Additional environment information
No response
Description
I tried building a wheel for a package that uses a trove classifier in the very latest version of the trove_classifiers package. Setuptools detected that trove_classifiers was installed, and insisted on validating it.
Looking at the code which handles this, it offers an escape hatch only to "avoid hitting the network" if trove_classifiers is NOT installed:
setuptools/setuptools/config/_validate_pyproject/formats.py
Lines 156 to 214 in 34f9518
Expected behavior
A way to disable the unwanted functionality without first uninstalling other packages. For example, instead of $VALIDATE_PYPROJECT_NO_NETWORK, I would like to see $VALIDATE_PYPROJECT_NO_TROVE_CLASSIFIERS.
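To make the request concrete, here is a rough sketch of the kind of switch being asked for. It is not the actual setuptools or validate-pyproject code: `KNOWN_CLASSIFIERS` is a hypothetical stand-in for the data shipped by the installed trove_classifiers package, and the `environ` parameter exists only so the example can be exercised without touching the real environment.

```python
import os

# Hypothetical stand-in for the classifier set that the installed
# trove_classifiers package would provide (possibly an outdated copy).
KNOWN_CLASSIFIERS = frozenset({
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
})


def trove_classifier(value, known=KNOWN_CLASSIFIERS, environ=None):
    """Return True if `value` should be accepted as a classifier.

    Illustrative only: when the proposed environment variable is set,
    the check is skipped no matter what happens to be installed.
    """
    env = os.environ if environ is None else environ
    if env.get("VALIDATE_PYPROJECT_NO_TROVE_CLASSIFIERS"):
        # Explicit opt-out: accept every classifier string.
        return True
    # Otherwise fall back to the (possibly stale) installed list.
    return value in known
```

The point of the design is that the behavior depends on an explicit on/off setting chosen by the person running the build, rather than on whether an unrelated package is importable.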
How to Reproduce
Output