Memory corruption during pdm sync #2940
Comments
I've reproduced this a few times in ~1000 attempts. I have reproduced it on a couple of different versions of Python, ranging from 3.10 to 3.12, both inside a container derived from the official Python Docker image and on my host system, an Ubuntu 22.04 machine running Python 3.12 from the deadsnakes PPA. Reproduced by running the following in a project that has exhibited the issue:
This eventually produced:
The project is proprietary, and points at our internal PyPI repository, hosted on a GitLab package registry. Working on minimizing the repro example to see if I can do it without those factors.
Great reproduction, @lambda. We have experienced a very similar symptom.
I am attempting @lambda's brute-force approach at local recreation and will update if I can get it. Edit - no luck in reproduction. I'll leave it running over lunch.
Ah, nice to know that we're not the only ones hitting this. Seeing this new comment prompted me to try again, and I just reproduced it again, this time with no reference to internal packages or our internal package registry. Here is my most recent
pdm.lock contents
And here's the result; amazingly, I hit it on the first iteration of this test, but it frequently takes hundreds or thousands of runs before I hit it.
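A brute-force loop of this kind can be sketched in Python as follows. This is only an illustration, not the command used in this thread: the helper name `run_until_crash` and the default of 1000 attempts are assumptions, and it relies on the POSIX convention that `subprocess` reports a child killed by signal N with return code -N.

```python
import subprocess
import sys


def run_until_crash(cmd, max_attempts=1000):
    """Run cmd repeatedly (hypothetical helper, not part of PDM).

    Returns (attempt, returncode) for the first run that was killed by a
    signal, or None if no run crashed within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        # On POSIX, a negative return code -N means "killed by signal N",
        # which is how an abort or segfault in the child shows up here.
        if result.returncode < 0:
            sys.stderr.write(result.stderr)
            return attempt, result.returncode
    return None


# Possible usage (assumes pdm is on PATH and the current directory is the
# affected project; any cleanup between runs is left out of this sketch):
# print(run_until_crash(["pdm", "sync", "-v"]))
```

Ordinary non-zero exits (for example a resolution error) are deliberately ignored, so the loop only stops on a signal-level crash.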
Aye, I had no luck on reproduction, but that's a good job, @lambda. So establishing
This is such a low-level error, and PDM is apparently pure Python, so somehow it's triggering something quite low down (seemingly in Python itself?). That takes it a bit out of my skillset to debug more deeply.
@lambda I cannot even install your project because it fails with pyyaml :/ Apparently there is a dependency issue between pyyaml and cython. I will try python 3.10 instead of 3.12 which has wheels for pyyaml…
So here are some further thoughts while I try to reproduce:
Further things that would be helpful:
If the segfault is still triggered afterwards, we are already in a better position… Although if it is memory corruption, the backtrace is most likely useless. But maybe it is always at the same location and gives some hints?
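One way to get at least a Python-level backtrace is the standard library's `faulthandler`, which can be enabled for a child process through the `PYTHONFAULTHANDLER` environment variable. The sketch below is not necessarily what was suggested above; the helper name `run_with_faulthandler` is hypothetical, and it assumes the command being run is executed by a Python interpreter that inherits the environment.

```python
import os
import subprocess


def run_with_faulthandler(cmd):
    """Run cmd with Python's faulthandler enabled (hypothetical helper).

    With PYTHONFAULTHANDLER set, a Python child process dumps the Python
    traceback to stderr when it receives a fatal signal such as SIGSEGV
    or SIGABRT, before the process dies.
    """
    env = dict(os.environ, PYTHONFAULTHANDLER="1")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Possible usage (assumes pdm is on PATH):
# result = run_with_faulthandler(["pdm", "sync", "-v"])
# if result.returncode < 0:
#     print(result.stderr)
```

As noted above, if the heap is already corrupted, the traceback may point at wherever the damage was detected rather than at its cause, so it is a hint at best.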
@lambda @mikecowie-seequent, our team is experiencing this issue as well. Were you guys able to figure out the cause?
@FaitAccompli, no, sadly. It's something we've been tolerating as a contributor to CI flakiness. It's rare enough to not be a priority... I might well be imagining it, but it feels like I might have seen it less in the past few weeks?
Yeah, we weren't able to reproduce it reliably enough to track it down; it's still happening, but so infrequently that reproducing it and getting better debugging data is really elusive.
So ... just to update, we stopped tolerating this and moved to uv ... we haven't looked back.
Make sure you run commands with the -v flag before pasting the output.
Steps to reproduce
Twice in the past 24 hours, I have seen PDM exhibit a memory corruption related crash in my CI jobs.
I do not yet know how to reproduce this issue reliably. Re-running the same job a second time can work successfully; this problem is not deterministic. Filing it in case there might be something suspicious that you can think of, and to provide a place to keep track of further instances to collect more information.
In a Python 3.11 CI job yesterday:
In a Python 3.12 CI job today (for a different project):
Actual behavior
PDM sometimes dumps core due to memory corruption issues.
Expected behavior
PDM doesn't crash and installs the files.
Environment Information
Here is how PDM is installed and set up in these CI jobs: