Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix spyx_tmp() cleanup. #39201

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

tornaria
Copy link
Contributor

For some reason, a new behaviour of python 3.13 [0] causes the
TemporaryDirectory() in sage.misc.temporary_file.spyx_tmp()
to be deleted on child exit, which causes trouble with parallel
doctesting [1].

We rewrite spyx_tmp() using tmp_dir(), which doesn't have this
problem, see [2].

[0] python/cpython#114279
[1] #39188 (comment)
[2] #39188 (comment)

In addition, use sage_ prefix for the main temporary directory.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.

Copy link

github-actions bot commented Dec 24, 2024

Documentation preview for this PR (built with commit 9669a86; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

@orlitzky
Copy link
Contributor

I think this works because atexit.register() is called for TMP_DIR_FILENAME_BASE when the module is imported (in the parent test runner process) rather than whenever _spyx_tmp is needed (in some child).

It should be fine but can you please leave a comment explaining why tmp_dir() was used here? Otherwise I will forget and try to replace it in two years.

For some reason, a new behaviour of python 3.13 [0] causes the
`TemporaryDirectory()` in `sage.misc.temporary_file.spyx_tmp()`
to be deleted on child exit, which causes trouble with parallel
doctesting [1].

We rewrite `spyx_tmp()` using `tmp_dir()`, which doesn't have this
problem, see [2].

[0] python/cpython#114279
[1] sagemath#39188 (comment)
[2] sagemath#39188 (comment)
@tornaria tornaria force-pushed the python-3.13-spyx_tmp branch from 1e5947a to 9669a86 Compare December 24, 2024 23:14
@tornaria
Copy link
Contributor Author

I think this works because atexit.register() is called for TMP_DIR_FILENAME_BASE when the module is imported (in the parent test runner process) rather than whenever _spyx_tmp is needed (in some child).

I don't think so:

  • In fact spyx_tmp() is called first from the main process (before forking)
  • Adding a long delay in the atexit handler shows that the directory is deleted way before the atexit handler calls d.cleanup().

The only reason you got away with this is because on python 3.12 and earlier python objects are not finalized when a child exits, skipping the side effect of the TemporaryDirectory.

It should be fine but can you please leave a comment explaining why tmp_dir() was used here? Otherwise I will forget and try to replace it in two years.

Ok, done.

@user202729
Copy link
Contributor

While we're at it,

    Create and return a temporary directory in
    ``$HOME/.sage/temp/hostname/pid/``

This docstring is very wrong. Looks like the implementation has changed long ago. (but the # random makes the change goes undetected)

atexit.register(lambda: d.cleanup())
# We don't use `tempfile.TemporaryDirectory()` here because it
# will be cleaned up on child exit (e.g. for parallel testing)
# For some reason this doesn't affect the `TemporaryDirectory`
Copy link
Contributor

@antonio-rojas antonio-rojas Dec 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using tempfile.TemporaryDirectory() works just fine for me, so this comment may not be completely correct (however, it is not cleaned up on exit of the parent process)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm.. so you mean TemporaryDirectory() is not cleaning up on program exit in general, or only in sage.misc.temporary_file.spyx_tmp()? What about sage.misc.temporary_file.TMP_DIR_FILENAME_BASE?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only on spyx_tmp(). The TMP_DIR_FILENAME_BASE one is cleaned up correctly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, there seems to be something fragile about using TemporaryDirectory() on a local variable that is meant to survive its function. There are no clear rules about when this object will be destroyed (apparently the lambda: d.cleanup() grabs a reference that stays in the atexit hook, but I'm not sure if the lambda takes a strong reference).

I guess the "right" way to acomplish what was intended here might be to store d in the global variable, and not just d.name.

@user202729
Copy link
Contributor

user202729 commented Dec 26, 2024

I… still don't understand why this patch may work.

  • in python3.13, by default, atexit registered after fork will be called, atexit registered before fork will not be called.
  • But we're using restore_atexit, so the atexit registered to delete TMP_DIR_FILENAME_BASE should also be called (?)
  • so why does it make a difference?
  • besides, isn't it desirable for each child process to use a separate temporary directory anyway? (granted, before Python 3.13, if each child process use a separate temporary directory then it will be impossible to clean them up. But in Python 3.13 it's possible)

Though it sound like a good idea to put all the temporary stuff into the same place anyway.

@antonio-rojas
Copy link
Contributor

  • in python3.13, by default, atexit registered after fork will be called, atexit registered before fork will not be called.
  • But we're using restore_atexit, so the atexit registered to delete TMP_DIR_FILENAME_BASE should also be called (?)

No, because restore_atexit is called with run=True. From the docs:

    - ``clear`` -- boolean (default: equal to ``run``); if ``True``, clear
      already registered atexit handlers upon entering the context

@tornaria
Copy link
Contributor Author

I… still don't understand why this patch may work.

* in python3.13, by default, atexit registered **after fork** will be called, atexit registered **before fork** will not be called.

* But we're using `restore_atexit`, so the atexit registered to delete `TMP_DIR_FILENAME_BASE` should also be called (?)

* so why does it make a difference?

IIUC, restore_atexit is a custom mechanism to make sure atexit works for children (as in: atexit hooks will be called on child exit). It seems with python 3.13 it might not be necessary anymore.

But my problem is not the atexit hook removing the directory. It's the weakref.finalize for the TemporaryDirectory that will remove the directory, and this is something that restore_atexit doesn't do but python 3.13 does (call the weakref.finalize stuff when a child exits).

Arguably, a bug in TemporaryDirectory: if the object was created in the parent, the side effect (removing the directory) should happen on parent exit, but not on child exit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants