qkaiser
Contributor

@qkaiser qkaiser commented Jul 1, 2025

Draft implementation for #1216

It fails on test_archive_success because we cannot instantiate ExtractionConfig with an arbitrary tmp directory right now.

Some unit testing is also required for proper coverage.

@qkaiser qkaiser force-pushed the allow-tmp-access branch from 9b4616a to 0cad9c5 Compare July 1, 2025 12:21
@qkaiser qkaiser linked an issue Jul 1, 2025 that may be closed by this pull request
@qkaiser qkaiser self-assigned this Jul 1, 2025
@qkaiser qkaiser added enhancement New feature or request help wanted Extra attention is needed python Pull requests that update Python code labels Jul 1, 2025
@qkaiser qkaiser marked this pull request as draft July 1, 2025 12:22
"""Set environment variables so all subprocesses and handlers use our temp dir"""
for var in ("TMP", "TMPDIR", "TEMP", "TEMPDIR"):
    os.environ[var] = self.tmp_dir.as_posix()
atexit.register(self._cleanup_tmp_dir)
Contributor

This is problematic for programs using unblob as a library, as these variables remain set in the hosting process's environment. We should set and reset them in a tighter scope.

For example, Processor.process_task looks like a fine candidate: by default it spawns child processes, so it does not poison the caller's environment at all (provided they use process_num > 1).

Another possibility is to do this at the process_file level. That still affects the hosting process, unfortunately, but at least we can restore the variables when the function returns.
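
To illustrate the "set and restore" option, here is a minimal sketch of a context manager that could wrap the body of a function such as process_file. The name scoped_tmp_env and the TMP_VARS constant are hypothetical and not part of unblob; only the variable names come from the hunk above.

```python
import contextlib
import os
from pathlib import Path

# Same variable names as in the hunk under review.
TMP_VARS = ("TMP", "TMPDIR", "TEMP", "TEMPDIR")


@contextlib.contextmanager
def scoped_tmp_env(tmp_dir: Path):
    """Hypothetical helper: point the temp-dir variables at tmp_dir,
    then restore whatever the caller had before."""
    # Remember the previous values (None means "was not set").
    saved = {var: os.environ.get(var) for var in TMP_VARS}
    try:
        for var in TMP_VARS:
            os.environ[var] = tmp_dir.as_posix()
        yield
    finally:
        for var, old in saved.items():
            if old is None:
                os.environ.pop(var, None)
            else:
                os.environ[var] = old
```

The hosting process still sees the modified variables while the block runs, which is the drawback mentioned above. Python's own tempfile module also caches its lookup in tempfile.tempdir after the first call, so code in the same process may not pick up the change.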

Contributor Author

Done. Also implemented tests in test_necessary_resources_can_be_created_in_sandbox

Contributor

I do not see the original comment being resolved in the code: below I can find the env variables being set but not restored, and here atexit.register is still being used.

Have you pushed your changes?

I am also not sure how it would landlock to a /tmp/call-specific-tempdir if used as a library on the second call with another ExtractionConfig.

One solution I can think of is process_file forking in case of landlock, setting the environment variables in the fork, and cleaning up the temporary directory when the fork exits. Since the environment variables would be set in the child, they need no restoring.

(Actually, multiprocessing should be used somehow, as https://docs.python.org/3/library/os.html#os.fork has many problems and is known not to work on macOS.)
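
A rough sketch of that idea, assuming a placeholder worker instead of the real sandboxing and extraction code. The names _child and run_with_private_tmp are hypothetical and not part of unblob; the start_method parameter is only there so the start method can be chosen explicitly.

```python
import multiprocessing
import os
import shutil
import tempfile

TMP_VARS = ("TMP", "TMPDIR", "TEMP", "TEMPDIR")


def _child(tmp_dir, queue):
    # Runs in the child process: these assignments never reach the parent,
    # so nothing has to be restored afterwards.
    for var in TMP_VARS:
        os.environ[var] = tmp_dir
    # Placeholder for the real work (landlock setup, extraction, ...).
    queue.put(os.environ["TMPDIR"])


def run_with_private_tmp(start_method=None):
    """Hypothetical helper: run _child in a separate process with its own
    temp dir and return (TMPDIR as seen by the child, temp dir path)."""
    tmp_dir = tempfile.mkdtemp(prefix="sketch-")
    # None selects the platform's default start method.
    ctx = multiprocessing.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=_child, args=(tmp_dir, queue))
    try:
        proc.start()
        seen = queue.get(timeout=30)  # read before join to avoid blocking
        proc.join()
    finally:
        # The parent owns the directory and removes it once the child is done.
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return seen, tmp_dir
```

With start methods that re-import the main module (spawn, forkserver), the call has to sit under an `if __name__ == "__main__":` guard in a script. How this would interact with landlock and the existing process pool is left open here.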

@qkaiser qkaiser force-pushed the allow-tmp-access branch from 0cad9c5 to d6d9358 Compare July 2, 2025 05:19
@qkaiser qkaiser marked this pull request as ready for review July 2, 2025 05:20
Development

Successfully merging this pull request may close these issues.

landlock sandbox: allow access to /tmp
3 participants