Skip to content

Use layered class loaders rather than "shared" class loader hack for class loader isolation #5155

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

alexarchambault
Copy link
Collaborator

@alexarchambault alexarchambault commented May 19, 2025

This PR makes Mill start its daemon with the help of coursier bootstraps, to isolate some Mill dependencies. These bootstraps are generated so that they load Mill using three class loaders:

  • a first one with org.scala-sbt:compiler-interface and org.scala-sbt:test-interface
  • a second one, with the first as parent, that loads the core-api module of Mill (alongside its dependencies)
  • a third one, loading the rest of the Mill class path

The first class loader is the parent of the second one, and the second one is the parent of the third one.

When loading scala-compiler JARs (in JvmWorkerImpl#scalaCompilerCache) or test frameworks (like in TestModule#bspBuildTargetScalaTestClasses, to discover tests from within the Mill process, but it's also done in other places), Mill uses the first class loader as a parent loader for the class loader that loads scala-compiler or the class path that has tests. That way, the compiler-interface and test-interface classes are shared between Mill's internals and scalac or the test class path.

When loading the build class path in MillBuildBootstrap#processRunClasspath, Mill uses the second class loader, so that the core-api class path is shared between Mill's internals and the user build, but the rest of the Mill class path is hidden from user builds.

This has two benefits, compared to what's done without this PR:

  • avoiding mistakes or issues that can arise with the use of package prefixes to hide or show classes (like done here)
  • make the class loader hierarchy more "standard", consisting solely of URLClassLoaders (and the JVM's app and bootstrap loaders), so that it's easier to inspect by users if they need to

Initially, these development were motivated by suspicious things seen around SystemStreams.ThreadLocalStreams.current involving thread local variables, in #5154, but it turns out the changes in the PR here are not necessary to address these issues.

So this PR only makes things safer or more standard, but it isn't required for any feature as I initially thought. If you've seen anything suspicious with the handling of class loaders, it might be helpful…

@alexarchambault alexarchambault marked this pull request as ready for review May 19, 2025 16:38
Comment on lines 220 to +222
def createClassLoader(
classPath: Iterable[os.Path],
parent: ClassLoader = null,
parent: ClassLoader = Thread.currentThread().getContextClassLoader,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think better to keep this default as is; flipping the default will likely cause a lot of confusion since the code will all compile and run but possibly fail with subtle classloading issues

if (locks.daemonLock.probe()) serverProcess = initServer(daemonDir, locks);
while (locks.daemonLock.probe()) {
if (serverProcess != null && !serverProcess.isAlive()) {
System.err.println("Mill server exited!");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is just debugging logging that should be cleaned up?

sharedLoader = classOf[MillBuildBootstrap].getClassLoader,
sharedPrefixes = Seq("java.", "javax.", "scala.", "mill.api")
)
val hasLayeredClassLoader =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These conditionals are new, and seem to against the goal of making the classloader hierarchy easier to understand. Why do we need them now, if we did not need them before?

@lihaoyi
Copy link
Member

lihaoyi commented May 19, 2025

Is it possible to set up the layered classloaders without using coursier's bootstrap infrastucture? It seems to be straightforward to do it "manually", as we do a similar thing in TestRunnerMain.java, and if it's not difficult it would be nice to have the logic encapsulated here rather than calling out to a third-party library

Overall the high-level goal of this PR seems reasonable, but the details of the PR seem to contradict the notion that this makes the classloader setup cleaner or easier to understand. At first glance, this PR seems to introduce a lot more complexity (including usage of arbitrarily-complex third-party library logic in coursier-bootstrap) as well as more confusion in the Mill codebase (e.g. the if conditionals scattered throughout the codebase). Not sure if that is fundamental or incidental, but if it's incidental perhaps after some cleanup we could demonstrate an improvement in clarity over the status quo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants