Improved Scala.js bundle size for Tapir #4461
Conversation
Thanks for the PR. My only worry here would be whether this adds any performance overhead. Reading a lazy val always results in a memory barrier (see https://dotty.epfl.ch/docs/reference/changed-features/lazy-vals-init.html and https://github.com/scala/scala3/blob/fa340188cc0c06574a707df02d6e55562a4bf63e/library/src/scala/runtime/LazyVals.scala#L147 - there's a volatile read), but then the question is whether reading the codec/schema values is on the hot path. Since the schemas/codecs are used only when creating endpoints, under normal circumstances accessing them should happen only once, at program startup. However, there might be use-cases where the endpoints or schemas are generated dynamically, which could hurt performance. But maybe these use-cases need to be fixed anyway ;) Maybe @plokhotnyuk has some experience here? A lower-risk change would be to make only the time-related schemas/codecs lazy. Would that help in your scenario? I think it would also be good to add a general comment to …
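For reference, a lazy val read compiles to roughly the following double-checked shape (a simplified sketch with made-up names, not the exact code `LazyVals` generates):

```scala
object LazyValSketch {
  // A plain val: a single non-volatile field read on access.
  val eager: Int = compute()

  // A lazy val is compiled into roughly this shape: a volatile flag guards
  // initialization, so every read starts with a volatile load (the memory
  // barrier mentioned above), plus a one-time synchronized initialization.
  @volatile private var initialized = false
  private var cached: Int = 0
  def lazyLike: Int = {
    if (!initialized) synchronized {
      if (!initialized) {
        cached = compute()
        initialized = true
      }
    }
    cached
  }

  private def compute(): Int = 42
}
```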
I think there is a definite benefit to allowing the dropping of any unrelated codecs, but if there is a perf hit then the focus could shift to just the most expensive codecs. I agree, though, that this should be a minimal hit, as the codecs shouldn't be accessed via the lazy val often. I can add the comments, sure.
The dropping of unnecessary codecs relates to Scala.js only, while the JVM is the predominant platform (especially on the server side, where perf requirements are higher). Hence my hesitation: I'd rather not introduce any performance regressions.
For long-running server applications, I imagine the impact is minimal; it would be good if someone could verify that, of course. The cost of handling a request is surely far higher than the cost of a lazy val lookup (and, as we suspect, this will be a one-time thing at start-up), even if it is done on every request. This could be special-cased for JS, of course, but that would be a much higher maintenance burden. For …
@j-mie6 I don't think we have the bandwidth to run performance testing right now. Could you maybe create a second PR which would only include the changes scoped to the main offenders, i.e. the …
Very well, I'll see if I can get that done today
From my experience with server-side applications, I wanted to highlight a couple of potential considerations when broadly replacing `val`s with `lazy val`s: …

As a compromise, we could focus the lazy val optimization specifically on values that are initialized using …
This PR simply makes all `implicit val`s within Tapir use `lazy val`, which is a binary-compatible change that impacts the bundle size for Scala.js uses of the library.

The background is that Tapir has a lot of instances for things like `Schema` which are implemented for many types, with a prime offender being any time-related things from `scala-java-time`. These have a huge footprint of 632KB for just time, and then around a 200KB footprint for other things. The JavaScript compilation will remove from the bundle any functions that are unused within a code-base, which reduces this footprint. However, it cannot remove `val`s. By making the implicit instances into `lazy val`s, they are transformed by the compiler into a pair of a function and a value (initialised to `null`). If the instance is unused, the compiler will be able to shake away the function that would generate the instance, leaving just an unshakeable `null` in its place.

On my own site, this provided a substantial bundle size improvement and eliminated the `scala-java-time` dependency entirely.
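As a concrete illustration of the pattern (a self-contained sketch, not Tapir's actual code; `Codec` and `expensiveTimeCodec` are made up here):

```scala
import java.time.Instant

trait Codec[A]

object Codecs {
  // Before: an eager implicit val. Scala.js must keep the field and everything
  // reachable from its initialiser in the bundle, even if no endpoint uses it.
  // implicit val instantCodec: Codec[Instant] = expensiveTimeCodec()

  // After: a lazy val. The instance is now produced by an accessor method,
  // which the Scala.js optimiser can shake away when it is never called,
  // leaving only a field initialised to null.
  implicit lazy val instantCodec: Codec[Instant] = expensiveTimeCodec()

  // Stand-in for an instance whose construction drags in java.time support code.
  private def expensiveTimeCodec(): Codec[Instant] = new Codec[Instant] {}
}
```

The usage side is unchanged: implicit resolution still finds the same instance, with only the lazy initialisation check described earlier added on the JVM.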