Going faster on the JVM #51
A queue/stack is needed, since an interceptor can modify the execution queue to provide dynamic behaviour; e.g. a routing interceptor is based on the ability to modify the execution queue.
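For illustration, a routing interceptor's :enter might swap the remaining execution queue for the interceptors of the matched route. A rough sketch; the :queue key and the match-route helper are assumptions for illustration, not sieppari's documented API:

```clojure
;; Hypothetical router lookup, stubbed for illustration: returns the
;; interceptor chain registered for the request's route.
(defn match-route [request]
  {:interceptors []})

;; Rough sketch of a routing interceptor: on :enter it replaces the
;; remaining execution queue with the matched route's interceptors.
(def route-interceptor
  {:enter (fn [ctx]
            (let [{:keys [interceptors]} (match-route (:request ctx))]
              (assoc ctx :queue
                     (into clojure.lang.PersistentQueue/EMPTY interceptors))))})
```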
@zerg000000 since the queue/interceptors keys are known at compile time, couldn't the interceptor chains be precompiled just as the router is precompiled? The only case where you'd need a queue is if you injected it programmatically at run time. How common is that pattern?
Yes, I read that implementation before posting my reply.
Doing the composition sounds like a good idea to improve performance. Maybe it could be decoupled from the asynchronous implementation by providing a … As for the async aspect, reactive streams compliance could be something worth looking into.
I looked into sieppari's performance on the JVM and ran some profiling to get a sense of what exactly needs to be optimized to get on par with middleware. The measurements and results can be found here.
Generally, CPU utilization can be broken down into:
With a native queue and stack, and by eliminating all keyword access (very ugly code), I was able to get a 2-3x speedup. The remaining CPU went to the queue and stack methods, MethodImplCache lookup, Context instantiation (an immutable record) and try. That is still very far from just composing 10 functions. To get that performance we need a "compiled" model which, among other things, gets rid of the queue and stack and perhaps creates a mutable context for each execution.
If I understand the execution model correctly (sans the reified queue), the execution flow for three interceptors would look something like:
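Something along these lines, happy path only (ignoring :error and async), treating the handler as a plain ctx -> ctx function:

```clojure
(defn execute-three
  "Unrolled flow for three interceptors around a handler: enters run
   in declaration order, leaves run in reverse, and no queue or stack
   exists at run time. Assumes every interceptor has :enter and :leave."
  [{enter-1 :enter leave-1 :leave}
   {enter-2 :enter leave-2 :leave}
   {enter-3 :enter leave-3 :leave}
   handler ctx]
  (-> ctx
      enter-1 enter-2 enter-3
      handler
      leave-3 leave-2 leave-1))

(comment
  (execute-three
    {:enter #(update % :log conj :in1) :leave #(update % :log conj :out1)}
    {:enter #(update % :log conj :in2) :leave #(update % :log conj :out2)}
    {:enter #(update % :log conj :in3) :leave #(update % :log conj :out3)}
    (fn [ctx] (assoc ctx :response :ok))
    {:log []})
  ;; => {:log [:in1 :in2 :in3 :out3 :out2 :out1], :response :ok}
  )
```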
This can be implemented in two different ways, besides the queue/stack implementation:
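For example, one option would be to compose the chain into a single nested function at build time. A rough sketch, happy path only, with no :error or async handling:

```clojure
(defn compile-chain
  "Reduces a vector of interceptors and a terminal handler into one
   ctx -> ctx function, so no queue or stack exists at run time."
  [interceptors handler]
  (reduce (fn [next-step {:keys [enter leave]
                          :or   {enter identity leave identity}}]
            (fn [ctx] (leave (next-step (enter ctx)))))
          handler
          (reverse interceptors)))

(comment
  (def chain
    (compile-chain [{:enter #(assoc % :a 1)}
                    {:enter #(assoc % :b 2) :leave #(assoc % :c 3)}]
                   (fn [ctx] (assoc ctx :response :ok))))
  (chain {:request {}})
  ;; => {:request {}, :a 1, :b 2, :response :ok, :c 3}
  )
```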
Regarding async, #9 suggests it might be handled incorrectly at the moment. An option to consider is choosing a unified execution model and doing everything in it. Or should it be determined by the executor?
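For instance, a unified model could lift every step into a CompletionStage so the runner only ever deals with one type. A hypothetical sketch, not sieppari's current implementation:

```clojure
(import '(java.util.concurrent CompletableFuture)
        '(java.util.function Function))

(defn ->future
  "Lifts a plain value into a completed CompletableFuture; passes
   futures through untouched, so sync and async steps look alike."
  ^CompletableFuture [x]
  (if (instance? CompletableFuture x)
    x
    (CompletableFuture/completedFuture x)))

(defn then
  "Chains f onto cf; f may return a plain context or a future."
  [^CompletableFuture cf f]
  (.thenCompose cf (reify Function
                     (apply [_ ctx] (->future (f ctx))))))

(comment
  @(-> (->future {:request {:x 1}})
       (then (fn [ctx] (assoc ctx :a 1)))                  ; sync step
       (then (fn [ctx] (CompletableFuture/completedFuture  ; async step
                         (assoc ctx :response :ok))))))
```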
We should also figure out which parts of the context belong to the execution environment and can be mutable, and which parts are data. If we forgo the option of exposing the runtime environment (queue and stack), we can go much faster.
There could be a hierarchy of Runner -> ExecutionContext -> Context.
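Something along these lines, with all names hypothetical: the Runner owns the compiled chain, the ExecutionContext holds the mutable per-execution bookkeeping, and the Context stays an immutable value handed to interceptors:

```clojure
;; Context: pure, immutable data that interceptors see.
(defrecord Context [request response])

;; ExecutionContext: per-execution mutable machinery (position in the
;; chain of compiled steps), never exposed to interceptors.
(defprotocol IExecution
  (step! [this ctx] "Apply the next compiled step to ctx, or return ctx when done."))

(deftype ExecutionContext [^:unsynchronized-mutable idx steps]
  IExecution
  (step! [_ ctx]
    (if (< idx (count steps))
      (let [f (nth steps idx)]
        (set! idx (inc idx))
        (f ctx))
      ctx)))

;; Runner: owns the compiled steps and drives a full execution.
(defprotocol Runner
  (run [this request] "Execute the whole chain for one request."))
```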
There are a lot of considerations and degrees of freedom here that can lead to widely different solutions, and I'd like to open them up for discussion and hopefully experiment with a few of them.
I'd love to hear your thoughts and to take a swing at it myself.