-
-
Notifications
You must be signed in to change notification settings - Fork 466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bulk events performance experiments #3307
Comments
|
"Really I think the only way this can be solved is by fetching the current streams version prior to applying events. I know this undermines a lot of the performance of quick append but i don't see a better way to do this Like you said, that 100% defeats the purpose of the quick append. Are you saying this based on observed behavior, or by guessing looking at the code? I think the way that the revisions would be updated on a Store() (which is what a projection does), you're still good to go. If you're using an Inline projection, the projection has to fetch the current version of the doc anyway. If that doc is IRevisioned, you're still good to go. |
I do like the idea of trying to "know" when the existing aggregate would have to be fetched anyway, then use that to get at the current revision and working normally with that. "The mt_quick_append_events is using loops to perform its updates. I am not 100% sure on the impact for postgres specifically but this definitely fires my mssql "cursors are bad" alarms. It should be possible to redesign this query into a single statement. (see next point)" Sorry that fires off your Spidey Sense about a completely different scenario in a completely different database engine, but I kept that because the alternative was going to require fetching the existing stream version in a round trip otherwise -- which 100% defeats the purpose of the whole quick append thing altogether. Making the DB work a little harder (dispute that w/o hard numbers) is much less than the back and forth chattiness. |
Look at how the upsert functions are built with revisioning turned on. If you pass in a 0 version/revision, that function will auto-increment the mt_version column for you. On fetching an IRevisioned document, you'd get the latest value from the database overwritten by Marten upon fetching,. That was my theory for how to make the quick append work end to end even w/o knowing the version upfront |
Saying this based on observed behaviour. Having a requirement of IRevisioned or [Version] on an inline projection is pretty reasonable I agree, unfortunately you cannot rely on the existing projections revision being equal to the stream's version (due to projection updates being skipped when the projection doesnt handle any events in the current changeset) |
Yeah, I get that my experience is only tangentially applicable here. And 100% agree that more round trips is not the solution. The alternative instead would be to replace the loop with a single statement, similar to the one in the linked gist. Would need a lot of careful measurement to determine if its better or not though. |
Another consideration would be to enforce Revisioning and SingleStream projections only when quick append is enabled. This won't make things faster, but it would make optimisations simpler as they wont have to account for other configurations. |
This is the result of a series of tests I have been doing on the performance of bulk operations with events.
The tests:
THe branch I am running these experiments on is master...elexisvenator:marten:bulk-event-experiments
I am testing the following scenarios:
For all of these tests, there are 2 inline projections - 1 that uses the event inserted, and 1 that doesn't. Additionally multi tenancy is on, the stream identity is a string, and quick append is enabled. With this configuration scenarios 3 and 4 are an exact match for my real-world use cases relating to bulk import/update.
From a benchmarking perspective, these tests are not at all scientific. I am using them to try to identify low hanging fruit for performance improvements, but as the obvious fixes are dealt with these tests will need to be turned into proper benchmarks. For this reason, no specific durations are mentioned below.
The findings
mt_quick_append_events
function will not compile if single-tenant mode is enabled. It doesn't affect these tests but it's something to consider.ISingleStreamSlicer
when it is known that the stream is a new stream. This is done with d2f8442 and is a significant performance boost to scenario 3, bringing its performance almost on par with scenario 1.expectedVersion
is provided). This causes the inline projection's revision to be set to 1Potential improvements:
LoadMany
call.mt_quick_append_events
is using loops to perform its updates. I am not 100% sure on the impact for postgres specifically but this definitely fires my mssql "cursors are bad" alarms. It should be possible to redesign this query into a single statement. (see next point)streams.Where(x => Projection.AppliesTo(x.Events.Select(x => x.EventType)))
seems to me to be very excessive in a scenario where there are hundreds to thousands of streams/events. Maybe the work tracker could precalculate this by keeping track of a list of event types as they are appended, that way a precheck could be done to see if a projection applies to any stream in the set at all. Useful especially when there are a lot of projections that are not relevant to what is being bulk loaded.That's my thoughts from a day satisfying my morbid curiosity around performance
The text was updated successfully, but these errors were encountered: