-
Couldn't load subscription status.
- Fork 3.3k
Cosmos db split transactional batches based on document size and other improvements #36964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…r improvements Improve performance for non-batching scenario's Don't use batch for single entry updates Fixes: dotnet#36903
| // With AutoTransactionBehavior.Always, AddToTransaction will always return true. | ||
| if (!AddToTransaction(transaction, updateEntry, stream)) | ||
| { | ||
| yield return transaction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've currently opted for always sending the transaction with AutoTransactionBehavior.Always, even if we think it is too big. Cosmos db will refuse the request and a DbUpdateException will be thrown. We could also opt for throwing an InvalidOperationException with the SaveChangesAutoTransactionBehaviorAlwaysAtomicity message (with a small change explaining 2MB) if we think the transaction won't be accepted due to it's size. We would need to add an if here to throw that exception
| yield return transaction; | |
| if (_currentDbContext.Context.Database.AutoTransactionBehavior == AutoTransactionBehavior.Always) | |
| throw new InvalidOperationException(); | |
| yield return transaction; |
src/EFCore.Cosmos/Storage/Internal/CosmosTransactionalBatchWrapper.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances Cosmos DB transactional batch handling by implementing size-based batch splitting and performance optimizations. The changes address issue #36903 by introducing logic to split batches when they exceed the Cosmos DB 2 MiB request size limit and optimizing non-batching scenarios.
Key Changes:
- Implements automatic batch splitting based on document size to stay within Cosmos DB's 2 MiB request limit
- Optimizes single-entry updates by bypassing batch operations when
AutoTransactionBehavioris not set toAlways - Refactors serialization logic to avoid duplication and improve performance
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
test/EFCore.Specification.Tests/BulkUpdates/NorthwindBulkUpdatesTestBase.cs |
Corrected assertion to use Assert.Null instead of Assert.Equal(null, ...) |
test/EFCore.Cosmos.FunctionalTests/EndToEndCosmosTest.cs |
Fixed inverted logic for transactional batch configuration |
test/EFCore.Cosmos.FunctionalTests/CosmosTransactionalBatchTest.cs |
Added comprehensive tests for batch splitting, size limits, and single-entry optimization |
src/EFCore.Cosmos/Storage/Internal/ICosmosTransactionalBatchWrapper.cs |
Updated interface to return boolean indicating success and accept streams instead of JToken |
src/EFCore.Cosmos/Storage/Internal/ICosmosClientWrapper.cs |
Updated interface methods to support size checking and renamed batch execution methods for clarity |
src/EFCore.Cosmos/Storage/Internal/CosmosTransactionalBatchWrapper.cs |
Implemented size tracking and batch splitting logic with 2 MiB limit enforcement |
src/EFCore.Cosmos/Storage/Internal/CosmosDatabaseWrapper.cs |
Refactored save logic to handle single-entry optimization and multiple transactions per batch |
src/EFCore.Cosmos/Storage/Internal/CosmosClientWrapper.cs |
Extracted serialization logic to a shared static method and updated method names |
src/EFCore.Cosmos/Storage/Internal/CosmosTransactionalBatchWrapper.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Andriy Svyryd <[email protected]>
|
Thanks for your contribution! |
| // The tests below will fail if the cosmos db sdk is updated and the serialization logic for transactional batches has changed | ||
|
|
||
| [ConditionalTheory, InlineData(true), InlineData(false)] | ||
| public virtual async Task SaveChanges_transaction_behaviour_always_single_entity_payload_can_be_exactly_cosmos_limit_and_throws_when_1byte_over(bool oneByteOver) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JoasE On a real Cosmos account this test fails consistently for oneByteOver: true, no exceptions are thrown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:( That probably means the request body limit for a real cosmos db and the emulator are different
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndriySvyryd In that case I think it's going to be tough to add tests to validate whether internal cosmos sdk serialization logic has changed. We might be ok without that kind of tests tho. Then we only validate whether we don't get an exception if we are at our perceived limit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can create a pr that changes the tests if you want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:( That means the request body limit for a real cosmos db and the emulator are different
It failing for true means the limit is higher on a real cosmos db tho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can change this one to be +1024 instead of +1 to give us some wiggle room (even if it defeats the main purpose for this test)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndriySvyryd Are you sure the test will succeed on a real cosmos db with that extra?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndriySvyryd We could also consider this: #37028
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does, though I haven't yet determined the exact value.
Alternatively, we could set the actual limit to what the real Cosmos uses and mark this test with
[CosmosCondition(CosmosCondition.IsNotEmulator)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the minimum value for real Cosmos is +7
Cosmos db split transactional batches based on document size and other improvements
Improve performance for non-batching scenario's
Don't use batch for single entry updates
Fixes: #36903