Skip to content

Conversation

@JoasE
Copy link
Contributor

@JoasE JoasE commented Oct 14, 2025

Cosmos db split transactional batches based on document size and other improvements

Improve performance for non-batching scenario's
Don't use batch for single entry updates
Fixes: #36903

  • I've read the guidelines for contributing and seen the walkthrough
  • I've posted a comment on an issue with a detailed description of how I am planning to contribute and got approval from a member of the team
  • The code builds and tests pass locally (also verified by our automated build checks)
  • Commit messages follow this format:
        Summary of the changes
        - Detail 1
        - Detail 2

        Fixes #bugnumber
  • Tests for the changes have been added (for bug fixes / features)
  • Code follows the same patterns and style as existing code in this repo

JoasE added 2 commits October 14, 2025 14:27
…r improvements

Improve performance for non-batching scenario's
Don't use batch for single entry updates
Fixes: dotnet#36903
// With AutoTransactionBehavior.Always, AddToTransaction will always return true.
if (!AddToTransaction(transaction, updateEntry, stream))
{
yield return transaction;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've currently opted for always sending the transaction with AutoTransactionBehavior.Always, even if we think it is too big. Cosmos db will refuse the request and a DbUpdateException will be thrown. We could also opt for throwing an InvalidOperationException with the SaveChangesAutoTransactionBehaviorAlwaysAtomicity message (with a small change explaining 2MB) if we think the transaction won't be accepted due to it's size. We would need to add an if here to throw that exception

Suggested change
yield return transaction;
if (_currentDbContext.Context.Database.AutoTransactionBehavior == AutoTransactionBehavior.Always)
throw new InvalidOperationException();
yield return transaction;

@JoasE JoasE marked this pull request as ready for review October 14, 2025 14:16
@JoasE JoasE requested a review from a team as a code owner October 14, 2025 14:16
@AndriySvyryd AndriySvyryd requested a review from Copilot October 23, 2025 00:34
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances Cosmos DB transactional batch handling by implementing size-based batch splitting and performance optimizations. The changes address issue #36903 by introducing logic to split batches when they exceed the Cosmos DB 2 MiB request size limit and optimizing non-batching scenarios.

Key Changes:

  • Implements automatic batch splitting based on document size to stay within Cosmos DB's 2 MiB request limit
  • Optimizes single-entry updates by bypassing batch operations when AutoTransactionBehavior is not set to Always
  • Refactors serialization logic to avoid duplication and improve performance

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/EFCore.Specification.Tests/BulkUpdates/NorthwindBulkUpdatesTestBase.cs Corrected assertion to use Assert.Null instead of Assert.Equal(null, ...)
test/EFCore.Cosmos.FunctionalTests/EndToEndCosmosTest.cs Fixed inverted logic for transactional batch configuration
test/EFCore.Cosmos.FunctionalTests/CosmosTransactionalBatchTest.cs Added comprehensive tests for batch splitting, size limits, and single-entry optimization
src/EFCore.Cosmos/Storage/Internal/ICosmosTransactionalBatchWrapper.cs Updated interface to return boolean indicating success and accept streams instead of JToken
src/EFCore.Cosmos/Storage/Internal/ICosmosClientWrapper.cs Updated interface methods to support size checking and renamed batch execution methods for clarity
src/EFCore.Cosmos/Storage/Internal/CosmosTransactionalBatchWrapper.cs Implemented size tracking and batch splitting logic with 2 MiB limit enforcement
src/EFCore.Cosmos/Storage/Internal/CosmosDatabaseWrapper.cs Refactored save logic to handle single-entry optimization and multiple transactions per batch
src/EFCore.Cosmos/Storage/Internal/CosmosClientWrapper.cs Extracted serialization logic to a shared static method and updated method names

@AndriySvyryd AndriySvyryd merged commit 52d11df into dotnet:main Oct 23, 2025
7 checks passed
@AndriySvyryd
Copy link
Member

Thanks for your contribution!

// The tests below will fail if the cosmos db sdk is updated and the serialization logic for transactional batches has changed

[ConditionalTheory, InlineData(true), InlineData(false)]
public virtual async Task SaveChanges_transaction_behaviour_always_single_entity_payload_can_be_exactly_cosmos_limit_and_throws_when_1byte_over(bool oneByteOver)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoasE On a real Cosmos account this test fails consistently for oneByteOver: true, no exceptions are thrown

Copy link
Contributor Author

@JoasE JoasE Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:( That probably means the request body limit for a real cosmos db and the emulator are different

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndriySvyryd In that case I think it's going to be tough to add tests to validate whether internal cosmos sdk serialization logic has changed. We might be ok without that kind of tests tho. Then we only validate whether we don't get an exception if we are at our perceived limit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can create a pr that changes the tests if you want?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:( That means the request body limit for a real cosmos db and the emulator are different

It failing for true means the limit is higher on a real cosmos db tho

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can change this one to be +1024 instead of +1 to give us some wiggle room (even if it defeats the main purpose for this test)

Copy link
Contributor Author

@JoasE JoasE Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndriySvyryd Are you sure the test will succeed on a real cosmos db with that extra?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndriySvyryd We could also consider this: #37028

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, though I haven't yet determined the exact value.
Alternatively, we could set the actual limit to what the real Cosmos uses and mark this test with
[CosmosCondition(CosmosCondition.IsNotEmulator)]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the minimum value for real Cosmos is +7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cosmos 11.0 CUD batching RequestEntityTooLarge errors

3 participants