Skip to content

Conversation

@nmk70
Copy link
Contributor

@nmk70 nmk70 commented Nov 17, 2025

Summary:
Enable target_file_size_is_upper_bound option by default to enforce stricter file size limits when deciding whether to cut a compaction output file. This prevents output SST files from exceeding target_file_size_base. When false, SST files might exceed the target_file_size_base.

Changes:

  • Enable target_file_size_is_upper_bound by default
  • Add Java bindings for target_file_size_is_upper_bound option
  • Bug Fix: Prevent premature file cutting when applying tail size estimation, to ensure output SST files contain meaningful data. This ensures RocksDB does not create empty SST files triggered by the tail size estimation.

Test Plan:
Existing tests.

Summary:
This change the default value of target_file_size_is_upper_bound from false to true.
This option enforces that SST files strictly respect the `target_file_size_base` limit
as an upper bound (hard limit) rather than a soft target.

Test Plan:
Existing tests. Updated multiple tests to explicitly disable the option where the old behavior
is needed for testing specific behavior

Reviewers:

Subscribers:

Tasks:

Tags:
@meta-cla meta-cla bot added the CLA Signed label Nov 17, 2025
`num_data_blocks` is only written by the emit thread, never by worker threads, and read by emit thread when deciding when to cut an output file during compaction
@meta-codesync
Copy link

meta-codesync bot commented Nov 19, 2025

@nmk70 has imported this pull request. If you are a Meta employee, you can view this in D87391058.

@nmk70 nmk70 marked this pull request as ready for review November 19, 2025 00:33
@nmk70 nmk70 requested review from cbi42 and pdillinger November 19, 2025 00:33
@nmk70
Copy link
Contributor Author

nmk70 commented Nov 19, 2025

Crash tests are failing due to an assertion failure in BlockBasedTableBuilder::Finish that checks the estimated file size is greater than the actual file size. This assertion exists to validate target_file_size_is_upper_bound option strictly enforces the target_file_size limit for compaction files. Due to this, some crash tests fail because the estimated file size is underestimated. I plan to improve the accuracy of the tail size estimate in a follow-up PR, as this issue does not affect correctness.

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting for more production validation first. Also, we need to keep the crash test in a relatively clean state best we can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants