TBS: change default disk usage threshold to 0.8 (#15524)
Lower the TBS default disk usage threshold to 0.8 to reduce risk. Make the log message slightly more readable.
carsonip authored Feb 3, 2025
1 parent 8e48d58 commit 9ccfae6
Showing 5 changed files with 7 additions and 6 deletions.
2 changes: 1 addition & 1 deletion changelogs/9.0.asciidoc
@@ -14,7 +14,7 @@ https://github.com/elastic/apm-server/compare/v\...v9.0.0[View commits]

[float]
==== Breaking Changes
-- Change `sampling.tail.storage_limit` default to `0`. While `0` means unlimited local tail-sampling database size, it now enforces a max 90% disk usage on the disk where the data directory is located. Any tail sampling writes after this threshold will be rejected, similar to what happens when tail-sampling database size exceeds a non-0 storage limit. Setting `sampling.tail.storage_limit` to non-0 maintains the existing behavior which limits the tail-sampling database size to `sampling.tail.storage_limit` and does not have the new disk usage threshold check. {pull}15467[15467]
+- Change `sampling.tail.storage_limit` default to `0`. While `0` means unlimited local tail-sampling database size, it now enforces a max 80% disk usage on the disk where the data directory is located. Any tail sampling writes after this threshold will be rejected, similar to what happens when tail-sampling database size exceeds a non-0 storage limit. Setting `sampling.tail.storage_limit` to non-0 maintains the existing behavior which limits the tail-sampling database size to `sampling.tail.storage_limit` and does not have the new disk usage threshold check. {pull}15467[15467] {pull}15524[15524]

[float]
==== Deprecations
5 changes: 3 additions & 2 deletions changelogs/all-breaking-changes.asciidoc
@@ -16,13 +16,14 @@ The following breaking changes are introduced in APM version 9.0.0:

- Change `sampling.tail.storage_limit` default to `0`.
While `0` means unlimited local tail-sampling database size,
-it now enforces a max 90% disk usage on the disk where the data directory is located.
+it now enforces a max 80% disk usage on the disk where the data directory is located.
Any tail sampling writes after this threshold will be rejected,
similar to what happens when tail-sampling database size exceeds a non-0 storage limit.
Setting `sampling.tail.storage_limit` to non-0 maintains the existing behavior
which limits the tail-sampling database size to `sampling.tail.storage_limit`
and does not have the new disk usage threshold check.
-For more details, see https://github.com/elastic/apm-server/pull/15467[PR #15467]
+For more details, see https://github.com/elastic/apm-server/pull/15467[PR #15467] and
+https://github.com/elastic/apm-server/pull/15524[PR #15524]
// end::90-bc[]

// tag::811-bc[]
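
The rule described in the breaking-change note above can be sketched as a small decision function. This is a minimal illustration only, not the apm-server implementation: `writeAllowed` and its parameters are hypothetical names, and the exact boundary handling (strictly below the limit) is an assumption.

```go
package main

import "fmt"

// writeAllowed is a hypothetical helper that mirrors the changelog wording.
// storageLimit == 0: only the disk usage threshold check applies.
// storageLimit != 0: only the database size limit applies.
// The strict "<" comparison at the boundary is an assumption.
func writeAllowed(storageLimit, dbSize, diskUsed, diskTotal uint64, threshold float64) bool {
	if storageLimit != 0 {
		return dbSize < storageLimit
	}
	return float64(diskUsed) < float64(diskTotal)*threshold
}

func main() {
	const gb = 1 << 30
	// 100 GB disk with 85 GB used and no storage limit configured.
	fmt.Println(writeAllowed(0, 10*gb, 85*gb, 100*gb, 0.8)) // false: above the new 80% threshold
	fmt.Println(writeAllowed(0, 10*gb, 85*gb, 100*gb, 0.9)) // true: below the old 90% threshold
	// A non-zero storage limit skips the disk usage check entirely.
	fmt.Println(writeAllowed(20*gb, 10*gb, 95*gb, 100*gb, 0.8)) // true
}
```

With the same disk state, a write accepted under the old 0.9 default is rejected under 0.8, which is the behavior change this commit documents.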
2 changes: 1 addition & 1 deletion internal/beater/config/config_test.go
@@ -364,7 +364,7 @@ func TestUnpackConfig(t *testing.T) {
IngestRateDecayFactor: 0.25,
StorageLimit: "0",
StorageLimitParsed: 0,
-DiskUsageThreshold: 0.9,
+DiskUsageThreshold: 0.8,
TTL: 30 * time.Minute,
},
},
2 changes: 1 addition & 1 deletion internal/beater/config/sampling.go
@@ -161,7 +161,7 @@ func defaultTailSamplingConfig() TailSamplingConfig {
IngestRateDecayFactor: 0.25,
TTL: 30 * time.Minute,
StorageLimit: "0",
-DiskUsageThreshold: 0.9,
+DiskUsageThreshold: 0.8,
DiscardOnWriteFailure: false,
}
parsed, err := humanize.ParseBytes(cfg.StorageLimit)
2 changes: 1 addition & 1 deletion x-pack/apm-server/sampling/eventstorage/storage_manager.go
@@ -449,7 +449,7 @@ func (sm *StorageManager) NewReadWriter(storageLimit uint64, diskUsageThreshold
return uint64(float64(sm.cachedDiskStat.total.Load()) * diskUsageThreshold)
}
// the total disk space could change in runtime, but it is still useful to print it out in logs.
-sm.logger.Infof("setting disk usage threshold to %.2f of total disk space of %0.1fgb", diskUsageThreshold, float64(sm.cachedDiskStat.total.Load())/gb)
+sm.logger.Infof("setting disk usage threshold to %.0f%% of total disk space of %0.1fgb", diskUsageThreshold*100, float64(sm.cachedDiskStat.total.Load())/gb)
diskThresholdChecker := NewStorageLimitCheckerFunc(sm.diskUsed, diskThreshold)
rw = NewStorageLimitReadWriter(
fmt.Sprintf("disk usage threshold %.2f", diskUsageThreshold),
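
To see what the log format change in `storage_manager.go` does to the rendered message, the two format strings from the hunk can be run side by side with plain `fmt.Sprintf`. This is a standalone sketch: `formatThreshold` is a hypothetical helper, and the 100 GB total is an example value, not taken from the commit.

```go
package main

import "fmt"

// formatThreshold renders the threshold message with the old and the new
// format strings copied from the diff. It is an illustrative helper only.
func formatThreshold(threshold, totalGB float64) (oldMsg, newMsg string) {
	// Old format: the threshold is printed as a fraction with two decimals.
	oldMsg = fmt.Sprintf("setting disk usage threshold to %.2f of total disk space of %0.1fgb", threshold, totalGB)
	// New format: the threshold is scaled by 100 and printed as a percentage.
	newMsg = fmt.Sprintf("setting disk usage threshold to %.0f%% of total disk space of %0.1fgb", threshold*100, totalGB)
	return oldMsg, newMsg
}

func main() {
	oldMsg, newMsg := formatThreshold(0.8, 100)
	fmt.Println(oldMsg) // setting disk usage threshold to 0.80 of total disk space of 100.0gb
	fmt.Println(newMsg) // setting disk usage threshold to 80% of total disk space of 100.0gb
}
```

The `%%` verb emits a literal percent sign, so the new message reads "80% of total disk space" rather than "0.80 of total disk space".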
