Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TBS: Replace badger with pebble #15235

Draft
wants to merge 152 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
152 commits
Select commit Hold shift + click to select a range
7478521
WIP
carsonip Dec 16, 2024
29b3a63
Add pebble
carsonip Dec 16, 2024
452c0ca
Use pebble
carsonip Dec 16, 2024
e30ee49
Implement sampling decision
carsonip Dec 16, 2024
3467415
Fix locking, use NoSync
carsonip Dec 16, 2024
886643f
Tune batch flush size, add delete
carsonip Dec 16, 2024
7d2618d
Sharded
carsonip Dec 16, 2024
79e32c0
Increase flush threshold to 5MB
carsonip Dec 16, 2024
75efa89
Merge branch 'main' into tbs-pebble
carsonip Dec 19, 2024
dc5bc9a
Merge branch 'main' into tbs-pebble
carsonip Jan 14, 2025
d2b1080
Replace badger
carsonip Jan 14, 2025
640ee0c
Remove gc loop and drop loop
carsonip Jan 14, 2025
8596317
Fix test
carsonip Jan 14, 2025
3a6f727
Restore storage monitoring
carsonip Jan 14, 2025
fd9abbc
Update processor test
carsonip Jan 14, 2025
896dedb
Explain AlreadyTailSampled failure
carsonip Jan 14, 2025
2c70ad5
Update size estimation
carsonip Jan 14, 2025
cc38dbb
Add FIXME
carsonip Jan 14, 2025
8841d41
Flush by number of pending writes for fair perf comparison
carsonip Jan 14, 2025
adac2a2
Try disable WAL
carsonip Jan 14, 2025
a71fab0
Try in-memory mode
carsonip Jan 14, 2025
15387c8
Revert "Try in-memory mode"
carsonip Jan 14, 2025
d73809d
Revert "Try disable WAL"
carsonip Jan 14, 2025
b294877
Use fork for pebble batch config
carsonip Jan 14, 2025
e2d2cdf
Add db commit threshold bytes
carsonip Jan 14, 2025
60ab9ca
Use BatchOption
carsonip Jan 14, 2025
ef1b422
Add FIXME
carsonip Jan 14, 2025
4b1cbdd
Fix compile error
carsonip Jan 14, 2025
c0e679c
Add logger to pebble
carsonip Jan 14, 2025
c883c35
Sort imports, remove logger.go
carsonip Jan 14, 2025
08c9205
Disable pebble level compression
carsonip Jan 14, 2025
1c89735
Add table bloom filter
carsonip Jan 14, 2025
c35dcb9
Add FIXME
carsonip Jan 14, 2025
ea7b481
Add prefix to sampling decision key to separate from events to improv…
carsonip Jan 15, 2025
5dde3d0
Use a sync.Map for sampling decision to establish baseline perf
carsonip Jan 15, 2025
b8f50f3
Try enabling snappy to trade cpu for reduced disk IO / usage
carsonip Jan 15, 2025
554e93d
Try 16KB block size for better compression
carsonip Jan 15, 2025
e1694a1
Try FormatMajorVersion
carsonip Jan 15, 2025
a1e9010
Revert "Use a sync.Map for sampling decision to establish baseline perf"
carsonip Jan 15, 2025
92b49ec
Use a separate DB for decision
carsonip Jan 20, 2025
2643e50
First attempt to fix test and benchmarks
carsonip Jan 20, 2025
e7ac296
Use configured codec
carsonip Jan 20, 2025
c128e75
Use prefix bloom filter
carsonip Jan 20, 2025
86975b8
Prefix bloom can only be tested when db flushes to disk
carsonip Jan 20, 2025
add0f9e
Add decision db flush
carsonip Jan 20, 2025
f4049af
Set decision db block size to 2KB
carsonip Jan 20, 2025
0c8d662
Change sampling decision flush threshold
carsonip Jan 21, 2025
6ba50f5
Disable decision batch
carsonip Jan 21, 2025
779edce
Disable batch for events
carsonip Jan 21, 2025
01f2276
Remove all batch code
carsonip Jan 21, 2025
b1adee6
Add fixme
carsonip Jan 21, 2025
985951f
Try bypassing sharded rw
carsonip Jan 21, 2025
24d9a9c
Revert "Try bypassing sharded rw"
carsonip Jan 21, 2025
80c6cfe
WIP
carsonip Jan 21, 2025
681d12e
Use partition
carsonip Jan 21, 2025
b7854e1
Implement TTL using partition
carsonip Jan 21, 2025
5236c7c
Compact the DBs
carsonip Jan 21, 2025
44ddc87
Clean up
carsonip Jan 21, 2025
0411dfa
Fix TTL Loop
carsonip Jan 21, 2025
616bc0b
Fix close error
carsonip Jan 21, 2025
e4c62b7
Remove ManagedReadWriter
carsonip Jan 21, 2025
c085876
Fix modulo of negative number
carsonip Jan 21, 2025
3961e81
Rename to prefix read writer
carsonip Jan 21, 2025
9443b73
Split interface
carsonip Jan 21, 2025
f3582cc
Add SplitReadWriter
carsonip Jan 21, 2025
a0556b7
Rename s->sm
carsonip Jan 21, 2025
39fe340
Remove TTL in WriterOps
carsonip Jan 21, 2025
825be6c
Rename
carsonip Jan 21, 2025
a67a315
Remove pointer receiver
carsonip Jan 21, 2025
13b4031
Fix PrefixReadWriter flush
carsonip Jan 21, 2025
740c4f9
make fmt
carsonip Jan 21, 2025
ec1a087
Use upstream pebble, go mod tidy
carsonip Jan 21, 2025
b2bf6b0
make notice
carsonip Jan 21, 2025
1c31a69
Delete storage whitebox test
carsonip Jan 21, 2025
1fef6a7
Rename test
carsonip Jan 21, 2025
19a6e7d
Delete TTL in writerOpts
carsonip Jan 21, 2025
adcfc22
Try bypassing ShardedReadWriter
carsonip Jan 21, 2025
3675eef
Revert "Try bypassing ShardedReadWriter"
carsonip Jan 21, 2025
145d392
Add FIXME
carsonip Jan 21, 2025
cd86c64
Fix ttl loop param
carsonip Jan 21, 2025
2e8e00b
Add FIXME
carsonip Jan 21, 2025
fc833fd
Do not use sharded rw
carsonip Jan 22, 2025
3d1b41b
feat: update to badger v2
kruskall Jan 22, 2025
55e2de4
Reduce memtable size by half
carsonip Jan 23, 2025
ee69d3c
Add FIXME
carsonip Jan 23, 2025
2cc8de4
Add FIXME
carsonip Jan 23, 2025
44cb0f0
Move rw
carsonip Jan 23, 2025
82e102b
Add partitioner
carsonip Jan 23, 2025
9edb4fc
Use partitioner
carsonip Jan 23, 2025
d08369f
Add comments
carsonip Jan 23, 2025
7cf2002
Naming
carsonip Jan 23, 2025
d6a94bd
Add prefix test
carsonip Jan 23, 2025
69fff0d
Remove sharding completely
carsonip Jan 23, 2025
505b8c1
Rename to PartitionReadWriter
carsonip Jan 23, 2025
05a6df4
Add header
carsonip Jan 23, 2025
2c850cf
go mod tidy
carsonip Jan 23, 2025
e646428
Add storage limit handling
carsonip Jan 23, 2025
1b993e1
Stop exposing storage
carsonip Jan 23, 2025
8cf67b5
Remove some badger mentions
carsonip Jan 23, 2025
1e09fa3
Pebble bug
carsonip Jan 23, 2025
9b75264
Flush on sm.Close
carsonip Jan 23, 2025
4c409d3
Remove unused code
carsonip Jan 23, 2025
e99da03
Remove badger mention
carsonip Jan 23, 2025
3abeccf
Remove writerOpts and wrappedRW
carsonip Jan 23, 2025
d39da8a
Add example
carsonip Jan 23, 2025
a41e3c9
Add FIXME
carsonip Jan 23, 2025
e4a0a0f
Clean up comments
carsonip Jan 23, 2025
19bfccb
Add FIXME
carsonip Jan 23, 2025
a1e444c
Remove storage gc interval
carsonip Jan 23, 2025
0b02728
Rename to rotate partitions
carsonip Jan 23, 2025
2fe27f7
Use a stable format
carsonip Jan 23, 2025
917584e
Add log
carsonip Jan 23, 2025
9c0df8c
Improve comment
carsonip Jan 23, 2025
24297af
Improve log
carsonip Jan 23, 2025
80533b3
Add FIXME
carsonip Jan 23, 2025
8c71c8d
No need for merger
carsonip Jan 23, 2025
737f57b
Add FIXME
carsonip Jan 23, 2025
68ed084
Resume from current partition on restart
carsonip Jan 23, 2025
dac6f1a
Use json
carsonip Jan 23, 2025
8400483
Better storage limit reached error message
carsonip Jan 23, 2025
b9725a2
Test StorageLimitReadWriter
carsonip Jan 23, 2025
e411486
Refactor
carsonip Jan 23, 2025
64f0d78
Enable parallel writers
carsonip Jan 23, 2025
d5e10a4
Remove storage limit threshold
carsonip Jan 23, 2025
9ff4008
Change partitioner meta key
carsonip Jan 23, 2025
9285fbc
Improve comments
carsonip Jan 24, 2025
74109a1
Remove StorageDir config
carsonip Jan 24, 2025
9b14a25
Remove errgroup
carsonip Jan 24, 2025
c2d62ed
Change partitioner interface
carsonip Jan 24, 2025
ed25987
Remove writeEntry
carsonip Jan 24, 2025
7074ccf
Add sanity check to total partitions
carsonip Jan 24, 2025
32213d6
Fix partitioner interface
carsonip Jan 24, 2025
25d31ed
Bypass pebble Flush issue for now
carsonip Jan 24, 2025
b990792
Remove bounds when using SeekPrefixGE
carsonip Jan 24, 2025
3bceb43
Remove RW Flush
carsonip Jan 24, 2025
767128f
Use iter.Seq, remove PartitionIterator
carsonip Jan 24, 2025
ce25f01
Remove superfluous err checks
carsonip Jan 24, 2025
7d84395
Add missing err check
carsonip Jan 24, 2025
153db27
Rename lb->b
carsonip Jan 24, 2025
a5fea7b
Add comment to ReadPartitions and WritePartition
carsonip Jan 24, 2025
cbce23d
Fix partitioner interface failing test
carsonip Jan 24, 2025
8f367be
Refactor Size
carsonip Jan 24, 2025
bb8d553
Add comment explaining partition buffer
carsonip Jan 24, 2025
f9a01da
Merge branch 'main' into tbs-pebble-rebase
carsonip Jan 24, 2025
8ba4af0
make fmt update
carsonip Jan 24, 2025
5d97b1a
make update
carsonip Jan 24, 2025
8264339
Workaround pebble Flush hang
carsonip Jan 24, 2025
700ca26
Add missing pebble cleanup in test
carsonip Jan 24, 2025
449da8e
Fix invalid pebble comparer
carsonip Jan 24, 2025
5d1ed19
Change trace ID separator
carsonip Jan 24, 2025
e9fc733
make fmt
carsonip Jan 24, 2025
b83e172
Ensure correctness of DeleteRange
carsonip Jan 25, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
673 changes: 383 additions & 290 deletions NOTICE.txt

Large diffs are not rendered by default.

10 changes: 4 additions & 6 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ go 1.23.0
require (
github.com/KimMachineGun/automemlimit v0.7.0
github.com/cespare/xxhash/v2 v2.3.0
github.com/dgraph-io/badger/v2 v2.2007.4
github.com/cockroachdb/pebble/v2 v2.0.2
github.com/dustin/go-humanize v1.0.1
github.com/elastic/apm-aggregation v1.2.0
github.com/elastic/apm-data v1.16.0
Expand Down Expand Up @@ -60,20 +60,18 @@ require (
require (
github.com/DataDog/zstd v1.5.6 // indirect
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/OneOfOne/xxhash v1.2.8 // indirect
github.com/armon/go-radix v1.0.0 // indirect
github.com/axiomhq/hyperloglog v0.2.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cespare/xxhash v1.1.0 // indirect
github.com/cockroachdb/crlib v0.0.0-20241015224233-894974b3ad94 // indirect
github.com/cockroachdb/errors v1.11.3 // indirect
github.com/cockroachdb/fifo v0.0.0-20240816210425-c5d0cb0b6fc0 // indirect
github.com/cockroachdb/logtags v0.0.0-20230118201751-21c54148d20b // indirect
github.com/cockroachdb/pebble v1.1.2 // indirect
github.com/cockroachdb/redact v1.1.5 // indirect
github.com/cockroachdb/swiss v0.0.0-20240612210725-f4de07ae6964 // indirect
github.com/cockroachdb/tokenbucket v0.0.0-20230807174530-cc333fc44b06 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/dgraph-io/ristretto v0.2.0 // indirect
github.com/dgryski/go-farm v0.0.0-20200201041132-a6ae2369ad13 // indirect
github.com/dgryski/go-metro v0.0.0-20211217172704-adc40b04c140 // indirect
github.com/dlclark/regexp2 v1.8.1 // indirect
github.com/docker/go-connections v0.5.0 // indirect
Expand All @@ -97,7 +95,7 @@ require (
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/golang/snappy v0.0.4 // indirect
github.com/golang/snappy v0.0.5-0.20231225225746-43d5d4cd4e0e // indirect
github.com/gomodule/redigo v1.8.9 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/grpc-ecosystem/go-grpc-middleware v1.3.0 // indirect
Expand Down
65 changes: 16 additions & 49 deletions go.sum

Large diffs are not rendered by default.

2 changes: 0 additions & 2 deletions internal/beater/config/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -362,7 +362,6 @@ func TestUnpackConfig(t *testing.T) {
ESConfig: elasticsearch.DefaultConfig(),
Interval: 1 * time.Minute,
IngestRateDecayFactor: 0.25,
StorageGCInterval: 5 * time.Minute,
carsonip marked this conversation as resolved.
Show resolved Hide resolved
StorageLimit: "3GB",
StorageLimitParsed: 3000000000,
TTL: 30 * time.Minute,
Expand Down Expand Up @@ -494,7 +493,6 @@ func TestUnpackConfig(t *testing.T) {
ESConfig: elasticsearch.DefaultConfig(),
Interval: 2 * time.Minute,
IngestRateDecayFactor: 1.0,
StorageGCInterval: 5 * time.Minute,
StorageLimit: "1GB",
StorageLimitParsed: 1000000000,
TTL: 30 * time.Minute,
Expand Down
2 changes: 0 additions & 2 deletions internal/beater/config/sampling.go
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,6 @@ type TailSamplingConfig struct {
ESConfig *elasticsearch.Config `config:"elasticsearch"`
Interval time.Duration `config:"interval" validate:"min=1s"`
IngestRateDecayFactor float64 `config:"ingest_rate_decay" validate:"min=0, max=1"`
StorageGCInterval time.Duration `config:"storage_gc_interval" validate:"min=1s"`
TTL time.Duration `config:"ttl" validate:"min=1s"`
StorageLimit string `config:"storage_limit"`
StorageLimitParsed uint64
Expand Down Expand Up @@ -151,7 +150,6 @@ func defaultTailSamplingConfig() TailSamplingConfig {
ESConfig: elasticsearch.DefaultConfig(),
Interval: 1 * time.Minute,
IngestRateDecayFactor: 0.25,
StorageGCInterval: 5 * time.Minute,
TTL: 30 * time.Minute,
StorageLimit: "3GB",
DiscardOnWriteFailure: false,
Expand Down
2 changes: 1 addition & 1 deletion systemtest/benchtest/expvar/metrics.go
Original file line number Diff line number Diff line change
Expand Up @@ -167,7 +167,7 @@ func (c *Collector) accumulate(e expvar) {
c.processMetric(HeapAlloc, int64(e.HeapAlloc))
c.processMetric(HeapObjects, int64(e.HeapObjects))
c.processMetric(TBSLsmSize, e.TBSLsmSize)
c.processMetric(TBSVlogSize, e.TBSVlogSize)
c.processMetric(TBSVlogSize, e.TBSVlogSize) // FIXME: update to use new metrics
}

func (c *Collector) processMetric(m Metric, val int64) {
Expand Down
54 changes: 19 additions & 35 deletions x-pack/apm-server/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -40,12 +40,9 @@ var (
// will hopefully disappear in the future, when agents no longer send unsampled transactions.
samplingMonitoringRegistry = monitoring.Default.GetRegistry("apm-server.sampling")

// badgerDB holds the badger database to use when tail-based sampling is configured.
badgerMu sync.Mutex
badgerDB *eventstorage.StorageManager

storageMu sync.Mutex
storage *eventstorage.ManagedReadWriter
// db holds the database to use when tail-based sampling is configured.
dbMu sync.Mutex
db *eventstorage.StorageManager

// samplerUUID is a UUID used to identify sampled trace ID documents
// published by this process.
Expand Down Expand Up @@ -117,11 +114,10 @@ func newTailSamplingProcessor(args beater.ServerParams) (*sampling.Processor, er
}

storageDir := paths.Resolve(paths.Data, tailSamplingStorageDir)
badgerDB, err = getBadgerDB(storageDir)
db, err := getDB(storageDir)
if err != nil {
return nil, fmt.Errorf("failed to get Badger database: %w", err)
return nil, fmt.Errorf("failed to get TBS database: %w", err)
}
readWriter := getStorage(badgerDB)

policies := make([]sampling.Policy, len(tailSamplingConfig.Policies))
for i, in := range tailSamplingConfig.Policies {
Expand Down Expand Up @@ -155,37 +151,25 @@ func newTailSamplingProcessor(args beater.ServerParams) (*sampling.Processor, er
UUID: samplerUUID.String(),
},
StorageConfig: sampling.StorageConfig{
DB: badgerDB,
Storage: readWriter,
StorageDir: storageDir,
StorageGCInterval: tailSamplingConfig.StorageGCInterval,
DB: db,
StorageLimit: tailSamplingConfig.StorageLimitParsed,
TTL: tailSamplingConfig.TTL,
DiscardOnWriteFailure: tailSamplingConfig.DiscardOnWriteFailure,
},
})
}

func getBadgerDB(storageDir string) (*eventstorage.StorageManager, error) {
badgerMu.Lock()
defer badgerMu.Unlock()
if badgerDB == nil {
func getDB(storageDir string) (*eventstorage.StorageManager, error) {
dbMu.Lock()
defer dbMu.Unlock()
if db == nil {
sm, err := eventstorage.NewStorageManager(storageDir)
if err != nil {
return nil, err
}
badgerDB = sm
}
return badgerDB, nil
}

func getStorage(sm *eventstorage.StorageManager) *eventstorage.ManagedReadWriter {
storageMu.Lock()
defer storageMu.Unlock()
if storage == nil {
storage = sm.NewReadWriter()
db = sm
}
return storage
return db, nil
}

// runServerWithProcessors runs the APM Server and the given list of processors.
Expand Down Expand Up @@ -249,19 +233,19 @@ func wrapServer(args beater.ServerParams, runServer beater.RunServerFunc) (beate
return args, wrappedRunServer, nil
}

// closeBadger is called at process exit time to close the badger.DB opened
// closeDB is called at process exit time to close the StorageManager opened
// by the tail-based sampling processor constructor, if any. This is never
// called concurrently with opening badger.DB/accessing the badgerDB global,
// so it does not need to hold badgerMu.
func closeBadger() error {
if badgerDB != nil {
return badgerDB.Close()
// called concurrently with opening DB/accessing the db global,
// so it does not need to hold dbMu.
func closeDB() error {
if db != nil {
return db.Close()
}
return nil
}

func cleanup() error {
return closeBadger()
return closeDB()
}

func Main() error {
Expand Down
2 changes: 1 addition & 1 deletion x-pack/apm-server/main_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ func TestMonitoring(t *testing.T) {
home := t.TempDir()
err := paths.InitPaths(&paths.Path{Home: home})
require.NoError(t, err)
defer closeBadger() // close badger.DB so data dir can be deleted on Windows
defer closeDB() // close DB so data dir can be deleted on Windows

cfg := config.DefaultConfig()
cfg.Sampling.Tail.Enabled = true
Expand Down
27 changes: 5 additions & 22 deletions x-pack/apm-server/sampling/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -95,24 +95,16 @@ type DataStreamConfig struct {

// StorageConfig holds Processor configuration related to event storage.
type StorageConfig struct {
// DB holds the badger database in which event storage will be maintained.
// DB holds the StorageManager in which event storage will be maintained.
//
// DB will not be closed when the processor is closed.
DB *eventstorage.StorageManager

// Storage holds the read writers which provide sharded, locked access to storage.
//
// Storage lives outside processor lifecycle and will not be closed when processor
// is closed
Storage rw

// StorageDir holds the directory in which event storage will be maintained.
StorageDir string

// StorageGCInterval holds the amount of time between storage garbage collections.
StorageGCInterval time.Duration
// Storage overrides the default DB storage RW.
// For testing only.
Storage eventstorage.RW

// StorageLimit for the badger database, in bytes.
// StorageLimit for the TBS database, in bytes.
StorageLimit uint64

// TTL holds the amount of time before events and sampling decisions
Expand Down Expand Up @@ -242,15 +234,6 @@ func (config StorageConfig) validate() error {
if config.DB == nil {
return errors.New("DB unspecified")
}
if config.Storage == nil {
return errors.New("Storage unspecified")
}
if config.StorageDir == "" {
return errors.New("StorageDir unspecified")
}
if config.StorageGCInterval <= 0 {
return errors.New("StorageGCInterval unspecified or negative")
}
if config.TTL <= 0 {
return errors.New("TTL unspecified or negative")
}
Expand Down
9 changes: 0 additions & 9 deletions x-pack/apm-server/sampling/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -73,15 +73,6 @@ func TestNewProcessorConfigInvalid(t *testing.T) {
assertInvalidConfigError("invalid storage config: DB unspecified")
config.DB = &eventstorage.StorageManager{}

assertInvalidConfigError("invalid storage config: Storage unspecified")
config.Storage = &eventstorage.ManagedReadWriter{}

assertInvalidConfigError("invalid storage config: StorageDir unspecified")
config.StorageDir = "tbs"

assertInvalidConfigError("invalid storage config: StorageGCInterval unspecified or negative")
config.StorageGCInterval = 1

assertInvalidConfigError("invalid storage config: TTL unspecified or negative")
config.TTL = 1
}
45 changes: 0 additions & 45 deletions x-pack/apm-server/sampling/eventstorage/badger.go

This file was deleted.

50 changes: 0 additions & 50 deletions x-pack/apm-server/sampling/eventstorage/logger.go

This file was deleted.

Loading
Loading