diff --git a/src/current/v26.1/architecture/storage-layer.md b/src/current/v26.1/architecture/storage-layer.md
index 9fe05b3e55c..d70787691b6 100644
--- a/src/current/v26.1/architecture/storage-layer.md
+++ b/src/current/v26.1/architecture/storage-layer.md
@@ -67,6 +67,18 @@ SSTs are an on-disk representation of sorted lists of key-value pairs. Conceptua
 
 SST files are immutable; they are never modified, even during the [compaction process](#compaction).
 
+##### SST compression
+
+Pebble compresses SSTable and blob value data to reduce physical storage use. The default profile, `fastest`, uses MinLZ1 on `amd64` and `arm64` platforms, is optimized for low CPU overhead, and is appropriate for most workloads.
+
+For advanced storage tuning, CockroachDB exposes the `storage.sstable.compression_algorithm` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}). The profile values are ordered by increasing compression effort: `fastest`, `fast`, `balanced`, and `good`. These profiles enable selective use of Zstd1 depending on the block type, LSM level, and compression benefit. Higher-effort profiles use Zstd1 more frequently and can improve compression for some workloads, but can also increase CPU usage for writes, compactions, and reads that decompress data. Most users do not need to tune this setting. Work with [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing this setting in production.
+
+Changing `storage.sstable.compression_algorithm` does not immediately recompress existing SST files. SSTs are immutable, so a new setting applies as Pebble writes new SSTs or rewrites existing SSTs during compaction, ingestion, restore, or other SST-writing work. During a transition, a store can contain SSTs compressed with multiple algorithms.
+
+To evaluate the CPU usage vs. size tradeoff for your particular data, use [`cockroach debug pebble db analyze-data`]({% link {{ page.version.version }}/cockroach-debug-pebble-db-analyze-data.md %}). Review the results with [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing the cluster setting.
+
+You can monitor overall storage compression with the `storage.compression.cr` metric.
+
 ##### LSM levels
 
 The levels of the LSM are organized from L0 to L6. L0 is the top-most level. L6 is the bottom-most level. New data is added into L0 (e.g., using [`INSERT`]({% link {{ page.version.version }}/insert.md %}) or [`IMPORT INTO`]({% link {{ page.version.version }}/import-into.md %})) and then merged down into lower levels over time.
diff --git a/src/current/v26.1/cockroach-debug-pebble-db-analyze-data.md b/src/current/v26.1/cockroach-debug-pebble-db-analyze-data.md
index b7dbc59a9bb..004bc247b29 100644
--- a/src/current/v26.1/cockroach-debug-pebble-db-analyze-data.md
+++ b/src/current/v26.1/cockroach-debug-pebble-db-analyze-data.md
@@ -55,6 +55,17 @@ This data can help Cockroach Labs evaluate compression defaults and can help you
 
 The output CSV file is periodically rewritten while the command is running. Even if the command is interrupted, you can still use the most recently written output.
 
+### Review results with Cockroach Labs
+
+The command groups sampled blocks into buckets and reports compression results for each bucket.
+
+- `CR` is the compression ratio. Higher values mean that compressed data uses less storage.
+- `Comp` is compression throughput. Lower throughput means higher CPU cost when Pebble writes or rewrites SSTs.
+- `Decomp` is decompression throughput. Lower throughput means higher CPU cost when CockroachDB reads compressed data.
+- `Snappy`, `MinLZ1`, `Zstd1`, `Auto1/30`, `Auto1/15`, and `Zstd3` are compression experiments. These names are not a one-to-one list of valid values for [`storage.sstable.compression_algorithm`]({% link {{ page.version.version }}/cluster-settings.md %}); use them to compare the general speed and space tradeoffs of lower-effort, adaptive, and higher-effort compression.
+
+Most users do not need to tune SSTable compression settings. These results are intended to support review with Cockroach Labs rather than direct tuning decisions by end users. To understand whether compression tradeoffs are relevant for a representative store, consult [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing SSTable compression settings in production. For more information about Pebble SSTable compression, see [SST compression]({% link {{ page.version.version }}/architecture/storage-layer.md %}#sst-compression).
+
 ## Examples
 
 ### View help output
@@ -155,4 +166,5 @@ Sampled 4 files, 2.4MB (100.00%)
 ## See also
 
 - [`cockroach` Commands Overview]({% link {{ page.version.version }}/cockroach-commands.md %})
+- [Storage Layer: SST compression]({% link {{ page.version.version }}/architecture/storage-layer.md %}#sst-compression)
 - [Troubleshooting Overview]({% link {{ page.version.version }}/troubleshooting-overview.md %})
diff --git a/src/current/v26.2/architecture/storage-layer.md b/src/current/v26.2/architecture/storage-layer.md
index 914152ea1d2..59a04408276 100644
--- a/src/current/v26.2/architecture/storage-layer.md
+++ b/src/current/v26.2/architecture/storage-layer.md
@@ -67,6 +67,18 @@ SSTs are an on-disk representation of sorted lists of key-value pairs. Conceptua
 
 SST files are immutable; they are never modified, even during the [compaction process](#compaction).
 
+##### SST compression
+
+Pebble compresses SSTable and blob value data to reduce physical storage use. The default profile, `fastest`, uses MinLZ1 on `amd64` and `arm64` platforms, is optimized for low CPU overhead, and is appropriate for most workloads.
+
+For advanced storage tuning, CockroachDB exposes the `storage.sstable.compression_algorithm` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}). The profile values are ordered by increasing compression effort: `fastest`, `fast`, `balanced`, and `good`. These profiles enable selective use of Zstd1 depending on the block type, LSM level, and compression benefit. Higher-effort profiles use Zstd1 more frequently and can improve compression for some workloads, but can also increase CPU usage for writes, compactions, and reads that decompress data. Most users do not need to tune this setting. Work with [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing this setting in production.
+
+Changing `storage.sstable.compression_algorithm` does not immediately recompress existing SST files. SSTs are immutable, so a new setting applies as Pebble writes new SSTs or rewrites existing SSTs during compaction, ingestion, restore, or other SST-writing work. During a transition, a store can contain SSTs compressed with multiple algorithms.
+
+To evaluate the CPU usage vs. size tradeoff for your particular data, use [`cockroach debug pebble db analyze-data`]({% link {{ page.version.version }}/cockroach-debug-pebble-db-analyze-data.md %}). Review the results with [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing the cluster setting.
+
+You can monitor overall storage compression with the `storage.compression.cr` metric.
+
 ##### LSM levels
 
 The levels of the LSM are organized from L0 to L6. L0 is the top-most level. L6 is the bottom-most level. New data is added into L0 (e.g., using [`INSERT`]({% link {{ page.version.version }}/insert.md %}) or [`IMPORT INTO`]({% link {{ page.version.version }}/import-into.md %})) and then merged down into lower levels over time.
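Review note on the "SST compression" hunk: the paragraph about changing `storage.sstable.compression_algorithm` may be easier to follow with a toy model. The sketch below is only an illustration under stated assumptions, not Pebble's implementation: `ToyStore`, `SSTable`, and the file names are hypothetical, and each file is tagged with the profile in effect when it was written, whereas the real profiles choose algorithms per block. It shows that changing the setting leaves existing files untouched and that a store holds a mix until compaction rewrites the older files.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SSTable:
    """Hypothetical immutable file, tagged with the profile used to write it."""
    name: str
    profile: str


@dataclass
class ToyStore:
    """Toy stand-in for a store; not Pebble's actual data structures."""
    profile: str = "fastest"
    sstables: list = field(default_factory=list)

    def write_sst(self, name: str) -> None:
        # New files always pick up the profile that is current at write time.
        self.sstables.append(SSTable(name, self.profile))

    def set_profile(self, profile: str) -> None:
        # Changing the setting does not touch any existing file.
        self.profile = profile

    def compact(self, names: set, output: str) -> None:
        # Compaction drops the input files and writes one new file.
        self.sstables = [s for s in self.sstables if s.name not in names]
        self.write_sst(output)

    def profiles_in_use(self) -> set:
        return {s.profile for s in self.sstables}


store = ToyStore()
store.write_sst("000001.sst")
store.write_sst("000002.sst")

store.set_profile("balanced")
store.write_sst("000003.sst")
print(sorted(store.profiles_in_use()))  # ['balanced', 'fastest']

store.compact({"000001.sst", "000002.sst"}, "000004.sst")
print(sorted(store.profiles_in_use()))  # ['balanced']
```

The frozen dataclass mirrors the statement that SSTs are never modified in place: the only way a file's compression changes in this model is by replacing the file.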
diff --git a/src/current/v26.2/cockroach-debug-pebble-db-analyze-data.md b/src/current/v26.2/cockroach-debug-pebble-db-analyze-data.md
index b7dbc59a9bb..004bc247b29 100644
--- a/src/current/v26.2/cockroach-debug-pebble-db-analyze-data.md
+++ b/src/current/v26.2/cockroach-debug-pebble-db-analyze-data.md
@@ -55,6 +55,17 @@ This data can help Cockroach Labs evaluate compression defaults and can help you
 
 The output CSV file is periodically rewritten while the command is running. Even if the command is interrupted, you can still use the most recently written output.
 
+### Review results with Cockroach Labs
+
+The command groups sampled blocks into buckets and reports compression results for each bucket.
+
+- `CR` is the compression ratio. Higher values mean that compressed data uses less storage.
+- `Comp` is compression throughput. Lower throughput means higher CPU cost when Pebble writes or rewrites SSTs.
+- `Decomp` is decompression throughput. Lower throughput means higher CPU cost when CockroachDB reads compressed data.
+- `Snappy`, `MinLZ1`, `Zstd1`, `Auto1/30`, `Auto1/15`, and `Zstd3` are compression experiments. These names are not a one-to-one list of valid values for [`storage.sstable.compression_algorithm`]({% link {{ page.version.version }}/cluster-settings.md %}); use them to compare the general speed and space tradeoffs of lower-effort, adaptive, and higher-effort compression.
+
+Most users do not need to tune SSTable compression settings. These results are intended to support review with Cockroach Labs rather than direct tuning decisions by end users. To understand whether compression tradeoffs are relevant for a representative store, consult [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing SSTable compression settings in production. For more information about Pebble SSTable compression, see [SST compression]({% link {{ page.version.version }}/architecture/storage-layer.md %}#sst-compression).
+
 ## Examples
 
 ### View help output
@@ -155,4 +166,5 @@ Sampled 4 files, 2.4MB (100.00%)
 ## See also
 
 - [`cockroach` Commands Overview]({% link {{ page.version.version }}/cockroach-commands.md %})
+- [Storage Layer: SST compression]({% link {{ page.version.version }}/architecture/storage-layer.md %}#sst-compression)
 - [Troubleshooting Overview]({% link {{ page.version.version }}/troubleshooting-overview.md %})
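Review note on the "Review results with Cockroach Labs" hunks: readers may want an intuition for how `CR`, `Comp`, and `Decomp` trade off against each other. The sketch below is a rough illustration only. It uses Python's standard `zlib` at two effort levels as a stand-in, because the algorithms named in the patch (Snappy, MinLZ1, Zstd1, Zstd3) are not in the Python standard library. The `measure` helper, the synthetic sample, and the MB/s unit are assumptions for illustration, and the numbers it prints say nothing about Pebble or about real SST data.

```python
import time
import zlib


def measure(data: bytes, level: int) -> dict:
    """Return compression ratio and throughput (MB/s) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    comp_secs = time.perf_counter() - start

    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    decomp_secs = time.perf_counter() - start
    assert restored == data  # round-trip sanity check

    megabytes = len(data) / 1e6
    return {
        # Ratio of uncompressed size to compressed size: higher means smaller output.
        "cr": len(data) / len(compressed),
        # Throughput is measured against the uncompressed size in both directions.
        "comp_mbps": megabytes / max(comp_secs, 1e-9),
        "decomp_mbps": megabytes / max(decomp_secs, 1e-9),
    }


# Synthetic, highly repetitive sample; real SST blocks will behave differently.
sample = b"user:0001|status=active|region=us-east1;" * 50_000

for level in (1, 9):  # low effort vs. high effort, loosely analogous to the profiles
    result = measure(sample, level)
    print(
        f"zlib level {level}: CR={result['cr']:.1f} "
        f"Comp={result['comp_mbps']:.0f} MB/s "
        f"Decomp={result['decomp_mbps']:.0f} MB/s"
    )
```

Timings vary by machine and run, so the output is best read as a relative comparison between the two levels rather than as absolute figures.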