12 changes: 12 additions & 0 deletions src/current/v26.1/architecture/storage-layer.md
@@ -67,6 +67,18 @@ SSTs are an on-disk representation of sorted lists of key-value pairs. Conceptua

SST files are immutable; they are never modified, even during the [compaction process](#compaction).

##### SST compression

Pebble compresses SSTable and blob value data to reduce physical storage use. The default profile, `fastest`, is optimized for low CPU overhead and is appropriate for most workloads.

For advanced storage tuning, CockroachDB exposes the `storage.sstable.compression_algorithm` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}). The profile values are ordered by increasing compression effort: `fastest`, `fast`, `balanced`, and `good`. Higher-effort profiles can improve compression for some workloads, but can also increase CPU usage for writes, compactions, and reads that decompress data. Most users do not need to tune this setting. Work with [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing this setting in production.

Changing `storage.sstable.compression_algorithm` does not immediately recompress existing SST files. SSTs are immutable, so a new setting applies as Pebble writes new SSTs or rewrites existing SSTs during compaction, ingestion, restore, or other SST-writing work. During a transition, a store can contain SSTs compressed with multiple algorithms.
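
The behavior described above can be illustrated with a toy model. This is a minimal sketch, not Pebble's implementation: the `SST` class, the `compact` helper, and the file names are hypothetical and exist only to show why a store can hold files written under different profiles after the setting changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen mirrors SST immutability: a file is never edited in place
class SST:
    name: str
    profile: str  # compression profile in effect when the file was written


def compact(store, inputs, output_name, current_profile):
    """Replace the input SSTs with one new SST written under the current profile."""
    remaining = [s for s in store if s.name not in inputs]
    return remaining + [SST(output_name, current_profile)]


# Hypothetical store: three files written while the setting was `fastest`.
store = [SST("000001", "fastest"), SST("000002", "fastest"), SST("000003", "fastest")]

# The setting changes. No existing file is touched.
setting = "balanced"

# Only files rewritten after the change pick up the new profile.
store = compact(store, {"000001", "000002"}, "000004", setting)

profiles = sorted({s.profile for s in store})
print(profiles)  # ['balanced', 'fastest']
```

In this model, file `000003` keeps the `fastest` profile until some later SST-writing work rewrites it, which is the mixed state the paragraph above describes.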

To evaluate the compression behavior of an existing store or backup, use [`cockroach debug pebble db analyze-data`]({% link {{ page.version.version }}/cockroach-debug-pebble-db-analyze-data.md %}). Compare the compression ratio with the compression and decompression throughput for representative data before changing the cluster setting.
Member

I'd say "To evaluate the CPU usage vs size tradeoff on your particular data, use .."

Contributor Author

updated! based on your other comment about not encouraging users to change this themselves, i also updated the second sentence to clarify that they should review the results with us before changing the cluster setting

To evaluate the CPU usage vs. size tradeoff for your particular data, use [cockroach debug pebble db analyze-data]({% link {{ page.version.version }}/cockroach-debug-pebble-db-analyze-data.md %}). Review the results with Cockroach Labs Support before changing the cluster setting.


You can monitor overall storage compression with the `storage.compression.cr` metric.

##### LSM levels

The levels of the LSM are organized from L0 to L6. L0 is the top-most level. L6 is the bottom-most level. New data is added into L0 (e.g., using [`INSERT`]({% link {{ page.version.version }}/insert.md %}) or [`IMPORT INTO`]({% link {{ page.version.version }}/import-into.md %})) and then merged down into lower levels over time.
12 changes: 12 additions & 0 deletions src/current/v26.1/cockroach-debug-pebble-db-analyze-data.md
@@ -55,6 +55,17 @@ This data can help Cockroach Labs evaluate compression defaults and can help you

The output CSV file is periodically rewritten while the command is running. Even if the command is interrupted, you can still use the most recently written output.

### Interpret results
Member

I don't think we should encourage customers to interpret these results in their current form. We should say that they should just consult us with the data.

Contributor Author

i have revised this section a bit with a new title, and clarified that the output is not really designed for end users, and they should contact Support - hope it's clearer now, please let me know!


The command groups sampled blocks into buckets and reports compression results for each bucket. Use the results to compare algorithms for the data that is actually stored in the analyzed store or backup.

- `CR` is the compression ratio. Higher values mean that compressed data uses less storage.
- `Comp` is compression throughput. Lower throughput means higher CPU cost when Pebble writes or rewrites SSTs.
- `Decomp` is decompression throughput. Lower throughput means higher CPU cost when CockroachDB reads compressed data.
- `Snappy`, `MinLZ1`, `Zstd1`, `Auto1/30`, `Auto1/15`, and `Zstd3` are compression experiments. These names are not a one-to-one list of valid values for [`storage.sstable.compression_algorithm`]({% link {{ page.version.version }}/cluster-settings.md %}); use them to compare the general speed and space tradeoffs of lower-effort, adaptive, and higher-effort compression.
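
The three quantities in the list above can be made concrete with a small measurement. This is an illustrative sketch only: it uses Python's standard `zlib` module at different levels as a stand-in for lower-effort and higher-effort compression, not the algorithms Pebble runs, and the sample data and the `measure` helper are hypothetical. It does not reproduce the command's output format.

```python
import time
import zlib


def measure(data: bytes, level: int) -> dict:
    """Return compression ratio and throughput (MB/s) for one zlib level."""
    t0 = time.perf_counter()
    compressed = zlib.compress(data, level)
    t1 = time.perf_counter()
    restored = zlib.decompress(compressed)
    t2 = time.perf_counter()
    assert restored == data  # round trip must be lossless

    mb = len(data) / 1e6
    return {
        "cr": len(data) / len(compressed),        # higher means less storage used
        "comp_mbps": mb / max(t1 - t0, 1e-9),     # lower means more CPU on writes
        "decomp_mbps": mb / max(t2 - t1, 1e-9),   # lower means more CPU on reads
    }


# Hypothetical, highly repetitive sample; real data will compress differently.
sample = b"user_id=1234,region=us-east1,status=active;" * 20000

results = {level: measure(sample, level) for level in (1, 6, 9)}
for level, r in results.items():
    print(
        f"level={level} CR={r['cr']:.1f} "
        f"Comp={r['comp_mbps']:.0f}MB/s Decomp={r['decomp_mbps']:.0f}MB/s"
    )
```

Timings vary by machine and by data, so the numbers are meaningful only relative to each other on the same sample.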

Most users do not need to tune SSTable compression settings. Use these results to understand whether compression tradeoffs are relevant for a representative store, and consult [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing SSTable compression settings in production. For more information about Pebble SSTable compression, see [SST compression]({% link {{ page.version.version }}/architecture/storage-layer.md %}#sst-compression).

## Examples

### View help output
@@ -155,4 +166,5 @@ Sampled 4 files, 2.4MB (100.00%)
## See also

- [`cockroach` Commands Overview]({% link {{ page.version.version }}/cockroach-commands.md %})
- [Storage Layer: SST compression]({% link {{ page.version.version }}/architecture/storage-layer.md %}#sst-compression)
- [Troubleshooting Overview]({% link {{ page.version.version }}/troubleshooting-overview.md %})
12 changes: 12 additions & 0 deletions src/current/v26.2/architecture/storage-layer.md
@@ -67,6 +67,18 @@ SSTs are an on-disk representation of sorted lists of key-value pairs. Conceptua

SST files are immutable; they are never modified, even during the [compaction process](#compaction).

##### SST compression

Pebble compresses SSTable and blob value data to reduce physical storage use. The default profile, `fastest`, is optimized for low CPU overhead and is appropriate for most workloads.

For advanced storage tuning, CockroachDB exposes the `storage.sstable.compression_algorithm` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}). The profile values are ordered by increasing compression effort: `fastest`, `fast`, `balanced`, and `good`. Higher-effort profiles can improve compression for some workloads, but can also increase CPU usage for writes, compactions, and reads that decompress data. Most users do not need to tune this setting. Work with [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing this setting in production.

Changing `storage.sstable.compression_algorithm` does not immediately recompress existing SST files. SSTs are immutable, so a new setting applies as Pebble writes new SSTs or rewrites existing SSTs during compaction, ingestion, restore, or other SST-writing work. During a transition, a store can contain SSTs compressed with multiple algorithms.

To evaluate the compression behavior of an existing store or backup, use [`cockroach debug pebble db analyze-data`]({% link {{ page.version.version }}/cockroach-debug-pebble-db-analyze-data.md %}). Compare the compression ratio with the compression and decompression throughput for representative data before changing the cluster setting.

You can monitor overall storage compression with the `storage.compression.cr` metric.

##### LSM levels

The levels of the LSM are organized from L0 to L6. L0 is the top-most level. L6 is the bottom-most level. New data is added into L0 (e.g., using [`INSERT`]({% link {{ page.version.version }}/insert.md %}) or [`IMPORT INTO`]({% link {{ page.version.version }}/import-into.md %})) and then merged down into lower levels over time.
12 changes: 12 additions & 0 deletions src/current/v26.2/cockroach-debug-pebble-db-analyze-data.md
@@ -55,6 +55,17 @@ This data can help Cockroach Labs evaluate compression defaults and can help you

The output CSV file is periodically rewritten while the command is running. Even if the command is interrupted, you can still use the most recently written output.

### Interpret results

The command groups sampled blocks into buckets and reports compression results for each bucket. Use the results to compare algorithms for the data that is actually stored in the analyzed store or backup.

- `CR` is the compression ratio. Higher values mean that compressed data uses less storage.
- `Comp` is compression throughput. Lower throughput means higher CPU cost when Pebble writes or rewrites SSTs.
- `Decomp` is decompression throughput. Lower throughput means higher CPU cost when CockroachDB reads compressed data.
- `Snappy`, `MinLZ1`, `Zstd1`, `Auto1/30`, `Auto1/15`, and `Zstd3` are compression experiments. These names are not a one-to-one list of valid values for [`storage.sstable.compression_algorithm`]({% link {{ page.version.version }}/cluster-settings.md %}); use them to compare the general speed and space tradeoffs of lower-effort, adaptive, and higher-effort compression.

Most users do not need to tune SSTable compression settings. Use these results to understand whether compression tradeoffs are relevant for a representative store, and consult [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing SSTable compression settings in production. For more information about Pebble SSTable compression, see [SST compression]({% link {{ page.version.version }}/architecture/storage-layer.md %}#sst-compression).

## Examples

### View help output
@@ -155,4 +166,5 @@ Sampled 4 files, 2.4MB (100.00%)
## See also

- [`cockroach` Commands Overview]({% link {{ page.version.version }}/cockroach-commands.md %})
- [Storage Layer: SST compression]({% link {{ page.version.version }}/architecture/storage-layer.md %}#sst-compression)
- [Troubleshooting Overview]({% link {{ page.version.version }}/troubleshooting-overview.md %})