Skip to content

Conversation

@RaduBerinde
Copy link
Member

The way we maintain the metrics that are used for disk usage
calculations are very messy. They are updated incrementally with
special code paths which can be fragile. And they are not consistent
with sibling metrics (for example: Tables.Live.All uses virtual table
sizes, whereas Tables.Live.Local uses backing sizes)

This change redesigns these metrics. We specialize the metrics to
focus on disk usage, and thus only include physical tables and
physical backings. We now use the (improved) disk space aggregator and
new blob size aggregator on demand, only when metrics are needed.

@RaduBerinde RaduBerinde requested a review from jbowens October 28, 2025 21:01
@RaduBerinde RaduBerinde requested a review from a team as a code owner October 28, 2025 21:01
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RaduBerinde RaduBerinde force-pushed the disk-usage-metrics branch 3 times, most recently from b0545bf to e3e228f Compare October 29, 2025 00:05
Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

@jbowens reviewed 28 of 28 files at r1, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @RaduBerinde)


metrics.go line 376 at r1 (raw file):

		// ValueSize is the sum of the length of the uncompressed values in all live
		// (referenced by some sstable(s) within the current version) blob files.
		// ValueSize may be greater than ReferencedValueSize when compression is

I think this comment should say "ValueSize may be greater than Live.Total().Size when compression is effective."


metrics/by_placement.go line 77 at r1 (raw file):

}

func (bp *ByPlacement[T]) Set(placement base.Placement, value T) {

nit: worth using Ptr to implement it?

{
    *bp.Ptr(placement) = value
}

metrics/by_placement.go line 187 at r1 (raw file):

		if !first {
			w.Printf("; ")
			first = false

this setting of first should be outside the conditional or in an else branch, right?


metrics/by_placement.go line 194 at r1 (raw file):

		if !first {
			w.Printf("; ")
			first = false

ditto


internal/base/placement.go line 15 at r1 (raw file):

// Placement identifies where a file/object is stored.
//
// The zero value is invalid (this is intentional to determine accidantelly

nit: "accidentally" (and maybe "to disambiguate accidentally uninitialized fields.")


internal/base/placement.go line 17 at r1 (raw file):

// The zero value is invalid (this is intentional to determine accidantelly
// uninitialized fields).
type Placement uint8

love this new enum

The way we maintain the metrics that are used for disk usage
calculations are very messy. They are updated incrementally with
special code paths which can be fragile. And they are not consistent
with sibling metrics (for example: `Tables.Live.All` uses virtual table
sizes, whereas `Tables.Live.Local` uses backing sizes)

This change redesigns these metrics. We specialize the metrics to
focus on disk usage, and thus only include physical tables and
physical backings. We now use the (improved) disk space aggregator and
new blob size aggregator on demand, only when metrics are needed.
@RaduBerinde RaduBerinde merged commit 951fae6 into cockroachdb:master Nov 2, 2025
8 checks passed
@RaduBerinde RaduBerinde deleted the disk-usage-metrics branch November 2, 2025 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants