Skip to content

Conversation

aptend
Copy link
Contributor

@aptend aptend commented Sep 19, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #22524

What this PR does / why we need it:

  • Add commit timestamp as the last column in all objects to facilitate data change backtracing.
  • When reading a legacy object without a commit timestamp, a ConstNull vector will be returned.
  • The usage example pkg/vm/engine/tae/blockio/read.go:594

PR Type

Enhancement


Description

  • Add commit timestamp column to all objects for data change backtracing

  • Handle legacy objects without commit timestamps by returning ConstNull vectors

  • Refactor column reading logic to support special commit timestamp column

  • Update merge and flush operations to include commit timestamps


Diagram Walkthrough

flowchart LR
  A["Legacy Objects"] --> B["Read Operation"]
  C["New Objects"] --> B
  B --> D["Check Commit TS Column"]
  D --> E["Has TS Column?"]
  E -->|Yes| F["Read Actual TS"]
  E -->|No| G["Return ConstNull Vector"]
  F --> H["Apply Visibility Filter"]
  G --> I["Use Creation TS as Fallback"]
  H --> J["Final Result"]
  I --> J
Loading

File Walkthrough

Relevant files
Enhancement
txnts.go
Add String method to TS type                                                         

pkg/container/types/txnts.go

  • Add String() method to TS type for formatting interface compatibility
+5/-0     
funcs.go
Enhance column reading with commit timestamp support         

pkg/objectio/funcs.go

  • Refactor column reading logic with putFillHolder helper function
  • Add special handling for commit timestamp column (SEQNUM_COMMITTS)
  • Check if last column is timestamp type before reading
  • Handle legacy objects without commit timestamp columns
+27/-16 
tool.go
Handle commit timestamp in object data retrieval                 

pkg/vm/engine/tae/rpc/tool.go

  • Add logic to handle commit timestamp column in object data retrieval
  • Map timestamp column to SEQNUM_COMMITTS when it's the last column
+5/-2     
flushTableTail.go
Always include commit timestamp in flush operations           

pkg/vm/engine/tae/tables/jobs/flushTableTail.go

  • Always append commit timestamp column to read operations
  • Remove tombstone-specific condition for commit timestamp reading
+2/-4     
mergeobjects.go
Always include commit timestamp in merge operations           

pkg/vm/engine/tae/tables/jobs/mergeobjects.go

  • Always include commit timestamp column in merge operations
  • Remove tombstone-specific conditions for commit timestamp handling
  • Update both task initialization and writer preparation
+3/-7     
pnode.go
Handle legacy objects with commit timestamp fallback         

pkg/vm/engine/tae/tables/pnode.go

  • Add replaceCommitts function to handle legacy objects without
    timestamps
  • Use object creation timestamp as fallback for ConstNull vectors
  • Add logging for commit timestamp replacement operations
  • Refactor commit timestamp handling in scan operations
+21/-10 
utils.go
Enhance column loading with commit timestamp tracking       

pkg/vm/engine/tae/tables/utils.go

  • Add commit timestamp index tracking in column loading
  • Handle appendable objects with timestamp-based visibility filtering
  • Improve logic for managing commit timestamp columns in vector
    operations
  • Add proper cleanup for temporary commit timestamp vectors
+19/-8   

Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

In the special-column handling, putFillHolder stores the resolved seqnum (which may be metaColCnt-1) into the placeholder Size marker. Downstream code likely relies on original requested seqnum semantics; verify consumers correctly interpret this marker and that overloading Size with a seqnum cannot collide with legitimate zero-length reads or cause misalignment when multiple placeholders exist.

var filledEntries []fileservice.IOEntry
putFillHolder := func(i int, seqnum uint16) {
	if filledEntries == nil {
		filledEntries = make([]fileservice.IOEntry, len(seqnums))
	}
	filledEntries[i] = fileservice.IOEntry{
		Size: int64(seqnum), // a marker, it can not be zero
	}
}

blkmeta := meta.GetBlockMeta(uint32(blk))
maxSeqnum := blkmeta.GetMaxSeqnum()
for i, seqnum := range seqnums {
	// special columns
	if seqnum >= SEQNUM_UPPER {
		metaColCnt := blkmeta.GetMetaColumnCount()
		switch seqnum {
		case SEQNUM_COMMITTS:
			seqnum = metaColCnt - 1
		case SEQNUM_ABORT:
			panic("not support")
		default:
			panic(fmt.Sprintf("bad path to read special column %d", seqnum))
		}
		// if the last column is not commits, do not read it
		//  1. created by cn
		//  2. old version tn nonappendable block
		col := blkmeta.ColumnMeta(seqnum)
		if col.DataType() != uint8(types.T_TS) {
			putFillHolder(i, seqnum)
		} else {
			ext := col.Location()
			ioVec.Entries = append(ioVec.Entries, fileservice.IOEntry{
				Offset:      int64(ext.Offset()),
				Size:        int64(ext.Length()),
				ToCacheData: factory(int64(ext.OriginSize()), ext.Alg()),
			})
		}
		continue
	}

	// need fill vector
	if seqnum > maxSeqnum || blkmeta.ColumnMeta(seqnum).DataType() == 0 {
		putFillHolder(i, seqnum)
		continue
Edge Case

Mapping the last requested column if it is of TS type to SEQNUM_COMMITTS assumes the TS column is always the final one. If clients request a subset or reorder columns, this may mis-map a user TS column. Validate the assumption and consider a more explicit detection of the commit-ts attribute rather than positional last-column logic.

if col.DataType() == uint8(types.T_TS) && i == len(c.cols)-1 {
	idxs = append(idxs, objectio.SEQNUM_COMMITTS)
} else {
	idxs = append(idxs, idx)
}
typs = append(typs, tp)
Logging Noise

The new logutil.Info inside replaceCommitts will emit a log per scan with batch length; under heavy scans this can be noisy. Consider lowering to Debug or adding rate limiting/feature flags.

logutil.Info("committs replace",
	zap.Bool("createdByCN", stats.GetCNCreated()),
	zap.Bool("appendable", stats.GetAppendable()),
	zap.String("createdTS", createTS.String()),
	zap.Int("length", length),
)

Copy link

qodo-merge-pro bot commented Sep 19, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Consider a more efficient storage alternative

For immutable objects, store the commit timestamp once in the metadata instead
of as a full column to save storage and I/O. Keep the column-based approach only
for appendable objects where timestamps can vary per row.

Examples:

pkg/vm/engine/tae/tables/jobs/mergeobjects.go [370]
	seqnums = append(seqnums, objectio.SEQNUM_COMMITTS)
pkg/vm/engine/tae/tables/jobs/flushTableTail.go [583]
	seqnums = append(seqnums, objectio.SEQNUM_COMMITTS)

Solution Walkthrough:

Before:

// In mergeobjects.go (writing immutable objects)
func (task *mergeObjectsTask) PrepareNewWriter() *BlockWriter {
    seqnums := ... // get column sequence numbers
    // Always add a physical column for commit timestamps
    seqnums = append(seqnums, objectio.SEQNUM_COMMITTS)
    writer := NewWriter(seqnums)
    // ... later, a full vector with identical timestamps is written
    return writer
}

// In pnode.go (reading)
func (node *persistedNode) Scan(...) {
    // ...
    // Read the full commit timestamp column from disk
    vecs, _, err := LoadPersistedColumnDatas(...)
    // ...
}

After:

// In mergeobjects.go (writing immutable objects)
func (task *mergeObjectsTask) PrepareNewWriter() *BlockWriter {
    seqnums := ... // get column sequence numbers
    // DO NOT add a physical column for commit timestamps
    writer := NewWriter(seqnums)
    // Store the single commit timestamp in the object's metadata
    writer.SetCommitTimestamp(task.commitTs)
    return writer
}

// In pnode.go (reading)
func (node *persistedNode) Scan(...) {
    // ...
    // When commit-ts is requested for an immutable object
    if isImmutableAndCommitTsRequested {
        // Read single timestamp from metadata
        commitTs := node.object.meta.Load().GetCommitTimestamp()
        // Create a constant vector in memory
        tsVec := vector.NewConstFixed(..., commitTs, ...)
    }
    // ...
}
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a significant storage and I/O inefficiency in storing a full column of identical commit timestamps for immutable objects and proposes a valid, more efficient architectural alternative.

Medium
  • Update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement Review effort 3/5 size/M Denotes a PR that changes [100,499] lines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants