@tanujnay112
Contributor

@tanujnay112 tanujnay112 commented Nov 21, 2025

Description of changes

Summarize the changes made by this PR.

This diff makes all attached functions (statistics, record_counter) incremental. On every run they read the current data from the output collection and combine it with the incoming log data to produce updates to that collection. This also adds total_count as a statistic record.

This change also fixes a bug where the AttachedFunctionOrchestrator did not create a RecordSegment reader.

  • Improvements & Bug fixes
    • ...
  • New functionality
    • ...

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Migration plan

Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?

Observability plan

What is the plan to instrument and monitor this change?

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

@tanujnay112 tanujnay112 marked this pull request as ready for review November 21, 2025 09:52
@propel-code-bot
Contributor

propel-code-bot bot commented Nov 21, 2025

Introduce incremental attached-function execution (statistics & record-counter)

This PR refactors the statistics and record-counter attached functions so they operate incrementally instead of writing a full refresh on every run. Existing output is loaded via RecordSegmentReader, deltas from new logs are merged, counts are incremented/decremented, and stale statistics are removed. The change ripples through the executor/operator/orchestrator stack, adds total_count summary statistics, and updates extensive test coverage.

Key Changes

• Redesigned trait StatisticsFunction (new methods observe_insert, observe_delete, is_changed, is_empty, as_any_mut)
• Extended CounterFunction with change-tracking and factory method with_initial_value
• Re-implemented StatisticsFunctionExecutor to load existing stats (load_existing_statistics), apply deltas, emit deletes for zero counts, and add summary key summary::s:total_count
• Added incremental logic to CountAttachedFunction including reading existing counts from output
• Updated ExecuteAttachedFunctionOperator to pass optional RecordSegmentReader for stateful executors and to support rebuild vs incremental modes
• Fixed AttachedFunctionOrchestrator bug: now creates output RecordSegmentReader, wires input segment reader, and propagates rebuild flag
• Removed test-only helper MaterializeLogsResult::from_logs_for_test; replaced tests with real materialize_logs flow
• Large test suite updates (+200 lines) covering inserts, deletes, updates, rebuild paths, and new summary statistics
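
The incremental flow summarized above (load existing counts, apply deltas, emit deletes for zero counts) can be sketched roughly as follows. This is a minimal illustration and not the PR's implementation: `StatsUpdate`, `apply_deltas`, and the example keys are hypothetical names, and the summary total_count record is left out.

```rust
use std::collections::HashMap;

/// Output of one incremental run: rows to upsert and rows to delete.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, Default, PartialEq)]
struct StatsUpdate {
    upserts: Vec<(String, i64)>,
    deletes: Vec<String>,
}

/// Merge signed deltas computed from new logs into previously written counts.
/// Only keys whose count actually changed are emitted.
fn apply_deltas(existing: &HashMap<String, i64>, deltas: &HashMap<String, i64>) -> StatsUpdate {
    let mut update = StatsUpdate::default();
    for (key, delta) in deltas {
        if *delta == 0 {
            continue; // inserts and deletes cancelled out; nothing to write
        }
        let new_count = existing.get(key).copied().unwrap_or(0) + delta;
        if new_count > 0 {
            update.upserts.push((key.clone(), new_count));
        } else if existing.contains_key(key) {
            // The statistic dropped to zero, so its stored row is removed.
            update.deletes.push(key.clone());
        }
    }
    // Sort so the result does not depend on HashMap iteration order.
    update.upserts.sort();
    update.deletes.sort();
    update
}
```

For example, with existing counts `{"color::s:red": 2, "color::s:blue": 1}` and deltas `{"color::s:red": +1, "color::s:blue": -1}`, the sketch upserts red with a count of 3 and deletes the blue row.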

Affected Areas

rust/worker/src/execution/functions/statistics.rs
rust/worker/src/execution/operators/execute_task.rs
rust/worker/src/execution/orchestration/attached_function_orchestrator.rs
rust/segment/src/types.rs (test helper removal)
• unit/integration tests

This summary was automatically generated by @propel-code-bot

@tanujnay112 tanujnay112 force-pushed the make_functions_great_again branch 2 times, most recently from 09a23f0 to 2bfcc0b Compare November 21, 2025 09:54
Comment on lines +220 to +287
let key = match metadata.get("key") {
    Some(MetadataValue::Str(k)) => k.clone(),
    _ => continue,
};

let value_type = match metadata.get("type") {
    Some(MetadataValue::Str(t)) => t.as_str(),
    _ => continue,
};

let value_str = match metadata.get("value") {
    Some(MetadataValue::Str(v)) => v.as_str(),
    _ => continue,
};

let count = match metadata.get("count") {
    Some(MetadataValue::Int(c)) => *c,
    _ => continue,
};

// Reconstruct the StatisticsValue from type and value
let stats_value = match value_type {
    "bool" => match value_str {
        "true" => StatisticsValue::Bool(true),
        "false" => StatisticsValue::Bool(false),
        _ => continue,
    },
    "int" => match value_str.parse::<i64>() {
        Ok(i) => StatisticsValue::Int(i),
        _ => continue,
    },
    "float" => match value_str.parse::<f64>() {
        Ok(f) => StatisticsValue::Float(f),
        _ => continue,
    },
    "str" => StatisticsValue::Str(value_str.to_string()),
    "sparse" => match value_str.parse::<u32>() {
        Ok(index) => StatisticsValue::SparseVector(index),
        _ => continue,
    },
    _ => continue,
};
Contributor

[BestPractice]

In load_existing_statistics, parsing errors for records read from the output segment are handled by silently continuing to the next record. This could hide data corruption in the output segment and lead to incorrect statistics. Consider logging (e.g., tracing::warn!) whenever a record is skipped due to a parsing failure; this would improve observability into the health of the system.

File: rust/worker/src/execution/functions/statistics.rs
Line: 261
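
One way to act on this suggestion is to move the parsing into a helper that returns a reason on failure, so the caller can log it before skipping. The sketch below is illustrative only: `parse_statistics_value` and `load_values` are hypothetical names, the enum is a local copy of the variants shown in the snippet, and `eprintln!` stands in for `tracing::warn!` to keep the example free of dependencies.

```rust
#[derive(Debug, PartialEq)]
enum StatisticsValue {
    Bool(bool),
    Int(i64),
    Float(f64),
    Str(String),
    SparseVector(u32),
}

/// Rebuild a value from its stored type tag and string form, reporting why
/// a record is unusable instead of dropping it silently.
fn parse_statistics_value(value_type: &str, value_str: &str) -> Result<StatisticsValue, String> {
    let bad = |what: &str| format!("invalid {what} value {value_str:?}");
    match value_type {
        "bool" => value_str.parse::<bool>().map(StatisticsValue::Bool).map_err(|_| bad("bool")),
        "int" => value_str.parse::<i64>().map(StatisticsValue::Int).map_err(|_| bad("int")),
        "float" => value_str.parse::<f64>().map(StatisticsValue::Float).map_err(|_| bad("float")),
        "str" => Ok(StatisticsValue::Str(value_str.to_string())),
        "sparse" => value_str
            .parse::<u32>()
            .map(StatisticsValue::SparseVector)
            .map_err(|_| bad("sparse index")),
        other => Err(format!("unknown value type {other:?}")),
    }
}

/// Parse every stored (type, value) pair, logging and counting skipped records.
/// `eprintln!` is a stand-in for `tracing::warn!` in this sketch.
fn load_values(records: &[(&str, &str)]) -> (Vec<StatisticsValue>, usize) {
    let mut values = Vec::new();
    let mut skipped = 0;
    for (value_type, value_str) in records {
        match parse_statistics_value(value_type, value_str) {
            Ok(value) => values.push(value),
            Err(reason) => {
                eprintln!("skipping statistics record: {reason}");
                skipped += 1;
            }
        }
    }
    (values, skipped)
}
```

Returning the skipped count alongside the parsed values also gives the caller something to export as a metric, if that turns out to be useful.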

Comment on lines +50 to +79
async fn get_existing_count(output_reader: Option<&RecordSegmentReader<'_>>) -> i64 {
    let Some(reader) = output_reader else {
        return 0;
    };

    // Try to get the existing record with the function output ID
    let offset_id = match reader
        .get_offset_id_for_user_id(COUNT_FUNCTION_OUTPUT_ID)
        .await
    {
        Ok(Some(offset_id)) => offset_id,
        _ => return 0,
    };

    // Get the data record for this offset id
    let data_record = match reader.get_data_for_offset_id(offset_id).await {
        Ok(Some(data_record)) => data_record,
        _ => return 0,
    };

    // Extract total_count from metadata
    if let Some(metadata) = &data_record.metadata {
        if let Some(chroma_types::MetadataValue::Int(count)) = metadata.get(COUNT_METADATA_KEY)
        {
            return *count;
        }
    }

    0
}
Contributor

[BestPractice]

The get_existing_count function currently swallows errors from the RecordSegmentReader and defaults to returning 0. For instance, if reader.get_offset_id_for_user_id fails due to a transient I/O issue, the function will return 0 instead of propagating the error. This could cause the total count to be incorrectly reset.

It would be more robust to change the function signature to return a Result<i64, Box<dyn ChromaError>> and propagate any errors encountered during reading. The caller in execute can then handle the error appropriately using ?.

File: rust/worker/src/execution/operators/execute_task.rs
Line: 79
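
The error-propagating shape suggested here could look roughly like the sketch below. It is a simplified, synchronous stand-in: `FakeReader`, `ReadError`, `get_count_for_offset_id`, and the constant's value are all hypothetical, whereas the real code is async, reads the count from record metadata, and would return `Box<dyn ChromaError>` as proposed in the comment.

```rust
use std::collections::HashMap;

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct ReadError(String);

// Hypothetical in-memory stand-in for RecordSegmentReader.
struct FakeReader {
    offset_ids: HashMap<String, u32>,
    counts: HashMap<u32, i64>,
    fail_reads: bool,
}

impl FakeReader {
    fn get_offset_id_for_user_id(&self, user_id: &str) -> Result<Option<u32>, ReadError> {
        if self.fail_reads {
            return Err(ReadError("transient I/O failure".to_string()));
        }
        Ok(self.offset_ids.get(user_id).copied())
    }

    fn get_count_for_offset_id(&self, offset_id: u32) -> Result<Option<i64>, ReadError> {
        Ok(self.counts.get(&offset_id).copied())
    }
}

// Placeholder value; the real constant is defined elsewhere in the worker crate.
const COUNT_FUNCTION_OUTPUT_ID: &str = "function_output";

/// A missing reader or missing record still means "start from 0",
/// but a failed read is surfaced to the caller instead of resetting the count.
fn get_existing_count(output_reader: Option<&FakeReader>) -> Result<i64, ReadError> {
    let Some(reader) = output_reader else {
        return Ok(0);
    };
    let Some(offset_id) = reader.get_offset_id_for_user_id(COUNT_FUNCTION_OUTPUT_ID)? else {
        return Ok(0);
    };
    Ok(reader.get_count_for_offset_id(offset_id)?.unwrap_or(0))
}
```

The point of the sketch is the distinction between "nothing written yet" (a legitimate 0) and "could not read" (an error), which the current version collapses into the same return value.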

@tanujnay112 tanujnay112 changed the base branch from make_functions_great_again to graphite-base/5893 November 21, 2025 22:19
@tanujnay112 tanujnay112 changed the base branch from graphite-base/5893 to main November 22, 2025 11:51
Contributor

@rescrv rescrv left a comment

I wonder what happens if there's a delete of something that doesn't exist. Is it taken care of by hydrateLogRecords or something else?

 struct MaterializedLogRecord {
     // False if the record exists only in the log, otherwise true.
-    offset_id_exists_in_segment: bool,
+    pub offset_id_exists_in_segment: bool,
Contributor

Maybe make a getter method, unless you want this to change at any time.

Contributor Author

oops, residue from when i was testing something
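
For reference, if the flag ever does need to be readable from outside the module, the getter suggested above might look like this. It is a sketch only: the struct is reduced to the one field under discussion and the `new` constructor is hypothetical.

```rust
pub struct MaterializedLogRecord {
    // False if the record exists only in the log, otherwise true.
    offset_id_exists_in_segment: bool,
}

impl MaterializedLogRecord {
    /// Hypothetical constructor, present only to keep the sketch self-contained.
    pub fn new(offset_id_exists_in_segment: bool) -> Self {
        Self { offset_id_exists_in_segment }
    }

    /// Read-only access: callers can inspect the flag but cannot change it.
    pub fn offset_id_exists_in_segment(&self) -> bool {
        self.offset_id_exists_in_segment
    }
}
```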

 pub trait StatisticsFunction: std::fmt::Debug + Send {
-    fn observe(&mut self, hydrated_record: &HydratedMaterializedLogRecord<'_, '_>);
+    fn observe_insert(&mut self, hydrated_record: &HydratedMaterializedLogRecord<'_, '_>);
+    fn observe_delete(&mut self, hydrated_record: &HydratedMaterializedLogRecord<'_, '_>);
Contributor

It feels odd to pass in a variant. The old form recognized that the HydratedMaterializedLogRecord (HMLR) would be a variant. This has the potential for a mismatch.

Contributor Author

Discussed offline, leaving a TODO
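
To make the mismatch concern concrete: one possible shape (not necessarily what was agreed offline) keeps a single entry point that chooses the method from the record itself. Everything in this sketch is hypothetical, including `Operation`, the reduced `Record` type standing in for HydratedMaterializedLogRecord, and the `Counter` implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operation {
    Insert,
    Delete,
}

// Reduced stand-in for HydratedMaterializedLogRecord.
struct Record {
    operation: Operation,
}

trait StatisticsFunction {
    fn observe_insert(&mut self, record: &Record);
    fn observe_delete(&mut self, record: &Record);

    /// Single entry point that picks the method from the record itself,
    /// so a delete cannot be routed to `observe_insert` by mistake.
    fn observe(&mut self, record: &Record) {
        match record.operation {
            Operation::Insert => self.observe_insert(record),
            Operation::Delete => self.observe_delete(record),
        }
    }
}

#[derive(Default)]
struct Counter {
    count: i64,
}

impl StatisticsFunction for Counter {
    fn observe_insert(&mut self, _record: &Record) {
        self.count += 1;
    }

    fn observe_delete(&mut self, _record: &Record) {
        self.count -= 1;
    }
}
```

Callers that only ever use `observe` cannot pair a record with the wrong method, which is the property the old single-method form had.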

),
),
(
"summary::s:total_count".to_string(),
Contributor

What's the ::s for?

Contributor Author

it's from the id naming scheme that you had made in this file {key}::{type_id}:{value}

Contributor

Oh. I didn't make that connection having not looked at that code recently. A comment would be nice, but is OK to do in follow-up.
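
As a small illustration of the naming scheme mentioned in this thread, a helper that builds ids in the {key}::{type_id}:{value} form might look like the following. The function name `statistics_id` is hypothetical; only the format and the "summary::s:total_count" example come from the discussion.

```rust
/// Build a statistics record id following the `{key}::{type_id}:{value}` scheme.
fn statistics_id(key: &str, type_id: &str, value: &str) -> String {
    format!("{key}::{type_id}:{value}")
}
```

With key "summary", type id "s", and value "total_count", this yields the id used in the diff, "summary::s:total_count".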

Contributor Author

materialize_logs, which runs before this, takes care of that.

@tanujnay112 tanujnay112 merged commit 577a312 into main Nov 23, 2025
62 checks passed

3 participants