Skip to content

Add poisson upscaling without storing count in the sample types #908

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 6, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 37 additions & 0 deletions profiling/src/api.rs
Original file line number Diff line number Diff line change
Expand Up @@ -196,6 +196,12 @@ pub enum UpscalingInfo {
count_value_offset: usize,
sampling_distance: u64,
},
PoissonNonSampleTypeCount {
// sum_value_offset is an offset in the profile values type array
sum_value_offset: usize,
count_value: u64,
sampling_distance: u64,
},
Proportional {
scale: f64,
},
Expand All @@ -213,6 +219,15 @@ impl std::fmt::Display for UpscalingInfo {
"Poisson = sum_value_offset: {}, count_value_offset: {}, sampling_distance: {}",
sum_value_offset, count_value_offset, sampling_distance
),
UpscalingInfo::PoissonNonSampleTypeCount {
sum_value_offset,
count_value,
sampling_distance,
} => write!(
f,
"Poisson = sum_value_offset: {}, count_value: {}, sampling_distance: {}",
sum_value_offset, count_value, sampling_distance
),
Comment on lines +226 to +230
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clippy would probably suggest this:

Suggested change
} => write!(
f,
"Poisson = sum_value_offset: {}, count_value: {}, sampling_distance: {}",
sum_value_offset, count_value, sampling_distance
),
} => write!(
f,
"Poisson = sum_value_offset: {sum_value_offset}, count_value: {count_value}, sampling_distance: {sampling_distance}"
),

(or something close, writing code in a text box in a browser is hard)

This kind of thing is repeated a few times in this file (in and out of the PR).

Copy link
Member Author

@realFlowControl realFlowControl Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly clippy does not complain, but I see value in the proposed change.
I'd tend keep it as is for consistency reasons and clean up all of this in follow-up PR, WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #913

UpscalingInfo::Proportional { scale } => {
write!(f, "Proportional = scale: {}", scale)
}
Expand Down Expand Up @@ -241,6 +256,28 @@ impl UpscalingInfo {
sampling_distance
)
}
UpscalingInfo::PoissonNonSampleTypeCount {
sum_value_offset,
count_value,
sampling_distance,
} => {
anyhow::ensure!(
sum_value_offset < &number_of_values,
"sum_value_offset {} must be strictly less than {}",
sum_value_offset,
number_of_values
);
anyhow::ensure!(
count_value != &0,
"count_value {} must be greater than 0",
count_value
);
anyhow::ensure!(
sampling_distance != &0,
"sampling_distance {} must be greater than 0",
sampling_distance
)
}
UpscalingInfo::Proportional { scale: _ } => (),
}
anyhow::Ok(())
Expand Down
32 changes: 32 additions & 0 deletions profiling/src/internal/profile/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1428,6 +1428,38 @@ mod api_tests {
assert_eq!(first.values, vec![1, 298, 29]);
}

#[test]
fn test_upscaling_by_value_on_one_value_with_poisson_count() {
let sample_types = create_samples_types();

let mut profile = Profile::new(SystemTime::now(), &sample_types, None);

let sample1 = api::Sample {
locations: vec![],
values: vec![1, 16, 29],
labels: vec![],
};

profile.add_sample(sample1, None).expect("add to success");

let upscaling_info = UpscalingInfo::PoissonNonSampleTypeCount {
sum_value_offset: 1,
count_value: 29,
sampling_distance: 10,
};
let values_offset: Vec<usize> = vec![1];
profile
.add_upscaling_rule(values_offset.as_slice(), "", "", upscaling_info)
.expect("Rule added");

let serialized_profile = pprof::roundtrip_to_pprof(profile).unwrap();

assert_eq!(serialized_profile.samples.len(), 1);
let first = serialized_profile.samples.first().expect("one sample");

assert_eq!(first.values, vec![1, 298, 29]);
}

#[test]
fn test_upscaling_by_value_on_zero_value_with_poisson() {
let sample_types = create_samples_types();
Expand Down
14 changes: 14 additions & 0 deletions profiling/src/internal/upscaling.rs
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,20 @@ impl UpscalingRule {
let avg = values[sum_value_offset] as f64 / values[count_value_offset] as f64;
1_f64 / (1_f64 - (-avg / sampling_distance as f64).exp())
}
UpscalingInfo::PoissonNonSampleTypeCount {
sum_value_offset,
count_value,
sampling_distance,
} => {
// This should not happen, but if it happens,
// do not upscale
if values[sum_value_offset] == 0 || count_value == 0 {
return 1_f64;
}

let avg = values[sum_value_offset] as f64 / count_value as f64;
1_f64 / (1_f64 - (-avg / sampling_distance as f64).exp())
}
UpscalingInfo::Proportional { scale } => scale,
}
}
Expand Down
Loading