-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend Fuzzer to Check Debug Locations #200
base: main
Are you sure you want to change the base?
Conversation
… checker debug locations errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I have a few thoughts below.
My main thoughts are around algorithmic complexity -- it seems we're scanning the join of the allocation results, the debug-labels list, and the program in various combinations that could result in quadratic or cubic checking cost, in a way that might significantly slow down fuzzing.
At a high level I think what we might want to do is to build searchable maps of both the value-labels input, and the debug-locations output, and then whenever a new vreg is defined in a physical register, check if the vreg carries a label and if so check that the label appears in the debug data, or something like that. What do you think?
if end_inst <= point_start.inst() || start_inst >= point_end.inst() { | ||
None | ||
} else { | ||
let point0 = if point_start.inst() >= start_inst { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ProgPoint
implements Ord
, so could we use std::cmp::min
/ std::cmp::max
here?
]; | ||
let mut result = run(&f, &mach_env, &options).unwrap(); | ||
/* | ||
The correct debug_locations output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we waiting for a fix before asserting this output instead? If so could we add a note here that this is to be asserted once we fix [thing]?
if end_inst <= start_point.inst() || start_inst >= end_point.inst() { | ||
None | ||
} else { | ||
let point0 = if start_point.inst() >= start_inst { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar here to below -- std::cmp::min
/ std::cmp::max
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And can we name this intersects
, and factor it out to a common method on ProgPoint
shared by the other use below? I see the args are slightly different -- ProgPoint/Inst here, Inst/ProgPoint below, but unless there are subtleties around the handling of begin/end, we could take ProgPoints for both ranges and convert Insts to ProgPoints as needed at callsites I think?
fn new<F: Function>(f: &F, output: &Output) -> Self { | ||
let mut expected_vreg_locations = FxHashMap::default(); | ||
for (label, start_point, end_point, alloc) in &output.debug_locations { | ||
for (vreg, start_inst, end_inst, in_label) in f.debug_value_labels() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried that this could result in quadratic runtime during fuzzing (and that the fuzzer will discover and exercise this worst case) -- could we optimize this somehow by building a map? For example we could have a label -> [array of allocs] dense-map, where [array of allocs] is indexed by the ProgPoint's index; then when processing debug_value_labels
we can cross-reference that.
|
||
fn entries_covering(&self, inst: Inst) -> Vec<(bool, bool, DebugLocationEntry)> { | ||
let mut entries = vec![]; | ||
for entry in self.expected_vreg_locations.keys().copied() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is another place I'm a bit worried about checker-time blowup: we're iterating over all vreg locations (which is O(|num vregs| * |program size|)) once per inst
, so if |vregs| ~= |insts|, this is approximately cubic overall, right?
@cfallin, I'm definitely aware that this is inefficient: I actually just wanted some thoughts on whether or not the correctness of the debug locations is actually being checked reasonably. My bad for not being explicit about that in the initial comment 😅. |
@d-sonuga ah, in that case, yes the checks do look correct at least. I'm happy to leave this open as a draft if you'd like or we can close it, up to you. |
I'd prefer to leave it open as a draft. I still intend to work on this. |
For #194.
The checker can now check whether the entries in the
debug_locations
output contain the vregs they're expected to contain as indicated byFunction::debug_value_labels
.The checking doesn't happen by default; instead, it's only enabled during fuzzing.