Extend Fuzzer to Check Debug Locations #200

Draft
wants to merge 2 commits into
base: main

d-sonuga
Contributor

For #194.
The checker can now check whether the entries in the debug_locations output contain the vregs they're expected to contain as indicated by Function::debug_value_labels.
The checking doesn't happen by default; instead, it's only enabled during fuzzing.

Member

@cfallin cfallin left a comment

Thanks for working on this! I have a few thoughts below.

My main thoughts are around algorithmic complexity -- it seems we're scanning the join of the allocation results, the debug-labels list, and the program in various combinations that could result in quadratic or cubic checking cost, in a way that might significantly slow down fuzzing.

At a high level I think what we might want to do is to build searchable maps of both the value-labels input, and the debug-locations output, and then whenever a new vreg is defined in a physical register, check if the vreg carries a label and if so check that the label appears in the debug data, or something like that. What do you think?

if end_inst <= point_start.inst() || start_inst >= point_end.inst() {
    None
} else {
    let point0 = if point_start.inst() >= start_inst {
Member

ProgPoint implements Ord, so could we use std::cmp::min / std::cmp::max here?
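As a rough illustration of that suggestion, here is a minimal sketch of computing the overlap of two ranges with `std::cmp::max` / `std::cmp::min`. `Point` is a hypothetical stand-in for `ProgPoint` (any `Ord + Copy` type would do), `overlap` is an illustrative name that is not in the PR, and the ranges are assumed to be half-open.

```rust
use std::cmp::{max, min};

// Hypothetical stand-in for ProgPoint; only `Ord + Copy` is relied on here.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Point(u32);

/// Overlap of the half-open ranges [a_start, a_end) and [b_start, b_end),
/// or None if they share no point. (Half-open ranges are an assumption.)
fn overlap(a_start: Point, a_end: Point, b_start: Point, b_end: Point) -> Option<(Point, Point)> {
    // The overlap starts at the later start and ends at the earlier end.
    let start = max(a_start, b_start);
    let end = min(a_end, b_end);
    if start < end {
        Some((start, end))
    } else {
        None
    }
}
```

With this shape, the emptiness check and the clamping of both endpoints come from the same two comparisons, so the separate `if`/`else` branches per endpoint are not needed.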

];
let mut result = run(&f, &mach_env, &options).unwrap();
/*
The correct debug_locations output
Member

Are we waiting for a fix before asserting this output instead? If so could we add a note here that this is to be asserted once we fix [thing]?

if end_inst <= start_point.inst() || start_inst >= end_point.inst() {
    None
} else {
    let point0 = if start_point.inst() >= start_inst {
Member

Similar here to below -- std::cmp::min / std::cmp::max?

Member

And can we name this intersects, and factor it out to a common method on ProgPoint shared by the other use below? I see the args are slightly different -- ProgPoint/Inst here, Inst/ProgPoint below, but unless there are subtleties around the handling of begin/end, we could take ProgPoints for both ranges and convert Insts to ProgPoints as needed at callsites I think?

fn new<F: Function>(f: &F, output: &Output) -> Self {
    let mut expected_vreg_locations = FxHashMap::default();
    for (label, start_point, end_point, alloc) in &output.debug_locations {
        for (vreg, start_inst, end_inst, in_label) in f.debug_value_labels() {
Member

I'm a bit worried that this could result in quadratic runtime during fuzzing (and that the fuzzer will discover and exercise this worst case) -- could we optimize this somehow by building a map? For example we could have a label -> [array of allocs] dense-map, where [array of allocs] is indexed by the ProgPoint's index; then when processing debug_value_labels we can cross-reference that.
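To make the dense-map idea concrete, here is a minimal sketch. `Label` and `Alloc` are plain integer stand-ins, `LabelAllocTable` and its methods are hypothetical names, and the entries are assumed to be half-open ranges over point indices; none of this is taken from the PR.

```rust
// Hypothetical stand-ins for the label and allocation types.
type Label = u32;
type Alloc = u32;

/// table[label][point] is Some(alloc) if the debug-locations output says
/// `label` lives in `alloc` at that program-point index.
struct LabelAllocTable {
    table: Vec<Vec<Option<Alloc>>>,
}

impl LabelAllocTable {
    /// `entries` holds (label, start_point, end_point, alloc) with half-open
    /// point-index ranges (an assumption). Out-of-range parts are ignored.
    fn new(num_labels: usize, num_points: usize, entries: &[(Label, usize, usize, Alloc)]) -> Self {
        let mut table = vec![vec![None; num_points]; num_labels];
        for &(label, start, end, alloc) in entries {
            let end = end.min(num_points);
            let start = start.min(end);
            if let Some(row) = table.get_mut(label as usize) {
                for slot in &mut row[start..end] {
                    *slot = Some(alloc);
                }
            }
        }
        LabelAllocTable { table }
    }

    /// Constant-time lookup used while walking debug_value_labels.
    fn alloc_at(&self, label: Label, point: usize) -> Option<Alloc> {
        self.table.get(label as usize)?.get(point).copied().flatten()
    }
}
```

The table is built in one pass over the output, so cross-referencing each value-label entry becomes a lookup rather than a nested scan. The trade-off is memory proportional to labels times points; a per-label sorted list of ranges would be a sparser alternative.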


fn entries_covering(&self, inst: Inst) -> Vec<(bool, bool, DebugLocationEntry)> {
    let mut entries = vec![];
    for entry in self.expected_vreg_locations.keys().copied() {
Member

Here is another place I'm a bit worried about checker-time blowup: we're iterating over all vreg locations (which is O(|num vregs| * |program size|)) once per inst, so if |vregs| ~= |insts|, this is approximately cubic overall, right?
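One way to avoid rescanning every entry per instruction is to bucket range starts and ends by instruction index and maintain an "active" set while walking forward. The sketch below assumes half-open instruction ranges and a linear walk over instructions (the real checker may visit instructions in a different order); `RangeSweep` and its methods are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Entry indices bucketed by the instruction where their half-open
/// range [start, end) begins and ends.
struct RangeSweep {
    starts_at: Vec<Vec<usize>>,
    ends_at: Vec<Vec<usize>>,
}

impl RangeSweep {
    /// Build the buckets in one pass; empty or out-of-bounds ranges are skipped.
    fn new(num_insts: usize, ranges: &[(usize, usize)]) -> Self {
        let mut starts_at = vec![Vec::new(); num_insts + 1];
        let mut ends_at = vec![Vec::new(); num_insts + 1];
        for (idx, &(start, end)) in ranges.iter().enumerate() {
            if start < end && end <= num_insts {
                starts_at[start].push(idx);
                ends_at[end].push(idx);
            }
        }
        RangeSweep { starts_at, ends_at }
    }

    /// Walk instructions in order, passing `visit` the indices of the
    /// entries that cover each instruction.
    fn sweep(&self, mut visit: impl FnMut(usize, &BTreeSet<usize>)) {
        let mut active = BTreeSet::new();
        for inst in 0..self.starts_at.len() - 1 {
            for idx in &self.ends_at[inst] {
                active.remove(idx);
            }
            for idx in &self.starts_at[inst] {
                active.insert(*idx);
            }
            visit(inst, &active);
        }
    }
}
```

Each entry is inserted and removed once, so the bookkeeping is roughly O(insts + entries log entries), and the per-instruction work depends only on how many entries actually cover that instruction.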

@d-sonuga
Contributor Author

@cfallin, I'm definitely aware that this is inefficient: I actually just wanted some thoughts on whether or not the correctness of the debug locations is actually being checked reasonably.

My bad for not being explicit about that in the initial comment 😅.

@cfallin
Member

cfallin commented Nov 21, 2024

@d-sonuga ah, in that case, yes the checks do look correct at least. I'm happy to leave this open as a draft if you'd like or we can close it, up to you.

@d-sonuga
Contributor Author

I'd prefer to leave it open as a draft. I still intend to work on this.

2 participants