feat: parse generate property in sdf #143
letFunny
commented
Jul 2, 2024
- Have you signed the CLA?
Co-authored-by: Rafid Bin Mostofa <[email protected]>
Thanks for this! Don't forget to update https://github.com/canonical/chisel?tab=readme-ov-file#path-kinds
Thanks, Alberto.
Here is a first pass.
paths[newPath] = new
}
// An invalid "generate" value should only throw an error if that
// particular slice is selected. Hence, the check is here.
Seems okay for now, but it's a bit unclear where this should finally live: the potential automatic manifest inclusion could mean it is better placed elsewhere.
Thanks for the changes, Alberto. We need one more shot at the algorithm.
internal/setup/setup.go
Outdated
for _, new := range globs {
	// TODO replace with slices.Concat once we upgrade to go 1.22.
	for _, old := range append(globs, copies...) {
		if new.slice.Package == old.slice.Package {
This change introduced a new type (pathSlice), a new closure (checkConflict), and four new slices (rest, copies, generates, globs). It copy-pastes the new checkConflict(old, new) logic three times and duplicates an N^N loop, not to mention that it re-appends a large slice to another large slice on every iteration of the loop. Indeed, it does that to three slices in the other case below.
This is not an improvement. This logic needs tuning and I can see that you understand why, but can we please go back to the original state and fix the problem there?
Agreed, it was very messy and copied around a lot of information that we were essentially duplicating. I was struggling to organize it so that it is clear when reading the code, which was not the case in the past. Now, between the new comments and the new logic, I am finally happy that it is good enough (it will never be perfect). Please look at the PR diff and tell me what you think. There is certainly still room for performance improvements, but I don't think they are necessary now.
Thanks, Alberto. One more pass.
internal/setup/setup.go
Outdated
if newInfo.Kind == GlobPath {
	globs[newPath] = new
	// Note: We do not have to record newPath because conflict
	// is a transitive relation.
This is not true. Consider the check above: the package is being taken into account to determine if there is a conflict or not. If these items are in different packages and we ignore new, keeping only old, we might ignore conflicts that should not be ignored.
The original code here seems straightforward, and it would be nice to not change that with too many implied assumptions. Note how above we're simply checking if two things conflict, with very straightforward rules: given that newPath and oldPath are the exact same string, we consider whether their content is exactly the same (SameContent) to spot a conflict. But this is only true if we are either extracting that from the package or we're explicitly creating the content.
Discussed offline: the code and its assumptions were okay, basically the same ones we had on master. The problem was the comment, which attempted to explain a nuanced relation in a very short sentence. I have changed the comment to something that captures the intent with more precision, because it is true that conflict is NOT a transitive relation; it is more akin to "equivalence classes" of no-conflict where we partition by paths. However, that is again too complex, so I have written the comment in the most straightforward way I could think of.
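To make the point about transitivity concrete, here is a minimal Go sketch. It is not the code in internal/setup/setup.go: the `entry` type and the `sameContent` and `conflicts` helpers are hypothetical, and the rule they encode (two entries for the same path conflict when their content differs, or when both are extracted from a package and the packages differ) is only an assumption drawn from the discussion above.

```go
package main

import "fmt"

// pathKind is a hypothetical stand-in for the path kinds discussed in the review.
type pathKind string

const (
	copyPath pathKind = "copy"
	globPath pathKind = "glob"
	textPath pathKind = "text"
)

// entry is a hypothetical record for one declaration of a given path.
type entry struct {
	pkg  string
	kind pathKind
	text string
}

// sameContent is an assumed, simplified content comparison.
func sameContent(a, b entry) bool {
	return a.kind == b.kind && a.text == b.text
}

// conflicts reports whether two entries declared for the same path clash
// under the assumed rule: different content always conflicts, and identical
// content still conflicts when it is extracted from different packages.
func conflicts(a, b entry) bool {
	if !sameContent(a, b) {
		return true
	}
	extracted := a.kind == copyPath || a.kind == globPath
	return extracted && a.pkg != b.pkg
}

func main() {
	a := entry{pkg: "p1", kind: copyPath}
	b := entry{pkg: "p2", kind: copyPath}
	c := entry{pkg: "p1", kind: copyPath}

	fmt.Println("a vs b:", conflicts(a, b))
	fmt.Println("b vs c:", conflicts(b, c))
	fmt.Println("a vs c:", conflicts(a, c))
}
```

Under these assumed rules, a conflicts with b and b conflicts with c, yet a and c do not conflict, so conflict is not transitive. The absence of conflict, on the other hand, groups entries for a path into classes that agree with each other.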
internal/setup/setup.go
Outdated
oldInfo := old.Contents[oldPath]
if !(newInfo.Kind == GlobPath || newInfo.Kind == GeneratePath ||
	oldInfo.Kind == GlobPath || oldInfo.Kind == GeneratePath) {
	continue
The overall loop here was de-optimized, probably in an unnecessary way.
The for loops above are already executing: for every file (content), inside every slice, of every package. So already quite a relevant expansion. Now the new code is also adding almost every one of those items to a list, and for every one inner iteration of the three earlier loops, it's looping over that whole list again. As a quick exercise, assume 10k items in the earlier loops, how many times are we executing the logic here? How many times did we go through the exact same items before rejecting them? (hint: just for the first element of the list, 10k-1 times).
That's why the original code had a globs helper here. The cost was similar, but we were paying only for items that we knew had to be handled as globs. I think we still want something similar, but need different conditions for its use as you've spotted.
Discussed offline: even if GlobPath is the most expensive operation here, the approach in the PR does not make sense. The code that was in the PR was going to do roughly O(n^2) loop iterations (combinatorial explosion), while the previous code was using globs to cut that down to the fraction of paths that are globs in the slice definitions. For example, given 24.04 from chisel-releases, which has ~25% globs (as of today), the previous algorithm does, in theory, 1/4 of the iterations of the new one. This is especially relevant for use-cases where the % is even lower, which is what we envision for the future of the releases. The only cost is one extra map, which is pretty reasonable.
I have changed the code to "tweak" the previous algorithm while solving the bugs and adding the support for generate. I am only not sure about the naming of the new map, but that is a very minor thing.
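The iteration counts discussed above can be reproduced with a small Go model. This is only a sketch under simplifying assumptions, not the chisel algorithm: `countIterations` is a hypothetical helper, every path is compared against the paths kept before it, and exactly one path in every `globEvery` is treated as a glob.

```go
package main

import "fmt"

// countIterations models two strategies for n paths, where every
// globEvery-th path is a glob (an assumption made for this sketch):
//   - scanAll:   each path is compared against every earlier path
//   - scanGlobs: each path is compared only against earlier globs
func countIterations(n, globEvery int) (scanAll, scanGlobs int) {
	seen, globs := 0, 0
	for i := 0; i < n; i++ {
		scanAll += seen    // inner loop over everything recorded so far
		scanGlobs += globs // inner loop over the recorded globs only
		seen++
		if i%globEvery == 0 {
			globs++
		}
	}
	return scanAll, scanGlobs
}

func main() {
	all, globs := countIterations(10000, 4) // 10k items, ~25% globs
	fmt.Printf("scan everything: %d iterations\n", all)
	fmt.Printf("scan globs only: %d iterations (%.1f%% of the above)\n",
		globs, 100*float64(globs)/float64(all))
}
```

In this model the first strategy grows as n(n-1)/2, while the second scales that figure by roughly the share of globs, which is where the "1/4 of the iterations" estimate for ~25% globs comes from.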
That looks great, thanks! Only trivials now: