feat: parse generate property in sdf #143

Merged
merged 21 commits into from
Aug 30, 2024

Collaborator

@letFunny letFunny commented Jul 2, 2024

  • Have you signed the CLA?

Co-authored-by: Rafid Bin Mostofa <[email protected]>
@letFunny letFunny added the Priority Look at me first label Jul 2, 2024
@letFunny letFunny mentioned this pull request Jul 2, 2024
1 task
Collaborator

@cjdcordeiro cjdcordeiro left a comment

Contributor

@niemeyer niemeyer left a comment

Thanks, Alberto.

Here is a first pass.

internal/setup/setup.go (9 resolved review threads, 8 outdated)
paths[newPath] = new
}
// An invalid "generate" value should only throw an error if that
// particular slice is selected. Hence, the check is here.
Contributor

Seems okay for now, but it's a bit unclear what the final place should be, since the potential automatic manifest inclusion could mean this is better placed elsewhere.

internal/setup/setup.go (3 resolved review threads, all outdated)
Contributor

@niemeyer niemeyer left a comment

Thanks for the changes, Alberto. We need one more shot at the algorithm.

internal/setup/setup.go (1 resolved review thread, outdated)
for _, new := range globs {
	// TODO replace with slices.Concat once we upgrade to go 1.22.
	for _, old := range append(globs, copies...) {
		if new.slice.Package == old.slice.Package {
Contributor

This change introduced a new type (pathSlice), a new closure (checkConflict), and four new slices (rest, copies, generates, globs). It copy-pastes the new checkConflict(old, new) logic three different times and duplicates an N^2 loop, not to mention that it re-appends a large slice to another large slice on every iteration of the loop. Indeed, it does that to three slices in the other case below.

This is not an improvement. This logic needs tuning and I can see that you understand why, but can we please go back to the original state and fix the problem there?

Collaborator Author

Agreed, it was very messy and copied around a lot of information that we were essentially duplicating. I was struggling to find a way to organize it so that it is clear when reading the code, which was not the case in the past. Now, between the new comments and the new logic, I think I am finally happy that it is good enough (it will never be perfect). Please look at the PR diff and tell me what you think. There is certainly still room for performance improvements, but I don't think they are necessary now.
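
One of the costs raised in this thread, re-appending one slice to another on every outer iteration, can be illustrated with a minimal sketch. The `item` type and both function names are hypothetical and are not taken from the PR; the sketch only shows building the combined slice once, before the loop, into fresh storage. It does not remove the quadratic comparison itself.

```go
package main

import "fmt"

// item is a hypothetical stand-in for the entries being compared.
type item struct {
	pkg  string
	path string
}

// concat returns a new slice holding a followed by b. Unlike
// append(a, b...), it never writes into a's backing array, which
// append may do when a has spare capacity.
func concat(a, b []item) []item {
	out := make([]item, 0, len(a)+len(b))
	out = append(out, a...)
	return append(out, b...)
}

// countCrossPackagePairs counts (cur, other) pairs that belong to
// different packages. The combined slice is built once, outside the
// loop, instead of once per outer iteration.
func countCrossPackagePairs(globs, copies []item) int {
	all := concat(globs, copies)
	count := 0
	for _, cur := range globs {
		for _, other := range all {
			if cur.pkg == other.pkg {
				continue
			}
			count++
		}
	}
	return count
}

func main() {
	globs := []item{{"a", "/usr/*"}, {"b", "/etc/*"}}
	copies := []item{{"a", "/usr/bin/x"}, {"c", "/etc/y"}}
	fmt.Println(countCrossPackagePairs(globs, copies)) // prints 5
}
```

On Go 1.22 or later, `slices.Concat` from the standard library could replace the hand-written `concat` helper, as the TODO in the diff suggests.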

Contributor

@niemeyer niemeyer left a comment

Thanks, Alberto. One more pass.

if newInfo.Kind == GlobPath {
	globs[newPath] = new
	// Note: We do not have to record newPath because conflict
	// is a transitive relation.
Contributor

This is not true. Consider the check above: the package is being taken into account to determine if there is a conflict or not. If these items are in different packages and we ignore new, keeping only old, we might ignore conflicts that should not be ignored.

The original code here seems straightforward, and it would be nice to not change that with too many implied assumptions. Note how above we're simply checking if two things conflict, with very straightforward rules: given that newPath and oldPath are the exact same string, we consider whether their content is exactly the same (SameContent) to spot a conflict. But this is only true if we are either extracting that from the package or we're explicitly creating the content.

Collaborator Author

@letFunny letFunny Aug 29, 2024

Discussed offline: the code and its assumptions were okay, basically the same ones we had on master. The problem was the comment, which attempted to explain a nuanced relation in a very short sentence. I have changed the comment to something that captures the intent with more precision, because it is true that the conflict is NOT a transitive relation; it is more akin to "equivalence classes" of no-conflict where we partition by paths. However, that is again too complex, so I have written the comment in the most straightforward way I could think of.
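
Why a package-aware conflict check cannot be transitive can be seen in a small sketch. Everything here is an assumption made for illustration: the `entry` type, the `conflicts` rule (different packages and one pattern matching the other), and the use of the standard library's `path.Match` as a stand-in for the project's own glob matching.

```go
package main

import (
	"fmt"
	"path"
)

// entry is a hypothetical, simplified view of a path declared by a
// slice that belongs to a package.
type entry struct {
	pkg     string
	pattern string
}

// matches reports whether pattern matches name, treating a malformed
// pattern as a non-match. path.Match is only a stand-in matcher.
func matches(pattern, name string) bool {
	ok, err := path.Match(pattern, name)
	return err == nil && ok
}

// conflicts is an assumed rule: entries in the same package never
// conflict; otherwise they conflict when either pattern matches the other.
func conflicts(a, b entry) bool {
	if a.pkg == b.pkg {
		return false
	}
	return matches(a.pattern, b.pattern) || matches(b.pattern, a.pattern)
}

func main() {
	x := entry{pkg: "p1", pattern: "/a/*"}
	y := entry{pkg: "p2", pattern: "/a/b"}
	z := entry{pkg: "p1", pattern: "/a/?"}

	fmt.Println(conflicts(x, y)) // true: different packages, /a/* matches /a/b
	fmt.Println(conflicts(y, z)) // true: different packages, /a/? matches /a/b
	fmt.Println(conflicts(x, z)) // false: same package
}
```

Under this assumed rule, x conflicts with y and y conflicts with z, yet x and z do not conflict, so dropping an entry on the grounds that another one "covers" it could hide a conflict involving a third package.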

oldInfo := old.Contents[oldPath]
if !(newInfo.Kind == GlobPath || newInfo.Kind == GeneratePath ||
	oldInfo.Kind == GlobPath || oldInfo.Kind == GeneratePath) {
	continue
Contributor

The overall loop here was de-optimized, probably in an unnecessary way.

The for loops above are already executing for every file (content), inside every slice, of every package. So that is already quite a relevant expansion. Now the new code is also adding almost every one of those items to a list, and for every inner iteration of the three earlier loops, it's looping over that whole list again. As a quick exercise, assume 10k items in the earlier loops: how many times are we executing the logic here? How many times did we go through the exact same items before rejecting them? (Hint: just for the first element of the list, 10k-1 times.)

That's why the original code had a globs helper here. The cost was similar, but we were paying only for items that we knew had to be handled as globs. I think we still want something similar, but need different conditions for its use as you've spotted.

Collaborator Author

@letFunny letFunny Aug 29, 2024

Discussed offline: even if GlobPath is the most expensive operation here, the approach in the PR does not make sense. The code that was in the PR was going to do roughly O(n^2) loop iterations (combinatorial explosion), while the previous code was using globs to reduce that by the % of globs in the slice definitions. For example, given 24.04 from chisel-releases, which has ~25% globs (as of today), the previous algorithm does, in theory, 1/4 of the iterations of the new one. This is especially relevant for use cases where the % is even lower, which is what we envision for the future of the releases. The only cost is one extra map, which is pretty reasonable.

I have changed the code to "tweak" the previous algorithm while solving the bugs and adding the support for generate. I am only not sure about the naming of the new map, but that is a very minor thing.
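
The iteration counts discussed in this thread can be checked with a rough sketch. The `entry` type and all function names are hypothetical; the sketch only counts inner-loop iterations for the two loop shapes described above, assuming 25% of the entries are globs, and is not the code from the PR.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for one path in a slice definition;
// only whether it is a glob matters for counting.
type entry struct {
	isGlob bool
}

// allPairsIterations visits every (a, b) pair and rejects pairs with
// no glob only after entering the inner loop.
func allPairsIterations(entries []entry) int {
	iterations := 0
	for _, a := range entries {
		for _, b := range entries {
			iterations++
			if !a.isGlob && !b.isGlob {
				continue
			}
			// A real implementation would run the glob match here.
		}
	}
	return iterations
}

// globsOnlyIterations first collects the globs, then compares only
// those against every entry.
func globsOnlyIterations(entries []entry) int {
	var globs []entry
	for _, e := range entries {
		if e.isGlob {
			globs = append(globs, e)
		}
	}
	iterations := 0
	for range globs {
		for range entries {
			iterations++
		}
	}
	return iterations
}

// sample builds n entries where every fourth one is a glob (25%).
func sample(n int) []entry {
	entries := make([]entry, n)
	for i := range entries {
		entries[i].isGlob = i%4 == 0
	}
	return entries
}

func main() {
	entries := sample(1000)
	fmt.Println(allPairsIterations(entries), globsOnlyIterations(entries))
	// prints 1000000 250000
}
```

With 1000 entries and 250 globs, the glob-driven loop runs a quarter of the iterations of the all-pairs loop, which matches the 1/4 estimate given for a release with about 25% globs.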

Contributor

@niemeyer niemeyer left a comment

That looks great, thanks! Only trivials now:

internal/setup/setup.go (4 resolved review threads, 2 outdated)
@niemeyer niemeyer merged commit ae52f84 into canonical:main Aug 30, 2024
14 checks passed
@letFunny letFunny deleted the chisel-db-parse-generate branch October 17, 2024 08:36
zhijie-yang pushed a commit to zhijie-yang/chisel that referenced this pull request Nov 7, 2024