Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ctrlx-reporter): Allow licenses filtering based on the classifications #9842

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

nnobelis
Copy link
Member

Please have a look at the individual commit messages for the details.

@nnobelis nnobelis requested a review from a team as a code owner January 27, 2025 14:14
@nnobelis nnobelis changed the title Nnobelis/ctrlx use categorizations feat(ctrlx-reporter): Allow licenses filtering based on the classifications Jan 27, 2025
Copy link

codecov bot commented Jan 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.21%. Comparing base (fe56fb4) to head (f268a13).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #9842   +/-   ##
=========================================
  Coverage     68.21%   68.21%           
  Complexity     1293     1293           
=========================================
  Files           250      250           
  Lines          8849     8849           
  Branches        920      920           
=========================================
  Hits           6036     6036           
  Misses         2424     2424           
  Partials        389      389           
Flag Coverage Δ
funTest-docker 65.14% <ø> (ø)
funTest-non-docker 33.37% <ø> (ø)
test-ubuntu-24.04 35.98% <ø> (ø)
test-windows-2022 35.96% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -382,6 +382,10 @@ ort:
user: user
apiKey: XYZ

CtrlXAutomation:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message nits:

  • Title: "Allow license filtering based on classifications"
  • "forbids the license" (BTW, I guess this can only be true for proprietary licenses; so maybe we should explicit say "Some proprietary licenses forbid their terms to be disclosed".)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm wondering whether this should become a general reporter option that works for all report formats... any opinions here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm wondering whether this should become a general reporter option

I've had the same thought, I do remember discussion around such feature long time ago.

I do recall having some 2 step mechanism in mind

  1. A helper to extract license IDs provide a set of categories
  2. Add some allow / deny list to the reporter.

I believe there has been an additional use case, which I do not recap. Maybe someone else does.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some proprietary licenses forbid their terms to be disclosed

@sschuberth I think it's even worse than that: there are some licenses that forbid you to even mention them publicly, not just their terms.

Also, I'm wondering whether this should become a general reporter option that works for all report formats... any opinions here?

@sschuberth, @fviernau :
The problem is, I am not sure a "one size fits" all solution would work for all reporters.

For instance for CtrlX, the logic should be the following as I understand it:

  • If a component has not license, it should be with NOASSERTIONin the report
  • If all its licenses are not classified for disclosure, it should not be in the report at all.
    This second point is needed because one of our users wants to "hide" a component from the report based on its license.

If you provide a generic filtering upfront, how would you disambiguate the two cases ?

A helper to extract license IDs provide a set of categories

Additionally, some reporters already do more of less this logic: the AsciiDocTemplateReporter has a classification filtering function FreemarkerTemplateProcessor.TemplateHelper#filterForCategory. Then the template chooses which licenses/categories to show or not.
If the filtering is done upfront, aren't we losing flexibility here ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's even worse than that: there are some licenses that forbid you to even mention them publicly, not just their terms.

That's ok. I just wanted to emphasize that all this only applies to proprietary licenses, not Open Source licenses.

If a component has not license, it should be with NOASSERTIONin the report

Shouldn't it be NONE if there really are no licenses found?

If you provide a generic filtering upfront, how would you disambiguate the two cases ?

Well, that's something to think about, maybe by adding something to ReporterInput. But still I think that the filtering itself could be implemented for all reporters, like by providing an allow-list of license classifications that should be included.

If the filtering is done upfront, aren't we losing flexibility here ?

Indeed we'd lose the possibility to filter different categories for different reports in one invocation of the reporter tool. But the question is: Do we really need this flexibility, or can we consolidate logic in order to simplify things? And if some one really needs to filter licenses differently per report, that some one could always run ort report multiple times with different configuration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you provide a generic filtering upfront, how would you disambiguate the two cases ?

Well, that's something to think about, maybe by adding something to ReporterInput. But still I think that the filtering itself could be implemented for all reporters, like by providing an allow-list of license classifications that should be included.

If the filtering is done upfront, aren't we losing flexibility here ?

Indeed we'd lose the possibility to filter different categories for different reports in one invocation of the reporter tool. But the question is: Do we really need this flexibility, or can we consolidate logic in order to simplify things? And if some one really needs to filter licenses differently per report, that some one could always run ort report multiple times with different configuration.

I'm not a fan of this idea, different reports have different audiences and requirements, for example, in my opinion the web app report should always show all findings, otherwise you cannot rely on it without checking the settings used to generate it.

If there is a specific requirement right now to have such a filtering for one single reporter, why not implement it as a reporter option now instead of trying to tackle the much more complex general use case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a specific requirement right now to have such a filtering for one single reporter, why not implement it as a reporter option now instead of trying to tackle the much more complex general use case?

Because IMO we need to watch a bit for not everybody just scratching his / her own itch to quickly implement an isolated feature, fragmenting the code base with heterogeneous behavior. The way reporters do repetitive work, like applying license choices etc., already is a mess, and I don't want it to become worse.

That said, maybe we should approach this differently, with different reporter sub-classes, like SBOMs, attribution documents, and technical reports (not meant for distribution), and provide tailored ReporterInputs to all of these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, you guys need to continue the discussion without me, as I'll be on vacation. I'll dismiss my review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is basically the same but from another angle, I'd rather have an isolated feature in a single reporter than an unfinished or not well thought through implementation that affects the whole reporter. And my impression is that making this a global option requires more planning to do it right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And my impression is that making this a global option requires more planning to do it right.

Ok, maybe I've underestimated the effort.

@nnobelis nnobelis force-pushed the nnobelis/ctrlx_use_categorizations branch 2 times, most recently from b8c6008 to fec9a68 Compare January 29, 2025 09:11
@nnobelis nnobelis force-pushed the nnobelis/ctrlx_use_categorizations branch 6 times, most recently from 1f16712 to cb691c8 Compare January 30, 2025 06:16
@nnobelis nnobelis marked this pull request as draft January 30, 2025 07:20
@nnobelis
Copy link
Member Author

nnobelis commented Jan 30, 2025

I will rework this PR to follow the suggestions of @fviernau and @sschuberth .

EDIT: the changes has been made. The test error is unrelated.

@nnobelis nnobelis force-pushed the nnobelis/ctrlx_use_categorizations branch 2 times, most recently from 7e82237 to 66a45e1 Compare January 30, 2025 08:50
@nnobelis nnobelis marked this pull request as ready for review January 30, 2025 09:52
"--license-categories-to-include",
help = "The licenses of the packages have to be categorized with one of these categories, specified as " +
"comma-separated values."
).split(",")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be an option of the reporter command if it is planned that all reporters (or at least the majority) support it. But I don't think that this is the case, because reports like the WebAppReport should always show the complete data.
Also, this PR only implements this for the CtrlxReporter so I would suggest to make it a report specific option and reconsider making it a global option if it is actually needed by more reports.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you see this discussion, @mnonnenmacher?

model/src/main/kotlin/licenses/ResolvedLicenseInfo.kt Outdated Show resolved Hide resolved
@sschuberth sschuberth dismissed their stale review January 31, 2025 12:10

Unblocking this due to my vacation.

@nnobelis nnobelis force-pushed the nnobelis/ctrlx_use_categorizations branch 2 times, most recently from c2f0970 to ad3faed Compare January 31, 2025 12:31
@nnobelis
Copy link
Member Author

@mnonnenmacher

I'm not a fan of this idea, different reports have different audiences and requirements, for example, in my opinion the web app report should always show all findings, otherwise you cannot rely on it without checking the settings used to generate it.

Please note, with the current implementation, the filtering is done only if the reporter requests it, i.e. by calling filterNoCategorizedLicenses().
Therefore, even if we merge this, the WebApp report won't be affected.

@mnonnenmacher
Copy link
Member

@mnonnenmacher

I'm not a fan of this idea, different reports have different audiences and requirements, for example, in my opinion the web app report should always show all findings, otherwise you cannot rely on it without checking the settings used to generate it.

Please note, with the current implementation, the filtering is done only if the reporter requests it, i.e. by calling filterNoCategorizedLicenses(). Therefore, even if we merge this, the WebApp report won't be affected.

Yes, but the reporter option does not indicate that it has no effect on the majority of reports which will confuse users.

This new test does not only check if a report can be generated, it actually
verifies the content of the report.

Signed-off-by: Nicolas Nobelis <[email protected]>
The license terms of some proprietary licenses forbid the license to be
disclosed.
This commit adds a new optional parameter to the CtrlX reporter to specify
the categories for which the licenses are included in the report.
If a component has a license which has a category not present in this
parameter, the license is removed from the component and not visible in the
report. If a component has ALL its licenses removed this way, it is not
displayed in the report. If the parameter is not set for the reporter, all
components and all licenses are present in the report.

Signed-off-by: Nicolas Nobelis <[email protected]>
@nnobelis nnobelis force-pushed the nnobelis/ctrlx_use_categorizations branch from ad3faed to f268a13 Compare February 3, 2025 08:29
@nnobelis
Copy link
Member Author

nnobelis commented Feb 3, 2025

Reverted to a CtrlXAutomationReporter option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants