feat(ctrlx-reporter): Allow licenses filtering based on the classifications #9842

Merged
4 changes: 4 additions & 0 deletions model/src/main/resources/reference.yml
@@ -382,6 +382,10 @@ ort:
user: user
apiKey: XYZ

CtrlXAutomation:
Member

Commit message nits:

  • Title: "Allow license filtering based on classifications"
  • "forbids the license" (BTW, I guess this can only be true for proprietary licenses, so maybe we should explicitly say "Some proprietary licenses forbid their terms to be disclosed".)

Member

Also, I'm wondering whether this should become a general reporter option that works for all report formats... any opinions here?

Member

> Also, I'm wondering whether this should become a general reporter option

I've had the same thought; I remember a discussion around such a feature a long time ago.

I recall having a two-step mechanism in mind:

  1. A helper to extract the license IDs for a given set of categories.
  2. An allow / deny list added to the reporter.

I believe there was an additional use case, which I do not recall. Maybe someone else does.
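
The two-step mechanism described above could look roughly like the following. This is only a minimal sketch, not ORT code: `Classifications`, `licenseIdsForCategories` and `LicenseFilter` are hypothetical names, and a plain map stands in for ORT's license classifications.

```kotlin
// Hypothetical stand-in for license classifications: license ID -> categories.
typealias Classifications = Map<String, Set<String>>

// Step 1: collect the IDs of all licenses that belong to at least one of the given categories.
fun licenseIdsForCategories(classifications: Classifications, categories: Set<String>): Set<String> =
    classifications.filterValues { licenseCategories -> licenseCategories.any { it in categories } }.keys

// Step 2: a reporter-side filter built from an allow list and a deny list of license IDs.
// A null allow list means "allow everything"; the deny list always wins.
data class LicenseFilter(val allow: Set<String>? = null, val deny: Set<String> = emptySet()) {
    fun accepts(licenseId: String): Boolean =
        licenseId !in deny && (allow == null || licenseId in allow)
}
```

A reporter would first resolve the categories to license IDs and then pass them as the allow list, for example `LicenseFilter(allow = licenseIdsForCategories(classifications, setOf("include-in-disclosure-document")))`.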

Member Author

> Some proprietary licenses forbid their terms to be disclosed

@sschuberth I think it's even worse than that: there are some licenses that forbid you to even mention them publicly, not just their terms.

> Also, I'm wondering whether this should become a general reporter option that works for all report formats... any opinions here?

@sschuberth, @fviernau:
The problem is, I am not sure a "one size fits all" solution would work for all reporters.

For instance for CtrlX, the logic should be the following as I understand it:

  • If a component has no license, it should appear with NOASSERTION in the report.
  • If none of its licenses are classified for disclosure, it should not be in the report at all.
    This second point is needed because one of our users wants to "hide" a component from the report based on its license.

If you provide generic filtering upfront, how would you disambiguate the two cases?
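
To make the two CtrlX cases concrete, here is a minimal sketch of the per-component decision. It is an illustration under assumptions, not the reporter's actual code: `licensesToReport` and its parameters are hypothetical, license IDs are plain strings, and a missing license list and an empty one are treated alike.

```kotlin
const val NOASSERTION = "NOASSERTION"

// Returns the licenses to report for one component, or null if the component should be hidden.
// `licenses` is null or empty when no license is known for the component.
// `disclosable` is null when no filtering is configured.
fun licensesToReport(licenses: List<String>?, disclosable: Set<String>?): List<String>? {
    // Case 1: no license known -> keep the component and report NOASSERTION.
    if (licenses.isNullOrEmpty()) return listOf(NOASSERTION)

    // No filter configured -> report all licenses.
    if (disclosable == null) return licenses

    // Case 2: every license was filtered out -> hide the component (null).
    return licenses.filter { it in disclosable }.ifEmpty { null }
}
```

The point of the sketch is that the decision needs the unfiltered license list: once licenses have been removed upfront, "had no license" and "had only non-disclosable licenses" both look like an empty list.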

> A helper to extract the license IDs for a given set of categories

Additionally, some reporters already implement more or less this logic: the AsciiDocTemplateReporter has a classification filtering function, FreemarkerTemplateProcessor.TemplateHelper#filterForCategory. The template then chooses which licenses/categories to show.
If the filtering is done upfront, aren't we losing flexibility here?

Member

> I think it's even worse than that: there are some licenses that forbid you to even mention them publicly, not just their terms.

That's ok. I just wanted to emphasize that all this only applies to proprietary licenses, not Open Source licenses.

> If a component has no license, it should appear with NOASSERTION in the report

Shouldn't it be NONE if there really are no licenses found?

> If you provide generic filtering upfront, how would you disambiguate the two cases?

Well, that's something to think about, maybe by adding something to ReporterInput. But I still think that the filtering itself could be implemented for all reporters, for example by providing an allow-list of license classifications that should be included.

> If the filtering is done upfront, aren't we losing flexibility here?

Indeed we'd lose the possibility to filter different categories for different reports in one invocation of the reporter tool. But the question is: Do we really need this flexibility, or can we consolidate logic in order to simplify things? And if someone really needs to filter licenses differently per report, they could always run ort report multiple times with different configurations.

Member

> > If you provide generic filtering upfront, how would you disambiguate the two cases?
>
> Well, that's something to think about, maybe by adding something to ReporterInput. But I still think that the filtering itself could be implemented for all reporters, for example by providing an allow-list of license classifications that should be included.
>
> > If the filtering is done upfront, aren't we losing flexibility here?
>
> Indeed we'd lose the possibility to filter different categories for different reports in one invocation of the reporter tool. But the question is: Do we really need this flexibility, or can we consolidate logic in order to simplify things? And if someone really needs to filter licenses differently per report, they could always run ort report multiple times with different configurations.

I'm not a fan of this idea. Different reports have different audiences and requirements. For example, in my opinion the web app report should always show all findings; otherwise you cannot rely on it without checking the settings used to generate it.

If there is a specific requirement right now to have such filtering for a single reporter, why not implement it as a reporter option now instead of trying to tackle the much more complex general use case?

Member

> If there is a specific requirement right now to have such filtering for a single reporter, why not implement it as a reporter option now instead of trying to tackle the much more complex general use case?

Because IMO we need to be careful that not everybody just scratches their own itch by quickly implementing an isolated feature, fragmenting the code base with heterogeneous behavior. The way reporters do repetitive work, like applying license choices etc., already is a mess, and I don't want it to become worse.

That said, maybe we should approach this differently, with different reporter sub-classes, like SBOMs, attribution documents, and technical reports (not meant for distribution), and provide tailored ReporterInputs to all of these.

Member

Anyway, you guys need to continue the discussion without me, as I'll be on vacation. I'll dismiss my review.

Member

My concern is basically the same, but from another angle: I'd rather have an isolated feature in a single reporter than an unfinished or poorly thought-out implementation that affects the whole reporter. And my impression is that making this a global option requires more planning to do it right.

Member

> And my impression is that making this a global option requires more planning to do it right.

Ok, maybe I've underestimated the effort.

options:
licenseCategoriesToInclude: 'include-in-disclosure-document'

notifier:
mail:
hostName: 'localhost'
9 changes: 8 additions & 1 deletion model/src/test/kotlin/config/OrtConfigurationTest.kt
@@ -372,7 +372,7 @@ class OrtConfigurationTest : WordSpec({

with(ortConfig.reporter) {
config shouldNotBeNull {
keys shouldContainExactlyInAnyOrder setOf("CycloneDx", "FossId")
keys shouldContainExactlyInAnyOrder setOf("CycloneDx", "FossId", "CtrlXAutomation")

get("CycloneDx") shouldNotBeNull {
options shouldContainExactly mapOf(
@@ -390,6 +390,13 @@ class OrtConfigurationTest : WordSpec({
"apiKey" to "XYZ"
)
}

get("CtrlXAutomation") shouldNotBeNull {
options shouldContainExactly mapOf(
"licenseCategoriesToInclude" to "include-in-disclosure-document"
)
secrets should beEmpty()
}
}
}

@@ -23,15 +23,38 @@ import io.kotest.core.spec.style.StringSpec
import io.kotest.engine.spec.tempdir
import io.kotest.matchers.collections.haveSize
import io.kotest.matchers.collections.shouldBeSingleton
import io.kotest.matchers.collections.shouldHaveSize
import io.kotest.matchers.nulls.shouldNotBeNull
import io.kotest.matchers.result.shouldBeSuccess
import io.kotest.matchers.should
import io.kotest.matchers.shouldBe

import java.io.File

import kotlinx.serialization.json.decodeFromStream

import org.ossreviewtoolkit.model.AnalyzerResult
import org.ossreviewtoolkit.model.AnalyzerRun
import org.ossreviewtoolkit.model.DependencyGraph
import org.ossreviewtoolkit.model.DependencyReference
import org.ossreviewtoolkit.model.Identifier
import org.ossreviewtoolkit.model.OrtResult
import org.ossreviewtoolkit.model.Package
import org.ossreviewtoolkit.model.Project
import org.ossreviewtoolkit.model.Repository
import org.ossreviewtoolkit.model.RootDependencyIndex
import org.ossreviewtoolkit.model.Scope
import org.ossreviewtoolkit.model.VcsInfo
import org.ossreviewtoolkit.model.VcsType
import org.ossreviewtoolkit.model.licenses.LicenseCategorization
import org.ossreviewtoolkit.model.licenses.LicenseCategory
import org.ossreviewtoolkit.model.licenses.LicenseClassifications
import org.ossreviewtoolkit.plugins.reporters.ctrlx.CtrlXAutomationReporter.Companion.REPORT_FILENAME
import org.ossreviewtoolkit.reporter.ORT_RESULT
import org.ossreviewtoolkit.reporter.ReporterInput
import org.ossreviewtoolkit.utils.ort.createOrtTempDir
import org.ossreviewtoolkit.utils.spdx.SpdxSingleLicenseExpression
import org.ossreviewtoolkit.utils.spdx.toSpdx
import org.ossreviewtoolkit.utils.test.getAssetFile

class CtrlXAutomationReporterFunTest : StringSpec({
@@ -46,10 +69,121 @@ class CtrlXAutomationReporterFunTest : StringSpec({

"Generating a report works" {
val outputDir = tempdir()
val reportFiles = CtrlXAutomationReporter().generateReport(ReporterInput(ORT_RESULT), outputDir)
val reportFiles = CtrlXAutomationReporterFactory.create().generateReport(ReporterInput(ORT_RESULT), outputDir)

reportFiles.shouldBeSingleton {
it shouldBeSuccess outputDir.resolve(REPORT_FILENAME)
}
}

"Generating a report works and produces a valid fossinfo.json" {
val reporter = CtrlXAutomationReporterFactory.create()
val input = createReporterInput()
val outputDir = createOrtTempDir("ctrlx-automation-reporter-test")

val reporterResult = reporter.generateReport(input, outputDir)

validateReport(reporterResult) {
components shouldNotBeNull {
this shouldHaveSize 2
first().name shouldBe "package1"
last().name shouldBe "package2"
}
}
}

"The reporter should only include licenses with the given category" {
val category = "include-in-disclosure-document"
val categorizations = listOf(
LicenseCategorization(
SpdxSingleLicenseExpression.parse("MIT"),
setOf(category)
)
)
val categories = listOf(LicenseCategory(category))
val input = createReporterInput().copy(
licenseClassifications = LicenseClassifications(
categories = categories,
categorizations = categorizations
)
)
val reporter = CtrlXAutomationReporterFactory.create(listOf(category))
val outputDir = createOrtTempDir("ctrlx-automation-reporter-test")

val reporterResult = reporter.generateReport(input, outputDir)

validateReport(reporterResult) {
components shouldNotBeNull {
this shouldHaveSize 1
first().name shouldBe "package2"
}
}
}
})

private fun validateReport(reporterResult: List<Result<File>>, validate: FossInfo.() -> Unit) {
reporterResult.shouldBeSingleton { result ->
result shouldBeSuccess { file ->
file.name shouldBe "fossinfo.json"
val fossInfo = file.inputStream().use {
CtrlXAutomationReporter.JSON.decodeFromStream<FossInfo>(it)
}

fossInfo.validate()
}
}
}

private fun createReporterInput(): ReporterInput {
val analyzedVcs = VcsInfo(
type = VcsType.GIT,
revision = "master",
url = "https://github.com/path/first-project.git",
path = "sub/path"
)

val package1 = Package.EMPTY.copy(
id = Identifier("Maven:ns:package1:1.0"),
declaredLicenses = setOf("LicenseRef-scancode-broadcom-commercial"),
concludedLicense = "LicenseRef-scancode-broadcom-commercial".toSpdx()
)
val package2 = Package.EMPTY.copy(
id = Identifier("Maven:ns:package2:1.0"),
declaredLicenses = setOf("MIT"),
concludedLicense = "MIT".toSpdx()
)
val project = Project.EMPTY.copy(
id = Identifier.EMPTY.copy(name = "test-project"),
scopeDependencies = setOf(
Scope("scope-1", setOf(package1.toReference(), package2.toReference()))
),
vcs = analyzedVcs,
vcsProcessed = analyzedVcs
)

return ReporterInput(
OrtResult(
repository = Repository(
vcs = analyzedVcs,
vcsProcessed = analyzedVcs
),
analyzer = AnalyzerRun.EMPTY.copy(
result = AnalyzerResult(
projects = setOf(project),
packages = setOf(package1, package2),
dependencyGraphs = mapOf(
"test" to DependencyGraph(
listOf(package1.id, package2.id),
sortedSetOf(
DependencyGraph.DEPENDENCY_REFERENCE_COMPARATOR,
DependencyReference(0),
DependencyReference(1)
),
mapOf(DependencyGraph.qualifyScope(project.id, "scope-1") to listOf(RootDependencyIndex(0)))
)
)
)
)
)
)
}
@@ -32,13 +32,27 @@ import org.ossreviewtoolkit.reporter.ReporterFactory
import org.ossreviewtoolkit.reporter.ReporterInput
import org.ossreviewtoolkit.utils.spdx.SpdxConstants
import org.ossreviewtoolkit.utils.spdx.SpdxLicense
import org.ossreviewtoolkit.utils.spdx.toSpdx

data class CtrlXAutomationReporterConfig(
/**
* The categories of the licenses of the packages to include in the report. If a component has a license which has a
* category not present in this parameter, the license is removed from the component and not visible in the report.
* If a component has ALL its licenses removed this way, it is not displayed in the report. If the parameter is not
* set for the reporter, all components and all licenses are present in the report.
*/
val licenseCategoriesToInclude: List<String>?
)

@OrtPlugin(
displayName = "CtrlX Automation Reporter",
description = "A reporter for the ctrlX Automation format.",
factory = ReporterFactory::class
)
class CtrlXAutomationReporter(override val descriptor: PluginDescriptor = CtrlXAutomationReporterFactory.descriptor) :
class CtrlXAutomationReporter(
override val descriptor: PluginDescriptor = CtrlXAutomationReporterFactory.descriptor,
private val config: CtrlXAutomationReporterConfig
) :
Reporter {
companion object {
const val REPORT_FILENAME = "fossinfo.json"
@@ -54,7 +68,11 @@ class CtrlXAutomationReporter(override val descriptor: PluginDescriptor = CtrlXA

override fun generateReport(input: ReporterInput, outputDir: File): List<Result<File>> {
val packages = input.ortResult.getPackages(omitExcluded = true)
val components = packages.mapTo(mutableListOf()) { (pkg, _) ->
val licensesToInclude = config.licenseCategoriesToInclude?.flatMap {
input.licenseClassifications.licensesByCategory[it].orEmpty()
}.orEmpty()

val components = packages.mapNotNullTo(mutableListOf()) { (pkg, _) ->
val qualifiedName = when (pkg.id.type) {
// At least for NPM packages, CtrlX requires the component name to be prefixed with the scope name,
// separated with a slash. Other package managers might require similar handling, but there seems to be
@@ -73,25 +91,41 @@ class CtrlXAutomationReporter(override val descriptor: PluginDescriptor = CtrlXA
input.ortResult.getPackageLicenseChoices(pkg.id),
input.ortResult.getRepositoryLicenseChoices()
)
val licenses = effectiveLicense?.decompose()?.map {
var licenses = effectiveLicense?.decompose()?.map {
val name = it.toString()
val spdxId = SpdxLicense.forId(name)?.id
val text = input.licenseTextProvider.getLicenseText(name)
License(name = name, spdx = spdxId, text = text.orEmpty())
}

// The specification requires at least one license.
val componentLicenses = licenses.orEmpty().ifEmpty { listOf(LICENSE_NOASSERTION) }

Component(
name = qualifiedName,
version = pkg.id.version,
homepage = pkg.homepageUrl.takeUnless { it.isEmpty() },
copyright = copyrights?.let { CopyrightInformation(it) },
licenses = componentLicenses,
usage = if (pkg.isModified) Usage.Modified else Usage.AsIs
// TODO: Map the PackageLinkage to an IntegrationMechanism.
)
var componentShouldBeExcluded = false

if (config.licenseCategoriesToInclude != null) {
val filteredLicenses = licenses?.filter { it.name.toSpdx() in licensesToInclude }

if (filteredLicenses != null && filteredLicenses.isEmpty()) {
componentShouldBeExcluded = true
} else {
licenses = filteredLicenses
}
}

if (componentShouldBeExcluded) {
null
} else {
// The specification requires at least one license.
val componentLicenses = licenses.orEmpty().ifEmpty { listOf(LICENSE_NOASSERTION) }

Component(
name = qualifiedName,
version = pkg.id.version,
homepage = pkg.homepageUrl.takeUnless { it.isEmpty() },
copyright = copyrights?.let { CopyrightInformation(it) },
licenses = componentLicenses,
usage = if (pkg.isModified) Usage.Modified else Usage.AsIs
// TODO: Map the PackageLinkage to an IntegrationMechanism.
)
}
}

val reportFileResult = runCatching {