Skip to content

Improves license name matching. #1170

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

StephanPartzsch
Copy link
Contributor

  • Adds dependencies to sample project to test license name parsing
  • Improves name matching for:
    • GNU Lesser General Public License (LGPL), Version 2.1
    • Mozilla Public License Version 2.0
  • Adds some meta information for:
    • ASDKL
    • CTS
  • Adds license name info for:
    • Play Core Software Development Kit Terms of Service

Copy link
Owner

@mikepenz mikepenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the PR.

Some minor feedback.

Copy link
Owner

@mikepenz mikepenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed that the LGPL matching could wrongfully match with a different license.

@@ -208,7 +208,10 @@ enum class SpdxLicense(
Leptonica("Leptonica License", "Leptonica"),
LGPL_2_0_only("GNU Library General Public License v2 only", "LGPL-2.0-only"),
LGPL_2_0_or_later("GNU Library General Public License v2 or later", "LGPL-2.0-or-later"),
LGPL_2_1_only("GNU Lesser General Public License v2.1 only", "LGPL-2.1-only"),
LGPL_2_1_only("GNU Lesser General Public License v2.1 only", "LGPL-2.1-only", customMatcher = { name, _ ->
name.contains("GNU Lesser General Public License", true)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could accidentally match with the GNU Lesser General Public License v2.1 or later" license

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, you are absolutely right. I fixed this. Do you agree with the solution?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better to include the term only as required. as !later might not be 100% accurate all the times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get it working without any drawbacks, I decided to use a regex to match the name. It is a bit harder to read, but should work much better. The other solution would be to leave the rule as it was at the beginning and add a rule for the exact (slightly) other name. What do you think?

@StephanPartzsch StephanPartzsch requested a review from mikepenz May 13, 2025 06:53
Comment on lines 247 to 248
name.contains("Mozilla Public License", true)
&& name.contains("2.0", true)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

practically we have the same situation here as well. this would also match the MPL_2_0_no_copyleft_exception license.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have an example library which you try to fix the matching for? maybe there's an alternative rule that is more secure. like matching the url?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used regex here as well.

Do you remember the commented out implementations in build.gradle.kts? These are dependencies with the libraries and the licence names I try to match. With these you can try it out

@StephanPartzsch StephanPartzsch requested a review from mikepenz May 13, 2025 10:49
@@ -239,7 +246,14 @@ enum class SpdxLicense(
mpich2("mpich2 License", "mpich2"),
MPL_1_0("Mozilla Public License 1.0", "MPL-1.0"),
MPL_1_1("Mozilla Public License 1.1", "MPL-1.1"),
MPL_2_0("Mozilla Public License 2.0", "MPL-2.0"),
MPL_2_0("Mozilla Public License 2.0", "MPL-2.0",
customMatcher = { name, _ ->
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think matching for the url to contains mozilla.org/MPL/2.0 is probably an easier and more secure approach in that case. (also should be cheaper than constructing a regex)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://repo1.maven.org/maven2/com/github/librepdf/openpdf-parent/1.3.43/openpdf-parent-1.3.43.pom

Here the url doesn't contain mozilla.org/MPL/2.0 but the parts mozilla.org and MPL/2.0. That's why I splittet the check.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

if only all would use spdx identifiers :D

@StephanPartzsch StephanPartzsch requested a review from mikepenz May 13, 2025 11:57
@StephanPartzsch StephanPartzsch requested a review from mikepenz May 13, 2025 12:31
@mikepenz
Copy link
Owner

Thank you again @StephanPartzsch for the PR and for the patience to apply all the feedback.

@mikepenz mikepenz merged commit ac9a2a2 into mikepenz:develop May 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants