Commit ff05dc3

Hello world
0 parents  commit ff05dc3

18 files changed

Lines changed: 18269 additions & 0 deletions
Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
name: Update License Database

on:
  schedule:
    # Run weekly on Monday at 00:00 UTC
    - cron: '0 0 * * 1'
  workflow_dispatch: # Allow manual trigger

jobs:
  update:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Download latest license database
        run: |
          curl -sL https://scancode-licensedb.aboutcode.org/index.json -o licenses.json.full

          # Check if file is valid JSON
          if ! jq empty licenses.json.full 2>/dev/null; then
            echo "Downloaded file is not valid JSON"
            exit 1
          fi

          # Check if file has reasonable content
          count=$(jq length licenses.json.full)
          if [ "$count" -lt 2500 ]; then
            echo "Downloaded file has fewer licenses than expected: $count"
            exit 1
          fi

          echo "Downloaded $count licenses"

          # Filter to OSS licenses and only needed fields
          jq '[.[] | select(.category == "Permissive" or .category == "Copyleft" or .category == "Copyleft Limited" or .category == "Public Domain" or .category == "Free Restricted" or .category == "Source-available") | {license_key, category, spdx_license_key, other_spdx_license_keys, is_exception, is_deprecated}]' licenses.json.full > licenses.json.new
          rm licenses.json.full

          filtered_count=$(jq length licenses.json.new)
          echo "Filtered to $filtered_count OSS licenses"

      - name: Check for changes
        id: diff
        run: |
          if diff -q licenses.json licenses.json.new > /dev/null 2>&1; then
            echo "changed=false" >> $GITHUB_OUTPUT
          else
            echo "changed=true" >> $GITHUB_OUTPUT
            mv licenses.json.new licenses.json

            # Count changes
            old_count=$(git show HEAD:licenses.json | jq length)
            new_count=$(jq length licenses.json)
            echo "License count: $old_count -> $new_count"
          fi

      - name: Run tests
        if: steps.diff.outputs.changed == 'true'
        run: go test ./...

      - name: Create Pull Request
        if: steps.diff.outputs.changed == 'true'
        uses: peter-evans/create-pull-request@v6
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: "Update scancode license database"
          title: "Update scancode license database"
          body: |
            Automated update of the scancode-licensedb license database.

            Source: https://scancode-licensedb.aboutcode.org/

            This PR was created automatically by the weekly update workflow.
          branch: update-license-db
          delete-branch: true
          labels: dependencies
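
The jq filter in the download step keeps only entries in six categories and projects each entry down to six fields. A rough Go equivalent is sketched below to make that logic easier to read. It assumes the downloaded index is a JSON array of objects with the field names used in the jq expression; the `License` type, `ossCategories`, and `filterOSS` are hypothetical names, not part of this commit. Unlike jq, which emits `null` for missing fields, this sketch falls back to Go zero values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// License holds only the fields the workflow's jq filter keeps.
// Fields that are missing or null in the input decode to zero values.
type License struct {
	LicenseKey           string   `json:"license_key"`
	Category             string   `json:"category"`
	SPDXLicenseKey       string   `json:"spdx_license_key"`
	OtherSPDXLicenseKeys []string `json:"other_spdx_license_keys"`
	IsException          bool     `json:"is_exception"`
	IsDeprecated         bool     `json:"is_deprecated"`
}

// ossCategories mirrors the category list in the jq select(...) clause.
var ossCategories = map[string]bool{
	"Permissive":       true,
	"Copyleft":         true,
	"Copyleft Limited": true,
	"Public Domain":    true,
	"Free Restricted":  true,
	"Source-available": true,
}

// filterOSS decodes a JSON array of license records and keeps the ones
// whose category is in ossCategories. Unknown fields are dropped on decode.
func filterOSS(data []byte) ([]License, error) {
	var all []License
	if err := json.Unmarshal(data, &all); err != nil {
		return nil, fmt.Errorf("invalid JSON: %w", err)
	}
	kept := make([]License, 0, len(all))
	for _, l := range all {
		if ossCategories[l.Category] {
			kept = append(kept, l)
		}
	}
	return kept, nil
}

func main() {
	// Made-up sample input for illustration, not real database content.
	sample := []byte(`[
		{"license_key": "mit", "category": "Permissive", "spdx_license_key": "MIT", "text": "ignored"},
		{"license_key": "example-eula", "category": "Commercial"}
	]`)
	kept, err := filterOSS(sample)
	if err != nil {
		panic(err)
	}
	for _, l := range kept {
		fmt.Println(l.LicenseKey, l.Category) // mit Permissive
	}
}
```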

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Andrew Nesbitt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
# spdx

Go library for SPDX license expression parsing, normalization, and validation.

It normalizes informal license strings from the real world (like "Apache 2" or "MIT License") to valid SPDX identifiers (like "Apache-2.0" or "MIT"). This is useful when working with package metadata from registries, where license fields often contain non-standard values.

## Installation

```bash
go get github.com/git-pkgs/spdx
```

## Usage

### Normalize informal license strings

```go
import "github.com/git-pkgs/spdx"

// Normalize converts informal strings to valid SPDX identifiers
id, err := spdx.Normalize("Apache 2")                  // "Apache-2.0"
id, err = spdx.Normalize("MIT License")                // "MIT"
id, err = spdx.Normalize("GPL v3")                     // "GPL-3.0-or-later"
id, err = spdx.Normalize("GNU General Public License") // "GPL-3.0-or-later"
id, err = spdx.Normalize("BSD 3-Clause")               // "BSD-3-Clause"
id, err = spdx.Normalize("CC BY 4.0")                  // "CC-BY-4.0"
```
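
To give a feel for what this kind of normalization involves, here is a minimal sketch based on a cleaned-up key and an alias table. It is not the library's implementation, which is not shown in this README; `aliases`, `canonicalKey`, and `normalizeSketch` are hypothetical names, and the table holds only a handful of the inputs listed above.

```go
package main

import (
	"fmt"
	"strings"
)

// aliases maps a cleaned-up informal name to an SPDX identifier.
// The entries come from the README examples; a real table would be far larger.
var aliases = map[string]string{
	"apache 2":     "Apache-2.0",
	"mit":          "MIT",
	"mit license":  "MIT",
	"m i t":        "MIT",
	"gpl v3":       "GPL-3.0-or-later",
	"bsd 3 clause": "BSD-3-Clause",
	"cc by 4 0":    "CC-BY-4.0",
}

// canonicalKey lowercases the input and turns every run of characters
// other than ASCII letters and digits into a single space.
func canonicalKey(s string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(s) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		} else {
			b.WriteRune(' ')
		}
	}
	return strings.Join(strings.Fields(b.String()), " ")
}

// normalizeSketch looks the cleaned-up key up in the alias table.
func normalizeSketch(s string) (string, error) {
	if id, ok := aliases[canonicalKey(s)]; ok {
		return id, nil
	}
	return "", fmt.Errorf("unrecognized license %q", s)
}

func main() {
	for _, in := range []string{"Apache 2", "MIT License", "M.I.T.", "BSD 3-Clause", "FAKEYLICENSE"} {
		id, err := normalizeSketch(in)
		if err != nil {
			fmt.Println(in, "->", err)
			continue
		}
		fmt.Println(in, "->", id)
	}
}
```

A lookup table alone cannot cover inputs it has never seen, which is presumably why the prior-art list mentions fuzzy matching transforms as well.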

### Parse and normalize expressions

```go
// Parse handles both strict SPDX IDs and informal license names
expr, err := spdx.Parse("MIT OR Apache-2.0")
fmt.Println(expr.String()) // "MIT OR Apache-2.0"

expr, err = spdx.Parse("Apache 2 OR MIT License")
fmt.Println(expr.String()) // "Apache-2.0 OR MIT"

expr, err = spdx.Parse("GPL v3 AND BSD 3-Clause")
fmt.Println(expr.String()) // "GPL-3.0-or-later AND BSD-3-Clause"

// Handles operator precedence (AND binds tighter than OR)
expr, err = spdx.Parse("MIT OR GPL-2.0-only AND Apache-2.0")
fmt.Println(expr.String()) // "MIT OR (GPL-2.0-only AND Apache-2.0)"

// ParseStrict requires valid SPDX IDs (no fuzzy normalization)
expr, err = spdx.ParseStrict("MIT OR Apache-2.0") // succeeds
expr, err = spdx.ParseStrict("Apache 2 OR MIT")   // fails
```
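
The precedence rule above (AND binds tighter than OR) can be illustrated with a small recursive-descent parser. This is a sketch under simplifying assumptions, not the library's parser: it handles only identifiers, `AND`, `OR`, and parentheses (no `WITH`, no `+`, no normalization), and `node`, `parser`, and `parseSketch` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// node is either a license identifier (op == "") or an AND/OR of two operands.
type node struct {
	op          string
	id          string
	left, right *node
}

// String renders the tree, parenthesizing a nested operand only when its
// operator differs from the parent's.
func (n *node) String() string {
	if n.op == "" {
		return n.id
	}
	return n.wrap(n.left) + " " + n.op + " " + n.wrap(n.right)
}

func (n *node) wrap(child *node) string {
	if child.op == "" || child.op == n.op {
		return child.String()
	}
	return "(" + child.String() + ")"
}

type parser struct {
	toks []string
	pos  int
}

// peek returns the current token, or "" at end of input.
func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

// parseBinary parses a left-associative chain: operand (op operand)*.
func (p *parser) parseBinary(op string, operand func() (*node, error)) (*node, error) {
	left, err := operand()
	if err != nil {
		return nil, err
	}
	for p.peek() == op {
		p.pos++
		right, err := operand()
		if err != nil {
			return nil, err
		}
		left = &node{op: op, left: left, right: right}
	}
	return left, nil
}

// OR is the loosest level; each OR operand is a chain of ANDs.
func (p *parser) parseOr() (*node, error)  { return p.parseBinary("OR", p.parseAnd) }
func (p *parser) parseAnd() (*node, error) { return p.parseBinary("AND", p.parseAtom) }

// parseAtom parses an identifier or a parenthesized sub-expression.
func (p *parser) parseAtom() (*node, error) {
	switch tok := p.peek(); tok {
	case "(":
		p.pos++
		inner, err := p.parseOr()
		if err != nil {
			return nil, err
		}
		if p.peek() != ")" {
			return nil, errors.New("missing closing parenthesis")
		}
		p.pos++
		return inner, nil
	case "", "AND", "OR", ")":
		return nil, fmt.Errorf("unexpected token %q at position %d", tok, p.pos)
	default:
		p.pos++
		return &node{id: tok}, nil
	}
}

// parseSketch tokenizes on whitespace and parentheses, then parses.
func parseSketch(s string) (*node, error) {
	s = strings.NewReplacer("(", " ( ", ")", " ) ").Replace(s)
	p := &parser{toks: strings.Fields(s)}
	n, err := p.parseOr()
	if err != nil {
		return nil, err
	}
	if p.pos != len(p.toks) {
		return nil, fmt.Errorf("unexpected token %q", p.peek())
	}
	return n, nil
}

func main() {
	expr, err := parseSketch("MIT OR GPL-2.0-only AND Apache-2.0")
	if err != nil {
		panic(err)
	}
	fmt.Println(expr) // MIT OR (GPL-2.0-only AND Apache-2.0)
}
```

Giving each precedence level its own function is what makes AND bind tighter: every operand of an OR is parsed as a complete AND chain before the OR is built.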

### Validate licenses

```go
// Check if a string is valid SPDX
spdx.Valid("MIT OR Apache-2.0") // true
spdx.Valid("FAKEYLICENSE")      // false

// Check if a single identifier is valid
spdx.ValidLicense("MIT")      // true
spdx.ValidLicense("Apache 2") // false (informal, not valid SPDX)

// Validate multiple licenses at once
valid, invalid := spdx.ValidateLicenses([]string{"MIT", "Apache-2.0", "FAKE"})
// valid: false, invalid: ["FAKE"]
```

### Check license compatibility

```go
// Check if allowed licenses satisfy an expression
satisfied, err := spdx.Satisfies("MIT OR Apache-2.0", []string{"MIT"})
// true

satisfied, err = spdx.Satisfies("MIT AND Apache-2.0", []string{"MIT"})
// false (both required)
```
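
The two results above follow from evaluating the expression tree against the allowed set: an OR needs one satisfied side, an AND needs both. The sketch below shows that evaluation on a hand-built tree. It is an illustration only; `expr`, `leaf`, and `satisfiedBy` are hypothetical names, and the library's own `Satisfies` may handle more cases (for example `+` or `WITH`).

```go
package main

import "fmt"

// expr is a leaf license identifier (op == "") or an AND/OR of two operands.
type expr struct {
	op          string
	id          string
	left, right *expr
}

func leaf(id string) *expr { return &expr{id: id} }

// satisfiedBy reports whether the allowed set satisfies the expression:
// a leaf must be allowed, AND needs both sides, OR needs at least one.
func satisfiedBy(e *expr, allowed map[string]bool) bool {
	switch e.op {
	case "AND":
		return satisfiedBy(e.left, allowed) && satisfiedBy(e.right, allowed)
	case "OR":
		return satisfiedBy(e.left, allowed) || satisfiedBy(e.right, allowed)
	default:
		return allowed[e.id]
	}
}

func main() {
	allowed := map[string]bool{"MIT": true}

	either := &expr{op: "OR", left: leaf("MIT"), right: leaf("Apache-2.0")}
	both := &expr{op: "AND", left: leaf("MIT"), right: leaf("Apache-2.0")}

	fmt.Println(satisfiedBy(either, allowed)) // true
	fmt.Println(satisfiedBy(both, allowed))   // false
}
```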

### Extract licenses from expressions

```go
licenses, err := spdx.ExtractLicenses("(MIT AND GPL-2.0-only) OR Apache-2.0")
// ["Apache-2.0", "GPL-2.0-only", "MIT"]
```
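
The result above is a sorted, de-duplicated list of the identifiers in the expression. For expressions that use only `AND`, `OR`, and parentheses, that can be approximated without a full parser, as in the sketch below. `extractSketch` is a hypothetical helper, not the library's function, and it performs no validation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// extractSketch returns the unique identifiers in an expression, sorted.
// It splits on whitespace, strips parentheses, and skips the operators.
func extractSketch(expression string) []string {
	seen := map[string]bool{}
	var ids []string
	for _, tok := range strings.Fields(expression) {
		tok = strings.Trim(tok, "()")
		if tok == "" || tok == "AND" || tok == "OR" || seen[tok] {
			continue
		}
		seen[tok] = true
		ids = append(ids, tok)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	fmt.Println(extractSketch("(MIT AND GPL-2.0-only) OR Apache-2.0"))
	// [Apache-2.0 GPL-2.0-only MIT]
}
```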
84+
85+
### Get license categories
86+
87+
Categories are sourced from [scancode-licensedb](https://scancode-licensedb.aboutcode.org/) (OSS licenses only) and updated weekly.
88+
89+
```go
90+
// Get the category for a license
91+
cat := spdx.LicenseCategory("MIT") // spdx.CategoryPermissive
92+
cat := spdx.LicenseCategory("GPL-3.0-only") // spdx.CategoryCopyleft
93+
cat := spdx.LicenseCategory("MPL-2.0") // spdx.CategoryCopyleftLimited
94+
cat := spdx.LicenseCategory("Unlicense") // spdx.CategoryPublicDomain
95+
96+
// Check license type
97+
spdx.IsPermissive("MIT") // true
98+
spdx.IsPermissive("GPL-3.0") // false
99+
spdx.IsCopyleft("GPL-3.0-only") // true
100+
spdx.IsCopyleft("LGPL-2.1") // true (weak copyleft)
101+
102+
// Get categories for an expression
103+
cats, err := spdx.ExpressionCategories("MIT OR GPL-3.0-only")
104+
// []Category{CategoryPermissive, CategoryCopyleft}
105+
106+
// Check expressions for copyleft
107+
spdx.HasCopyleft("MIT OR Apache-2.0") // false
108+
spdx.HasCopyleft("MIT OR GPL-3.0-only") // true
109+
spdx.IsFullyPermissive("MIT OR Apache-2.0") // true
110+
spdx.IsFullyPermissive("MIT OR GPL-3.0") // false
111+
112+
// Get detailed license info
113+
info := spdx.GetLicenseInfo("MIT")
114+
// info.Category: CategoryPermissive
115+
// info.IsException: false
116+
// info.IsDeprecated: false
117+
```
118+
119+
Available categories:
120+
- `CategoryPermissive` - MIT, Apache-2.0, BSD-*
121+
- `CategoryCopyleft` - GPL-*, AGPL-*
122+
- `CategoryCopyleftLimited` - LGPL-*, MPL-*, EPL-*
123+
- `CategoryPublicDomain` - Unlicense, CC0-1.0
124+
- `CategoryCommercial` - Commercial licenses
125+
- `CategoryProprietaryFree` - Free but proprietary
126+
- `CategorySourceAvailable` - Source-available licenses
127+
- `CategoryPatentLicense` - Patent grants
128+
- `CategoryFreeRestricted` - Free with restrictions
129+
- `CategoryCLA` - Contributor agreements
130+
- `CategoryUnstated` - No license stated
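
The expression-level checks can be read as simple folds over per-license categories: "has copyleft" is true if any license is copyleft, and "fully permissive" is true only if every license is permissive. The sketch below illustrates that reading with a tiny hard-coded table. It is an assumption about the semantics, not the library's code; `categoryOf`, `identifiers`, `hasCopyleft`, and `isFullyPermissive` are hypothetical names, weak copyleft is counted as copyleft to mirror the `IsCopyleft("LGPL-2.1")` example, and only `AND`, `OR`, and parentheses are handled.

```go
package main

import (
	"fmt"
	"strings"
)

type category string

const (
	permissive      category = "Permissive"
	copyleft        category = "Copyleft"
	copyleftLimited category = "Copyleft Limited"
)

// categoryOf is a tiny illustrative table; the real data comes from
// scancode-licensedb.
var categoryOf = map[string]category{
	"MIT":          permissive,
	"Apache-2.0":   permissive,
	"GPL-3.0-only": copyleft,
	"MPL-2.0":      copyleftLimited,
}

// identifiers splits an expression into license identifiers, dropping
// parentheses and the AND/OR operators.
func identifiers(expression string) []string {
	var ids []string
	for _, tok := range strings.Fields(expression) {
		tok = strings.Trim(tok, "()")
		if tok == "" || tok == "AND" || tok == "OR" {
			continue
		}
		ids = append(ids, tok)
	}
	return ids
}

// hasCopyleft is true if any license is strong or weak copyleft.
func hasCopyleft(expression string) bool {
	for _, id := range identifiers(expression) {
		if c := categoryOf[id]; c == copyleft || c == copyleftLimited {
			return true
		}
	}
	return false
}

// isFullyPermissive is true only if every license is known and permissive.
func isFullyPermissive(expression string) bool {
	ids := identifiers(expression)
	for _, id := range ids {
		if categoryOf[id] != permissive {
			return false
		}
	}
	return len(ids) > 0
}

func main() {
	fmt.Println(hasCopyleft("MIT OR Apache-2.0"))       // false
	fmt.Println(hasCopyleft("MIT OR GPL-3.0-only"))     // true
	fmt.Println(isFullyPermissive("MIT OR Apache-2.0")) // true
	fmt.Println(isFullyPermissive("MIT AND MPL-2.0"))   // false
}
```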

## Normalization examples

The library handles many common variations found in package registries:

| Input | Output |
|-------|--------|
| Apache 2 | Apache-2.0 |
| Apache License 2.0 | Apache-2.0 |
| Apache License, Version 2.0 | Apache-2.0 |
| MIT License | MIT |
| M.I.T. | MIT |
| GPL v3 | GPL-3.0-or-later |
| GNU General Public License v3 | GPL-3.0-or-later |
| LGPL 2.1 | LGPL-2.1-only |
| BSD 3-Clause | BSD-3-Clause |
| 3-Clause BSD | BSD-3-Clause |
| Simplified BSD | BSD-2-Clause |
| MPL 2.0 | MPL-2.0 |
| Mozilla Public License | MPL-2.0 |
| CC BY 4.0 | CC-BY-4.0 |
| Attribution-NonCommercial | CC-BY-NC-4.0 |
| Unlicense | Unlicense |
| WTFPL | WTFPL |

## Performance

Designed for processing large numbers of licenses:

```
BenchmarkNormalize-8         49116     24381 ns/op   (~5µs per license)
BenchmarkNormalizeBatch-8      372   3271336 ns/op   (~3.3µs per license at scale)
BenchmarkParse-8            236752      5263 ns/op   (includes normalization)
BenchmarkValid-8            789087      1506 ns/op   (strict validation)
```

## Prior art

This library combines approaches from several existing implementations:

- [librariesio/spdx](https://github.com/librariesio/spdx) (Ruby) - Expression parsing and case normalization
- [jslicense/spdx-correct.js](https://github.com/jslicense/spdx-correct.js) (JavaScript) - Fuzzy matching transforms and test cases
- [EmbarkStudios/spdx](https://github.com/EmbarkStudios/spdx) (Rust) - Performance-oriented design
- [github/go-spdx](https://github.com/github/go-spdx) (Go) - SPDX license list and Satisfies implementation
- [aboutcode-org/scancode-licensedb](https://github.com/aboutcode-org/scancode-licensedb) - License categories and metadata

## License

MIT
