-
Notifications
You must be signed in to change notification settings - Fork 0
SBOM export support #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5e494c7 to
8c3a41b
Compare
2853f6a to
3b592f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: I think it's be nice to keep this in the export_format.rs module, since it's conceptually still a set of export formats, just limited to pip compile 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, done!
Yes, please 🙂 |
|
I'm not sure if we talk about export much in the documentation right now. The most helpful thing would be to write a section explaining the format, how you'd generate it, and then how you'd consume it. I can figure out how to fit that content into the larger documentation story. |
|
Did you explore just making this format an enum member that's banned in |
Are you meaning something like adding a runtime error when users attempt to use the SBOM format in Also, we'd then need to add error handling to reject that format option when parsing the CLI args, and then add either errors or Let me know if I'm missing something though or if you have other thoughts! |
feb3c90 to
41d8858
Compare
|
Those seem like reasonable reasons to avoid it. |
| CycloneDX1_5, | ||
| } | ||
|
|
||
| /// The format to use when compiling to a `requirements.txt` or `pylock.toml` file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be
| /// The format to use when compiling to a `requirements.txt` or `pylock.toml` file. | |
| /// The output format to use in `uv pip compile`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| Source::Git(url, _) | Source::Direct(url, _) => { | ||
| ("generic", format!("?download_url={}", url.as_ref())) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was surprised to see Git dependencies are just "generic"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for flagging - for GitHub URLs specifically there is a github purl type which we should be using here, although for other Git sources I'm not aware of anything more specific. We'll revisit this to handle that case!
| package.id.name.to_string() | ||
| } | ||
|
|
||
| /// Generate a Package URL (purl) from a package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should indicate when this will return None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
|
||
| Component { | ||
| component_type: classification, | ||
| name: NormalizedString::new(&name), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little weird that you use get_package_name to get a String then use &str here to construct a new NormalizedString owned type. I would probably omit the get_package_name and get_version_string helpers or at least avoid returning owned values since the caller needs borrowed ones anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - I've updated get_package_name to return a &str, although for get_version_string it looks like Version implements Display (here) to get a String but no way to get an &str (at least that I can see)
| } | ||
|
|
||
| #[test] | ||
| fn cyclonedx_export() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also test a direct URL, e.g., by copying a wheel URL from PyPI
594c9f4 to
2fa4cd5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add tests? maybe snapshot-tests on prepared test beds.?
Since you are crafting CycloneDX based on "untrusted" data, I'd encourage to also validate the SBOM results before returning it.
If you don't want to validate the SBOM in the production code, I would still validate them in tests, to show that you are producing valid SBOMS.
PS: I am not a rust export. but I understand that bom.validate() would run certain validations - for example that bomRefs are unique, and other schema- and plausibility-checks.
| let version_specifier = version.map_or_else(String::new, |v| format!("@{v}")); | ||
|
|
||
| Some(format!( | ||
| "pkg:{purl_type}/{name}{version_specifier}{qualifiers}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do this yourself?
there is an official PURL library to craft proper package URL strings: https://github.com/package-url/packageurl.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zanieb / @woodruffw any objections to me pulling that additional crate in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the "official" library is mere a backup/fork of another lib's source code:
https://github.com/scm-rs/packageurl.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're just constructing PURLs not parsing them, does a library have much to offer here?
I'm generally wary of more dependencies, especially for something straight-forward. I'm not sure what to think of there being a library that's just a fork of another here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my hint with the fork is jost to give a heads up to use the right library, which appears to be https://github.com/scm-rs/packageurl.rs.
anyway, PURL is not just concatenating strings the correct way. Especially when using entrusted user input, crafting a malformed PURL is just so easy.
i mean, are you even aware of all the characters that are not allowed in a PURL, and the characters that must be percent-encoded, or the characters that must NOT be percent-encoded? (there is no sanitation or handing of these cases in the current code yet)
I encourage using a dedicated library for crafting valid PURL strings - otherwise you will just implement a thing that has been solved already....
PS: see the wrong implementation in action, where a library should have done the job -- #4 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is being constructed from untrusted user input, this is being constructed from the output of a uv resolution and each of these types is validated, e.g., the package name and version specifiers are already parsed and normalized. This isn't to say we shouldn't use a package to construct the PURL, but we're not concatenating arbitrary strings here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The transitive deps from that crate don't look too bad (and are already in our transitive closure), but I also agree with @zanieb that we don't have any arbitrary inputs in this context -- we know by construction that our name and version are well-formed. I'm also similarly hesitant to bring in another sidecar dep (and I similarly find the "official" fork thing confusing -- which one is authoritative if they diverge?). Overall I'd lean towards doing it ourselves here, unless there's some hidden complexity in the small subset of purls we emit.
(We do probably need to percent-encode the values in the qualifier, however, if I'm reading the purl spec correctly.)
@jkowalleck there are already some tests here and we're planning on adding some more - is this what you were thinking of or did you have other ideas? |
crates/uv/tests/it/export.rs
Outdated
| "bom-ref": "[email protected]", | ||
| "name": "flask", | ||
| "version": "2.3.3", | ||
| "purl": "pkg:generic/[email protected]?download_url=https://github.com/pallets/flask.git?rev=2.3.3#3205b53c7cf69d17fee49cac6b84978175b7dd73" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
broken result. the download_url was not properly escaped, so that the hash and the query string are propagated to the purl ...
expected outcome: pkg:generic/[email protected]?vcs_url=https://github.com/pallets/flask.git%3Frev%3D2.3.3%233205b53c7cf69d17fee49cac6b84978175b7dd73
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Fixed
ae20eeb to
80084c1
Compare
80084c1 to
803191b
Compare
16fb577 to
4ddba58
Compare
SBOM export support - more changes
See PR here for reasoning: CycloneDX/cyclonedx-property-taxonomy#142
527a2b5 to
8647def
Compare
docs: uv export documentation
|
Closing in favour of this PR |
To do: