Refactor pipeline to extract reusable basepipeline in module and library #1332
Signed-off-by: tdruez <[email protected]>
Signed-off-by: Keshav Priyadarshi <[email protected]>
The code in this branch is currently experimental and not ready for merge/production.
@@ -0,0 +1,330 @@
# SPDX-License-Identifier: Apache-2.0
The "pipeline" module name needs to be changed to something that will not conflict with other libraries when installed.
I suggest either:
- using a namespace (we never did before, but we could start). This would mean creating a top-level `aboutcode` directory with no `__init__.py`, plus a few more details TBD. This is PEP 420 https://peps.python.org/pep-0420/ and https://docs.python.org/3/glossary.html#term-namespace-package and this namespace would become reusable across other projects for shared modules. This is OK because we own aboutcode on PyPI and none of our projects ever used the `aboutcode` module name. Some examples include jaraco's repos at https://github.com/jaraco/jaraco.vcs https://github.com/jaraco/jaraco.abode and https://github.com/jaraco/jaraco.context all living in the same "jaraco" namespace.
  In this case I would suggest that we call the package pipeline under the aboutcode namespace, and end up with this path: `aboutcode/pipeline/__init__.py`
- using another name for the package directory that contains `__init__.py` and using that as a regular, non-namespaced module. For instance `aboutcode_pipeline/__init__.py`
There are no specific benefits nor issues with the second approach. Using the first approach with a namespace would be the Pythonic way.
We should name the wheel aboutcode.pipeline to make it unique.
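To illustrate the namespace option, here is a minimal sketch of how PEP 420 lets two separately installed distributions share the `aboutcode` top-level name. It is not code from this PR: the helper names `make_distribution` and `import_from_namespace`, the `NAME` attribute, and the second subpackage `otherlib` are hypothetical, and the temporary directories only stand in for two installed wheels.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_distribution(root: Path, subpackage: str) -> Path:
    """Create a fake installed distribution that provides aboutcode.<subpackage>."""
    dist_dir = root / f"dist_{subpackage}"
    package_dir = dist_dir / "aboutcode" / subpackage
    package_dir.mkdir(parents=True)
    # Only the subpackage gets an __init__.py. The aboutcode/ directory has
    # none, which is what makes it a PEP 420 namespace package.
    (package_dir / "__init__.py").write_text(f"NAME = {subpackage!r}\n")
    return dist_dir


def import_from_namespace(subpackages):
    """Build one distribution per subpackage and import them all (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        dist_dirs = [str(make_distribution(Path(tmp), name)) for name in subpackages]
        sys.path[:0] = dist_dirs
        importlib.invalidate_caches()
        try:
            modules = {
                name: importlib.import_module(f"aboutcode.{name}")
                for name in subpackages
            }
            namespace = importlib.import_module("aboutcode")
            return {
                "names": {name: mod.NAME for name, mod in modules.items()},
                # A namespace package has no __init__.py, hence no real __file__.
                "namespace_has_file": getattr(namespace, "__file__", None) is not None,
                # Each distribution contributes one directory ("portion").
                "portions": len(list(namespace.__path__)),
            }
        finally:
            # Undo the sys.path and sys.modules changes made for the demo.
            for dist_dir in dist_dirs:
                sys.path.remove(dist_dir)
            for mod_name in list(sys.modules):
                if mod_name == "aboutcode" or mod_name.startswith("aboutcode."):
                    del sys.modules[mod_name]


if __name__ == "__main__":
    print(import_from_namespace(["pipeline", "otherlib"]))
```

Both subpackages import from separate directories because neither ships an `aboutcode/__init__.py`. If any distribution added one, `aboutcode` would become a regular package and the other portions would no longer be importable.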
This PR is already too complex combining the module refactoring, the Pipeline classes refactoring, packaging logic, and new CI for release publishing. Let's split this into manageable and simpler-to-review parts:
This PR refactors the pipelines architecture to extract a reusable BasePipeline into its own module and to make it available as a standalone library.
This is going to support adopting pipelines in VulnerableCode and PurlDB.
Reference: