A script to generate an SPDX-format Software Bill of Materials (SBOM) for the Linux kernel build.
The eventual goal is to integrate the sbom/ directory into the linux/scripts/ directory of the official Linux source tree.
- Provide a Linux source tree and output tree, either by downloading precompiled test data from KernelSbom-TestData:
test_archive="linux.v6.17.tinyconfig.x86.tar.gz"
curl -L -o "$test_archive" "https://fileshare.tngtech.com/d/e69946da808b41f88047/files/?p=%2F$test_archive&dl=1"
tar -xzf "$test_archive"
or by cloning the Linux repo and building your own config:
git clone --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
make <config> O=kernel_build
make -j$(nproc) O=kernel_build
- Clone the repository
git clone git@github.com:TNG/KernelSbom.git
cd KernelSbom
- Run the sbom.py script
export SRCARCH=x86
python3 sbom/sbom.py \
--src-tree ../linux \
--output-tree ../linux/kernel_build \
--roots arch/x86/boot/bzImage \
--spdx \
--used-files \
--prettify-json
Starting from the provided root artifact (bzImage), the script constructs a cmd graph: a directed acyclic graph whose nodes are filenames and whose edges represent build dependencies extracted from the corresponding .<filename>.cmd files.
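The graph construction described above can be sketched roughly as follows. This is a simplified illustration, not the code in sbom/lib/sbom/: the function names `cmd_path`, `parse_cmd_file`, and `build_cmd_graph` are hypothetical, and the parser assumes the common kbuild layout in which a .cmd file contains a `source_<target> := ...` line and a `deps_<target> := \` block. Targets whose inputs appear only inside the saved command line (for example, linked artifacts) are not handled here.

```python
from pathlib import Path
from typing import Callable, Optional


def cmd_path(output_tree: Path, target: str) -> Path:
    """Map a target such as init/main.o to init/.main.o.cmd in the output tree."""
    p = Path(target)
    return output_tree / p.parent / f".{p.name}.cmd"


def parse_cmd_file(text: str) -> list[str]:
    """Return the input files named in one .cmd file (simplified assumption)."""
    deps: list[str] = []
    in_deps = False
    for line in text.splitlines():
        if line.startswith("source_"):
            # "source_<target> := <path>" names the primary source file.
            deps.append(line.split(":=", 1)[1].strip())
        elif line.startswith("deps_"):
            # "deps_<target> := \" opens a backslash-continued list.
            in_deps = True
        elif in_deps:
            entry = line.rstrip("\\").strip()
            # Skip $(wildcard ...) entries, which refer to config symbols.
            if entry and not entry.startswith("$(wildcard"):
                deps.append(entry)
            in_deps = line.endswith("\\")
    return deps


def build_cmd_graph(
    root: str, read_cmd: Callable[[str], Optional[str]]
) -> dict[str, list[str]]:
    """Walk from the root; read_cmd returns the .cmd text or None for leaves."""
    graph: dict[str, list[str]] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node in graph:
            continue  # already visited
        text = read_cmd(node)
        graph[node] = parse_cmd_file(text) if text is not None else []
        stack.extend(graph[node])
    return graph
```

Passing the reader as a callable keeps the traversal independent of the file system; a reader for a real build would open `cmd_path(output_tree, target)` and return `None` when that file does not exist, which marks the node as a leaf such as a source file.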
Using this cmd graph, the script generates three SPDX documents and writes them to disk:
- sbom-source.spdx.json - Describes all source files in the source tree that contributed to building the provided root artifacts (bzImage).
- sbom-build.spdx.json - Describes all build artifacts and the process by which they were built from the sources in sbom-source.spdx.json.
- sbom-output.spdx.json - Describes the final build outputs, i.e., the provided root artifacts.
If the --used-files flag is enabled, the script also produces sbom.used-files.txt, a flat list of all source files in sbom-source.spdx.json.
Note: If the source tree and output tree are identical, reliably distinguishing source files is not possible. In this case, the source SPDX document is merged into sbom-build.spdx.json, and sbom.used-files.txt contains all files from sbom-build.spdx.json.
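To inspect one of the generated documents, a small helper like the one below can list the file elements it contains. This is a sketch under assumptions: it presumes the documents follow the SPDX 3.0 JSON-LD layout, with elements stored under a top-level `@graph` key and file elements carrying `"type": "software_File"` and a `name` field. The helper name `list_file_names` is hypothetical and not part of this repository.

```python
import json


def list_file_names(spdx_json: str) -> list[str]:
    """Return the sorted names of all file elements in an SPDX JSON-LD string."""
    doc = json.loads(spdx_json)
    # Assumption: elements live in a top-level "@graph" array.
    elements = doc.get("@graph", [])
    return sorted(
        e["name"]
        for e in elements
        if e.get("type") == "software_File" and "name" in e
    )
```

For a real document, read the file first, e.g. `list_file_names(open("sbom-source.spdx.json").read())`, and compare the result with sbom.used-files.txt.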
To include .ko kernel modules in the provided root artifacts, you can use the helper script below to generate a roots.txt file:
echo "arch/x86/boot/bzImage" >> roots.txt
python3 sbom_analysis/module_roots.py <output_tree>/modules.order >> roots.txt
Then pass the roots file to the main script:
export SRCARCH=x86
python3 sbom/sbom.py \
--src-tree ../linux \
--output-tree ../linux/kernel_build \
--roots-file roots.txt \
--spdx \
--used-files \
--prettify-json
The following diagrams illustrate the structure of the generated SPDX documents: sbom-source.spdx.json, sbom-build.spdx.json, and sbom-output.spdx.json.
flowchart TD
%% SHARED ELEMENTS
AGENT["SoftwareAgent"]
CREATION_INFO["CreationInfo"]
CREATION_INFO -->|createdBy| AGENT
%% SPDX DOCUMENTS
subgraph SOURCE_GRAPH["sbom-source.spdx.json"]
SOURCE_DOC["SpdxDocument"]
SOURCE_SBOM["Sbom"]
SOURCE_TREE["File (src_tree)"]
MAINC["File (init/main.c)"]
GPL2ONLY_LICENSEEXPRESSION["LicenseExpression (GPL-2.0-only)"]
SOURCE_DOC -->|rootElement| SOURCE_SBOM
SOURCE_SBOM -->|rootElement| SOURCE_TREE
SOURCE_SBOM -->|element| SOURCE_TREE
SOURCE_SBOM -->|element| MAINC
SOURCE_SBOM -->|element| GPL2ONLY_LICENSEEXPRESSION
SOURCE_TREE -->|contains| MAINC
MAINC -->|hasDeclaredLicense| GPL2ONLY_LICENSEEXPRESSION
end
subgraph BUILD_GRAPH["sbom-build.spdx.json"]
BUILD_DOC["SpdxDocument"]
BUILD_SBOM["Sbom"]
OUTPUT_TREE["File (output_tree)"]
VMLINUX_BIN["File (arch/x86/boot/vmlinux.bin)"]
BZIMAGE["File (arch/x86/boot/bzImage)"]
DOTDOT["..."]
MAINC_EXTERNALMAP["ExternalMap (init/main.c)"]
RUSTLIB["File (sources outside of src tree, e.g., rustlib/src/rust/library/core/src/lib.rs)"]
BUILD_DOC -->|rootElement| BUILD_SBOM
BUILD_DOC -->|import| MAINC_EXTERNALMAP
BUILD_SBOM -->|rootElement| OUTPUT_TREE
BUILD_SBOM -->|element| OUTPUT_TREE
BUILD_SBOM -->|element| RUSTLIB
BUILD_SBOM -->|element| VMLINUX_BIN
BUILD_SBOM -->|element| BZIMAGE
OUTPUT_TREE -->|contains| VMLINUX_BIN
OUTPUT_TREE -->|contains| BZIMAGE
RUSTLIB -->|Build| DOTDOT
DOTDOT -->|Build| VMLINUX_BIN
VMLINUX_BIN -->|Build| BZIMAGE
end
MAINC -->|Build| DOTDOT
subgraph OUTPUT_GRAPH["sbom-output.spdx.json"]
OUTPUT_DOC["SpdxDocument"]
OUTPUT_SBOM["Sbom"]
PACKAGE["Package (Linux Kernel)"]
PACKAGE_LICENSEEXPRESSION["LicenseExpression (GPL-2.0 WITH Linux-syscall-note)"]
BZIMAGE_COPY["File (Copy) (arch/x86/boot/bzImage)"]
BZIMAGE_EXTERNALMAP["ExternalMap (arch/x86/boot/bzImage)"]
style BZIMAGE_COPY stroke-dasharray: 5 5
OUTPUT_DOC -->|rootElement| OUTPUT_SBOM
OUTPUT_DOC -->|import| BZIMAGE_EXTERNALMAP
OUTPUT_SBOM -->|rootElement| PACKAGE
OUTPUT_SBOM -->|element| PACKAGE
OUTPUT_SBOM -->|element| BZIMAGE
OUTPUT_SBOM -->|element| PACKAGE_LICENSEEXPRESSION
PACKAGE -->|contains| BZIMAGE
PACKAGE -->|hasDeclaredLicense| PACKAGE_LICENSEEXPRESSION
end
PACKAGE -->|originatedBy| AGENT
flowchart TD
%% SHARED ELEMENTS
AGENT["SoftwareAgent"]
CREATION_INFO["CreationInfo"]
CREATION_INFO -->|createdBy| AGENT
%% SPDX DOCUMENTS
subgraph BUILD_GRAPH["sbom-build.spdx.json"]
BUILD_DOC["SpdxDocument"]
BUILD_SBOM["Sbom"]
MAINC["File (init/main.c)"]
GPL2ONLY_LICENSEEXPRESSION["LicenseExpression (GPL-2.0-only)"]
VMLINUX_BIN["File (arch/x86/boot/vmlinux.bin)"]
BZIMAGE["File (arch/x86/boot/bzImage)"]
DOTDOT["..."]
RUSTLIB["File (sources outside of src tree, e.g., rustlib/src/rust/library/core/src/lib.rs)"]
BUILD_DOC -->|rootElement| BUILD_SBOM
BUILD_SBOM -->|rootElement| BZIMAGE
BUILD_SBOM -->|element| RUSTLIB
BUILD_SBOM -->|element| MAINC
BUILD_SBOM -->|element| GPL2ONLY_LICENSEEXPRESSION
BUILD_SBOM -->|element| VMLINUX_BIN
BUILD_SBOM -->|element| BZIMAGE
RUSTLIB -->|Build| DOTDOT
DOTDOT -->|Build| VMLINUX_BIN
VMLINUX_BIN -->|Build| BZIMAGE
MAINC -->|hasDeclaredLicense| GPL2ONLY_LICENSEEXPRESSION
end
MAINC -->|Build| DOTDOT
subgraph OUTPUT_GRAPH["sbom-output.spdx.json"]
OUTPUT_DOC["SpdxDocument"]
OUTPUT_SBOM["Sbom"]
PACKAGE["Package (Linux Kernel)"]
PACKAGE_LICENSEEXPRESSION["LicenseExpression (GPL-2.0 WITH Linux-syscall-note)"]
BZIMAGE_COPY["File (Copy) (arch/x86/boot/bzImage)"]
BZIMAGE_EXTERNALMAP["ExternalMap (arch/x86/boot/bzImage)"]
style BZIMAGE_COPY stroke-dasharray: 5 5
OUTPUT_DOC -->|rootElement| OUTPUT_SBOM
OUTPUT_DOC -->|import| BZIMAGE_EXTERNALMAP
OUTPUT_SBOM -->|rootElement| PACKAGE
OUTPUT_SBOM -->|element| PACKAGE
OUTPUT_SBOM -->|element| BZIMAGE
OUTPUT_SBOM -->|element| PACKAGE_LICENSEEXPRESSION
PACKAGE -->|contains| BZIMAGE
PACKAGE -->|hasDeclaredLicense| PACKAGE_LICENSEEXPRESSION
end
PACKAGE -->|originatedBy| AGENT
- sbom/sbom.py - The main script responsible for generating the SBOM
- sbom/lib/sbom/ - Library modules used by the main script
- sbom/lib/sbom_tests/ - Unit tests for the library modules
- sbom_analysis/ - Additional scripts for analyzing the outputs produced by the main script
  - sbom_analysis/cmd_graph_based_kernel_build/ - Validation of cmd graph completeness by rebuilding the Linux kernel using only the files referenced in the cmd graph
  - sbom_analysis/cmd_graph_visualization/ - Interactive visualization of the cmd graph
- testdata_generation/ - Describes how the precompiled kernel builds in KernelSbom-TestData were generated
The main contribution of this repository is the content of the sbom directory, which should eventually be moved into the linux/scripts/ directory of the official Linux source tree.
Create and activate a venv, then install the development dependencies:
python3 -m venv .venv
source .venv/bin/activate
pip install pre-commit reuse ruff
pre-commit install
When committing, reuse lint runs as a pre-commit hook to ensure that all files have compliant license headers.
If any file is missing a license header, it can be added using:
reuse annotate --license="GPL-2.0-only" --copyright="TNG Technology Consulting GmbH" --template default <filename>
Note: If the annotated file contains a shebang, reuse annotate will insert an empty line after it. This empty line must be removed manually.
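The manual cleanup can also be scripted. The helper below is a minimal sketch, not part of this repository (the name `drop_blank_after_shebang` is hypothetical): it removes the second line of a file's text only when the first line is a shebang and the second line is blank, and leaves everything else untouched.

```python
def drop_blank_after_shebang(text: str) -> str:
    """Remove the blank line directly after a leading shebang, if present."""
    lines = text.splitlines(keepends=True)
    if len(lines) >= 2 and lines[0].startswith("#!") and lines[1].strip() == "":
        del lines[1]
    return "".join(lines)
```

To apply it to a file, read the file's contents, pass them through the function, and write the result back; files without a shebang are returned unchanged.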