Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

race condition opportunities in writing tools and verification tags #477

Open
mikn opened this issue Mar 6, 2025 · 1 comment · May be fixed by #478
Open

race condition opportunities in writing tools and verification tags #477

mikn opened this issue Mar 6, 2025 · 1 comment · May be fixed by #478

Comments

@mikn
Copy link
Contributor

mikn commented Mar 6, 2025

Hi!

As you may know, we have created an unholy beast by wrapping twoliter in Bazel. An implementation that I am sure makes even you, perhaps ask "why".

A fun thing that Bazel does, is that if you have several inputs to a rule where the dependency graph has put them firmly parallel to each other, it will build those in parallel, whether you want Bazel to or not. (It is incredibly difficult to serialize actions in Bazel based on an engineer's opinion)

This has put me in the weird position of having found a couple of race issues in twoliter, and also suffering from their existence. Here, race issue means "if I for some stupid reason start more than one twoliter process, this will break".

The issues are two, one is writing out the tools/ folder. I understand that you do want to do this every time you run twoliter to avoid a host of "this is broken on my machine". I ended up implementing a file lock for writing it out to avoid the odd double-remove race condition.

For the second one, which is removing the verification tags for the SDKs, I ended up with a bit larger implementation - this implementation is that I am actually hashing the contents of the verification tags and comparing before removing/writing out again. I also found that in a few places you gratuitously remove the verification tags several times instead of just writing them out again, perhaps squarely to stop me from doing what I am doing. It happens as you can see in this log excerpt:

[2025-03-06T14:51:51Z WARN  twoliter__bin::project] A Release.toml file was found. Release.toml is deprecated. Please remove it from your project.
[2025-03-06T14:51:51Z DEBUG twoliter__bin::project::lock::verification] Cleaning up any existing tag files for resolved artifacts
[2025-03-06T14:51:51Z DEBUG twoliter__bin::project] Project file loaded from '/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/Twoliter.toml'
[2025-03-06T14:51:51Z DEBUG twoliter__bin::project::lock::verification] Cleaning up any existing tag files for resolved artifacts
[2025-03-06T14:51:51Z INFO  twoliter__bin::project::lock] Resolving project references to check against lock file
[2025-03-06T14:51:51Z DEBUG twoliter__bin::project::lock] Loading existing lockfile '/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/Twoliter.lock'
[2025-03-06T14:51:51Z DEBUG twoliter__bin::project::lock] Resolving kit 'bottlerocket-core-kit' image=bottlerocket-core-kit-5.4.2@public.ecr.aws/bottlerocket/bottlerocket-core-kit:v5.4.2
[2025-03-06T14:51:51Z INFO  twoliter__bin::project::lock::image] Resolving dependency image dependency 'bottlerocket-core-kit-5.4.2@public.ecr.aws/bottlerocket/bottlerocket-core-kit:v5.4.2'.
[2025-03-06T14:51:51Z DEBUG twoliter__bin::project::lock::image] Fetching image manifest. image=bottlerocket-core-kit-5.4.2@public.ecr.aws/bottlerocket/bottlerocket-core-kit:v5.4.2 uri="public.ecr.aws/bottlerocket/bottlerocket-core-kit:v5.4.2"
[2025-03-06T14:51:53Z DEBUG twoliter__bin::project::lock::image] Calculated digest for locked image 'public.ecr.aws/bottlerocket/bottlerocket-core-kit:v5.4.2': '08DMHssgSYSdDW2RPw8Oxy+8/jsNA76sCRW0kJG8XNE='
[2025-03-06T14:51:53Z DEBUG twoliter__bin::project::lock::image] Extracting kit metadata from OCI image
[2025-03-06T14:51:56Z DEBUG twoliter__bin::project::lock] Resolving kit 'bottlerocket-kernel-kit' image=bottlerocket-kernel-kit-1.0.6@public.ecr.aws/bottlerocket/bottlerocket-kernel-kit:v1.0.6
[2025-03-06T14:51:56Z INFO  twoliter__bin::project::lock::image] Resolving dependency image dependency 'bottlerocket-kernel-kit-1.0.6@public.ecr.aws/bottlerocket/bottlerocket-kernel-kit:v1.0.6'.
[2025-03-06T14:51:56Z DEBUG twoliter__bin::project::lock::image] Fetching image manifest. image=bottlerocket-kernel-kit-1.0.6@public.ecr.aws/bottlerocket/bottlerocket-kernel-kit:v1.0.6 uri="public.ecr.aws/bottlerocket/bottlerocket-kernel-kit:v1.0.6"
[2025-03-06T14:51:58Z DEBUG twoliter__bin::project::lock::image] Calculated digest for locked image 'public.ecr.aws/bottlerocket/bottlerocket-kernel-kit:v1.0.6': 'xnixe91yqOdD0iDPSwa0fKnAaCDzU6nwzw1XL+3pbyI='
[2025-03-06T14:51:58Z DEBUG twoliter__bin::project::lock::image] Extracting kit metadata from OCI image
[2025-03-06T14:52:02Z DEBUG twoliter__bin::project::lock] Resolving workspace SDK sdk_set={ProjectImage { image: Image { name: ValidIdentifier("bottlerocket-sdk"), version: Version { major: 0, minor: 50, patch: 1 }, vendor: ValidIdentifier("bottlerocket") }, vendor: Verbatim(VerbatimVendor { vendor_name: ValidIdentifier("bottlerocket"), vendor: Vendor { registry: "public.ecr.aws/bottlerocket" } }) }}
[2025-03-06T14:52:02Z DEBUG twoliter__bin::project::lock] Resolving workspace SDK sdk=ProjectImage { image: Image { name: ValidIdentifier("bottlerocket-sdk"), version: Version { major: 0, minor: 50, patch: 1 }, vendor: ValidIdentifier("bottlerocket") }, vendor: Verbatim(VerbatimVendor { vendor_name: ValidIdentifier("bottlerocket"), vendor: Vendor { registry: "public.ecr.aws/bottlerocket" } }) }
[2025-03-06T14:52:02Z INFO  twoliter__bin::project::lock::image] Resolving dependency image dependency 'bottlerocket-sdk-0.50.1@public.ecr.aws/bottlerocket/bottlerocket-sdk:v0.50.1'.
[2025-03-06T14:52:02Z DEBUG twoliter__bin::project::lock::image] Fetching image manifest. image=bottlerocket-sdk-0.50.1@public.ecr.aws/bottlerocket/bottlerocket-sdk:v0.50.1 uri="public.ecr.aws/bottlerocket/bottlerocket-sdk:v0.50.1"
[2025-03-06T14:52:04Z DEBUG twoliter__bin::project::lock::image] Calculated digest for locked image 'public.ecr.aws/bottlerocket/bottlerocket-sdk:v0.50.1': 'HEh3Lx3F6P4OEPnFubF++RMpMW2vlfp/Tc/tGjnBRcM='
[2025-03-06T14:52:04Z DEBUG twoliter__bin::project::lock] Comparing resolved lock to current lock state current_lock=Lock { schema_version: 1, sdk: LockedImage { name: ValidIdentifier("bottlerocket-sdk"), version: Version { major: 0, minor: 50, patch: 1 }, vendor: ValidIdentifier("bottlerocket"), source: "public.ecr.aws/bottlerocket/bottlerocket-sdk:v0.50.1", digest: "HEh3Lx3F6P4OEPnFubF++RMpMW2vlfp/Tc/tGjnBRcM=" }, kit: [LockedImage { name: ValidIdentifier("bottlerocket-core-kit"), version: Version { major: 5, minor: 4, patch: 2 }, vendor: ValidIdentifier("bottlerocket"), source: "public.ecr.aws/bottlerocket/bottlerocket-core-kit:v5.4.2", digest: "08DMHssgSYSdDW2RPw8Oxy+8/jsNA76sCRW0kJG8XNE=" }, LockedImage { name: ValidIdentifier("bottlerocket-kernel-kit"), version: Version { major: 1, minor: 0, patch: 6 }, vendor: ValidIdentifier("bottlerocket"), source: "public.ecr.aws/bottlerocket/bottlerocket-kernel-kit:v1.0.6", digest: "xnixe91yqOdD0iDPSwa0fKnAaCDzU6nwzw1XL+3pbyI=" }] } resolved_lock=Lock { schema_version: 1, sdk: LockedImage { name: ValidIdentifier("bottlerocket-sdk"), version: Version { major: 0, minor: 50, patch: 1 }, vendor: ValidIdentifier("bottlerocket"), source: "public.ecr.aws/bottlerocket/bottlerocket-sdk:v0.50.1", digest: "HEh3Lx3F6P4OEPnFubF++RMpMW2vlfp/Tc/tGjnBRcM=" }, kit: [LockedImage { name: ValidIdentifier("bottlerocket-core-kit"), version: Version { major: 5, minor: 4, patch: 2 }, vendor: ValidIdentifier("bottlerocket"), source: "public.ecr.aws/bottlerocket/bottlerocket-core-kit:v5.4.2", digest: "08DMHssgSYSdDW2RPw8Oxy+8/jsNA76sCRW0kJG8XNE=" }, LockedImage { name: ValidIdentifier("bottlerocket-kernel-kit"), version: Version { major: 1, minor: 0, patch: 6 }, vendor: ValidIdentifier("bottlerocket"), source: "public.ecr.aws/bottlerocket/bottlerocket-kernel-kit:v1.0.6", digest: "xnixe91yqOdD0iDPSwa0fKnAaCDzU6nwzw1XL+3pbyI=" }] }
[2025-03-06T14:52:04Z DEBUG twoliter__bin::project::lock::verification] Tag file '/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/build/external-kits/.sdk-verified' unchanged, skipping
[2025-03-06T14:52:04Z DEBUG twoliter__bin::project::lock::verification] Tag file '/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/build/external-kits/.kits-verified' unchanged, skipping
[2025-03-06T14:52:04Z DEBUG twoliter__bin::tools] Installing tools to '/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/build/tools'
[2025-03-06T14:52:04Z DEBUG twoliter__bin::tools] Installed tools to '/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/build/tools.tmp.b6cc243d-9d76-4390-a9fe-9809ce853e5f'
[2025-03-06T14:52:04Z DEBUG twoliter__bin::tools] Successfully installed tools to '/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/build/tools'
[2025-03-06T14:52:04Z DEBUG twoliter__bin::common] Running: Command { std: "cargo" "make" "--disable-check-for-updates" "--makefile" "/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/build/tools/Makefile.toml" "--cwd" "/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket" "-e=TLPRIVATE_SDK_IMAGE=public.ecr.aws/bottlerocket/bottlerocket-sdk:v0.50.1" "-e=BUILDSYS_OUTPUT_GENERATION_ID=1" "-e=TWOLITER_TOOLS_DIR=/home/mikn/.cache/bazel/_bazel_mikn/2c3f33a86f4a406b8ffa9b29662db437/execroot/_main/platform/base/bottlerocket/build/tools" "-e=BUILDSYS_ARCH=x86_64" "-e=BUILDSYS_VARIANT=metal-dev" "-e=BUILDSYS_VERSION_IMAGE=1.1.0" "-e=GO_MODULES=" "-e=BUILDSYS_UPSTREAM_SOURCE_FALLBACK=false" "build", kill_on_drop: false }
Twoliter could not validate external kits, refusing to continue

There was a third call to ::project::lock::verification originally, but this was part through my 'fix', it does however still paint a pretty good picture of it being called multiple times (removing the verification tags) a bit redundantly.

I will attach a PR that contains these fixes for your amusement. We are carrying this patch for now (also hence the binary is called twoliter__bin, an artefact of building it using Bazel).

@mikn mikn linked a pull request Mar 6, 2025 that will close this issue
@mikn
Copy link
Contributor Author

mikn commented Mar 6, 2025

Overall though, I am thoroughly surprised at how well it works to run twoliter in parallel. I would've bet money on that the docker stuff would've blown up in my face, so I am pleasantly surprised this was all I needed to do to make it work. :)

That said, in our implementation, we only build the actual images of each variant as we offload all package building to another process that builds the kit itself. I imagine starting 4 twoliter processes against the same source tree including overlapping package builds would've ended in disaster..?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant