Merged
348 commits
fb8ed9c
ImageName and ImageURL are both optional.
guettli Sep 3, 2025
fe2f070
fix typo.
guettli Sep 3, 2025
a2e20f1
make linter happy.
guettli Sep 3, 2025
7da2aa2
make imageName really optional. Up to now min-size check failed.
guettli Sep 3, 2025
5b256c0
wip.
guettli Sep 4, 2025
0c7f639
wip.
guettli Sep 4, 2025
fede054
make linter happy.
guettli Sep 4, 2025
3c81fdc
add unit tests for getSSHKeys method
Dhairya-Arora01 Sep 4, 2025
475ae98
fixup! add unit tests for getSSHKeys method
Dhairya-Arora01 Sep 4, 2025
0785fbf
fixup! fixup! add unit tests for getSSHKeys method
Dhairya-Arora01 Sep 4, 2025
fec5ed3
do not overwrite BootState, otherwise imageUrl does not work.
guettli Sep 4, 2025
8131ae6
Still WIP, but "i am happy was reached"
guettli Sep 5, 2025
24789ed
fixup! fixup! fixup! add unit tests for getSSHKeys method
Dhairya-Arora01 Sep 5, 2025
8ff88da
fixup! fixup! fixup! fixup! add unit tests for getSSHKeys method
Dhairya-Arora01 Sep 5, 2025
baf60cc
refactor to actions done.
guettli Sep 5, 2025
060d7a1
fixup! fixup! fixup! fixup! fixup! add unit tests for getSSHKeys method
Dhairya-Arora01 Sep 5, 2025
b1a91d5
faster deploy.
guettli Sep 5, 2025
a165d45
adjust timing.
guettli Sep 5, 2025
3b92f08
fixup! fixup! fixup! fixup! fixup! fixup! add unit tests for getSSHKe…
Dhairya-Arora01 Sep 5, 2025
81c121a
differentiate between preRescueOS and rescue system via hostname.
guettli Sep 8, 2025
8d531b8
works, except that machine does start. Could be an cloud-init issue.
guettli Sep 9, 2025
a4b73b6
provide machine name to script.
guettli Sep 9, 2025
c084d5e
typo.
guettli Sep 9, 2025
7d2b5bb
add comments.
guettli Sep 9, 2025
77b008e
docs.
guettli Sep 9, 2025
de37d71
fixup! fixup! fixup! fixup! fixup! fixup! fixup! add unit tests for g…
Dhairya-Arora01 Sep 9, 2025
f34029f
fix typo.
guettli Sep 10, 2025
d23d320
Merge branch 'main' into update-controller-runtime
guettli Sep 10, 2025
465052f
removed conditions.MarkFalse(m.Machine, clusterv1.BootstrapReadyCondi…
guettli Sep 10, 2025
ee29b07
do not update caph, if oidc@ is in the kubectl context.
guettli Sep 10, 2025
fb7cf38
update check-conditions.
guettli Sep 10, 2025
0d43d71
run `make generate`
guettli Sep 10, 2025
b1eead4
update from tg/create-hcloud-machines-via-oci-node-image
guettli Sep 10, 2025
2791f0e
make install-ccm-in-wl-cluster work.
guettli Sep 10, 2025
e84e1e0
patch cilium via helm values.
guettli Sep 10, 2025
fcc8197
less lines in `make watch` output.
guettli Sep 10, 2025
601a8ac
make linter happy.
guettli Sep 10, 2025
566e94d
fix yaml linter warnings.
guettli Sep 10, 2025
c9d7a3e
SkipNameValidation
guettli Sep 10, 2025
c8451ec
show timestamp in start/end logging status.
guettli Sep 10, 2025
ed3e6be
DumpAllResources needs KubeConfigPath.
guettli Sep 10, 2025
6abc66e
log wl-cluster conditions.
guettli Sep 10, 2025
6f2b247
remove useless else.
guettli Sep 10, 2025
5471676
provide ClusterctlConfigPath
guettli Sep 10, 2025
9c06ecd
fix: input.ClusterctlConfigPath is required for WaitForClusterDeleted
guettli Sep 10, 2025
230dbd7
show IPv4 of hbmh, better visual start of logging, log less lines.
guettli Sep 11, 2025
28b4810
fix typo, add comment to most usefull makefile target.
guettli Sep 11, 2025
0e20076
show error message, if getting cloud-init output failed. Remove handl…
guettli Sep 11, 2025
e2150eb
If GetCloudInitOutput() hits "no such file or directoy", then show th…
guettli Sep 11, 2025
d11f21e
do not overwrite real error message with "status: error", when callin…
guettli Sep 11, 2025
21bb603
avoid `&&` in bash scripts. Use strict mode.
guettli Sep 11, 2025
da14efc
avoid `&&` and use bash strict mode II
guettli Sep 11, 2025
201e92e
more avoid &&
guettli Sep 11, 2025
83a0be2
more avoid '&&'
guettli Sep 11, 2025
f97a954
(file was not saved)
guettli Sep 11, 2025
fa6717e
fix: error during envsubst: Missing env var LINENO.
guettli Sep 11, 2025
6147777
skip, if `apt update` fails.
guettli Sep 11, 2025
4c27291
update ccm.
guettli Sep 11, 2025
6e4e132
hetzner issue with ubuntu mirror seems to be fixed.
guettli Sep 11, 2025
fcc3bb8
Merge remote-tracking branch 'origin/main' into tg/create-hcloud-mach…
guettli Sep 11, 2025
e85c4cf
revert changes to Makefile install-cilium-in-wl-cluster.sh (not needed)
guettli Sep 12, 2025
db6d095
revert custom capierrors package. We need to care for that, when we u…
guettli Sep 12, 2025
0e0ba08
moved changes in hack dir to new PR:
guettli Sep 12, 2025
8e0a711
:seedling: Deprecate (ssh) PortAfterCloudInit.
guettli Sep 12, 2025
90fbfb8
do not change vscode color.
guettli Sep 12, 2025
6a35625
fix tests.
guettli Sep 12, 2025
fb03843
differentiate between two errors.
guettli Sep 12, 2025
0b97bf3
fix linter warning.
guettli Sep 12, 2025
ec2a9d9
revert vscode setting change.
guettli Sep 12, 2025
e1f8429
Merge remote-tracking branch 'origin/tg/deprecate-PortAfterCloudInit'…
guettli Sep 12, 2025
7fd37fd
make linter happy.
guettli Sep 12, 2025
0696478
Merge branch 'main' into tg/refactor-hcloud-provisioning
guettli Sep 12, 2025
0e39c8f
fix broken test in hcloudmachine_validation_test.go
guettli Sep 12, 2025
7b08ec8
more explicit comment.
guettli Sep 12, 2025
a70543d
comment for SetError().
guettli Sep 12, 2025
6952f3e
from Branch tg/create-hcloud-machines-via-oci-node-image
guettli Sep 12, 2025
c389167
Merge branch 'tg/refactor-hcloud-provisioning' into tg/create-hcloud-…
guettli Sep 12, 2025
21b859d
fix errors from previous merge.
guettli Sep 12, 2025
61803c1
remove comment.
guettli Sep 12, 2025
6c0c796
pick test changes and shorter comment from Dhairya last commit (de37d…
guettli Sep 12, 2025
0a5334c
Merge remote-tracking branch 'origin/main' into tg/refactor-hcloud-pr…
guettli Sep 12, 2025
5e79c2f
Merge branch 'main' into update-controller-runtime
guettli Sep 12, 2025
ab7315c
go mod tidy.
guettli Sep 12, 2025
f398114
removed BootStateMessage.
guettli Sep 12, 2025
0c852d8
inline requeue constants.
guettli Sep 12, 2025
62049f7
use if/else
guettli Sep 12, 2025
edb005b
make linter happy.
guettli Sep 12, 2025
194c5ba
Merge branch 'main' into tg/refactor-hcloud-provisioning
guettli Sep 12, 2025
ac42995
Merge branch 'main' into update-controller-runtime
guettli Sep 12, 2025
2c21024
Merge branch 'tg/refactor-hcloud-provisioning' into tg/create-hcloud-…
guettli Sep 12, 2025
f610486
Merge branch 'main' into update-controller-runtime
guettli Sep 13, 2025
f50f854
Merge branch 'main' into tg/refactor-hcloud-provisioning
guettli Sep 13, 2025
a9b1aad
Merge branch 'tg/refactor-hcloud-provisioning' into tg/create-hcloud-…
guettli Sep 13, 2025
6485e75
add comment that a not found is fine.
guettli Sep 13, 2025
ac7e4fd
removed duplicate line, use Logger from Scope.
guettli Sep 13, 2025
24babdf
use klog.KObj (which duplicates the namespace in the output, but it i…
guettli Sep 13, 2025
f52e763
Merge branch 'tg/refactor-hcloud-provisioning' into tg/create-hcloud-…
guettli Sep 13, 2025
cc916fd
remove constants, use durations inline.
guettli Sep 13, 2025
5aa2564
fix linter typo.
guettli Sep 13, 2025
ec61302
added comment: // We found a server, now update the Status. We do th…
guettli Sep 15, 2025
c9932ef
call updateHCloudMachineStatusFromServer() again and again in each ha…
guettli Sep 15, 2025
d167a1d
call updateHCloudMachineStatusFromServer() in handleBootState functions.
guettli Sep 15, 2025
8d7a944
use var `hm`.
guettli Sep 15, 2025
eb76eb6
Merge branch 'tg/refactor-hcloud-provisioning' into tg/create-hcloud-…
guettli Sep 15, 2025
86a6c27
download newer hcloud cli tool.
guettli Sep 15, 2025
a2123e0
Merge branch 'main' into update-controller-runtime
guettli Sep 16, 2025
2f9a20a
Merge branch 'main' into tg/refactor-hcloud-provisioning
guettli Sep 16, 2025
f2d9ecc
Merge branch 'tg/refactor-hcloud-provisioning' into tg/create-hcloud-…
guettli Sep 16, 2025
81c20eb
IMAGE_INSTALL_DONE --> IMAGE_URL_DONE
guettli Sep 16, 2025
a4e24cf
Merge branch 'main' into update-controller-runtime
guettli Sep 16, 2025
0bb1dfb
fix linter error about capierrors being deprecated.
guettli Sep 16, 2025
69dc21f
Merge branch 'main' into tg/create-hcloud-machines-via-oci-node-image
guettli Sep 16, 2025
96d3326
refactored states, so that we can timeout the image-url-command.
guettli Sep 16, 2025
aecdc31
fixed typo.
guettli Sep 16, 2025
875e5c5
set warning, not info condition, when SetError gets used. Timeout of …
guettli Sep 16, 2025
d99e989
create conditons and events.
guettli Sep 16, 2025
a7da3fd
update docs (steps) of ImageURLCommand.
guettli Sep 16, 2025
2180806
docs for --hcloud-image-url-command updated.
guettli Sep 16, 2025
36374cf
fix typo.
guettli Sep 16, 2025
80a034e
added missing error handling.
guettli Sep 16, 2025
8843b38
clean up docstrings.
guettli Sep 16, 2025
bc58706
make linter happy.
guettli Sep 16, 2025
ec12eac
ensure controller-gen is in the correct version.
guettli Sep 17, 2025
1b1d7d1
better comment for SkipNameValidation.
guettli Sep 17, 2025
65afc91
use hbmh and update generated files.
guettli Sep 17, 2025
5813112
added comments to make async api of hcloud more clear.
guettli Sep 17, 2025
f064636
use Logger from scope.
guettli Sep 17, 2025
2e0a2ce
set condition, when RobotRescueSecretRef.Name is empty.
guettli Sep 17, 2025
9f5ebbd
set condition if no IP address exists.
guettli Sep 17, 2025
42f5b45
do not reboot in EnableRescueSystem().
guettli Sep 17, 2025
7546360
re-generated manifests with controller-gen.
guettli Sep 17, 2025
2e6828c
Merge branch 'main' into update-controller-runtime
guettli Sep 17, 2025
11c0eec
fix one char typo in last merge commit.
guettli Sep 17, 2025
78236eb
Merge branch 'update-controller-runtime' into tg/create-hcloud-machin…
guettli Sep 17, 2025
9ace7f7
typo in boot state.
guettli Sep 17, 2025
5c0a067
Merge remote-tracking branch 'origin/main' into tg/create-hcloud-mach…
guettli Sep 17, 2025
e78e2b2
do pre-flight check, to avoid using hcloud-api, when config for provi…
guettli Sep 18, 2025
1eaa292
avoid term "hcloud" in sshClient (image-url-command will be used for …
guettli Sep 18, 2025
1645895
make imageURLCommand private.
guettli Sep 18, 2025
ecd039f
make linter happy.
guettli Sep 18, 2025
25cdb55
watch output: show ip.
guettli Sep 18, 2025
e85b509
fix bug, started test for new way.
guettli Sep 18, 2025
06c1b89
test are fine, but not finished yet.
guettli Sep 18, 2025
33a3e51
test fine, but not finished.
guettli Sep 18, 2025
c7edcc7
test works, but not finished.
guettli Sep 18, 2025
f54285e
test fine and finished.
guettli Sep 18, 2025
9419a5c
fixed test.
guettli Sep 18, 2025
45715ff
FIt() to It().
guettli Sep 18, 2025
2582424
removed not needed code.
guettli Sep 19, 2025
2d60a24
aligned docs to actual states (if imageURL)
guettli Sep 19, 2025
19e98ee
typo in err msg.
guettli Sep 19, 2025
531c13e
"waiting for rescue system to be enabled"
guettli Sep 19, 2025
60297af
removed changes to one func, add comment to next.
guettli Sep 19, 2025
34c4247
use shorter names for BootStates.
guettli Sep 19, 2025
edcaf4e
:seedling: Provision baremetal via --baremetal-image-url-command
guettli Sep 19, 2025
7f4612d
create condition, if getting server image failed.
guettli Sep 19, 2025
2f7c035
reboot to rescue started
guettli Sep 19, 2025
14e63d7
remove todo.
guettli Sep 19, 2025
102322d
better docu for second reboot, if first failed.
guettli Sep 19, 2025
c5d1a9a
this can happen if old caph gets active again.
guettli Sep 19, 2025
29e3eb9
simplify error handling which should not happen anymore.
guettli Sep 19, 2025
95ceaa4
check if reboot was lost.
guettli Sep 22, 2025
4b5324c
reboot timeout: Use action finished timestamp to calculate timeout.
guettli Sep 22, 2025
f1c993e
round duration to seconds.
guettli Sep 22, 2025
61534bd
reboot via ssh. Avoid hcloud api calls (rate-limit)
guettli Sep 22, 2025
0af85b6
better err msg: avoid strange connection refused message.
guettli Sep 22, 2025
450e237
remove typo.
guettli Sep 22, 2025
5a960a2
more space.
guettli Sep 22, 2025
16068da
renamed failed state
guettli Sep 22, 2025
f4b344c
more new lines.
guettli Sep 22, 2025
c41d455
added hint that this error should never happen.
guettli Sep 23, 2025
cef0d20
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Sep 23, 2025
6c03dcf
one typo. two "make linter happy"
guettli Sep 23, 2025
7ae6688
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Sep 23, 2025
e6abd0d
remove usage of `portAfterCloudInit` (deprecated.)
guettli Sep 23, 2025
310e44a
coding done. Not tested yet.
guettli Sep 23, 2025
32621b1
unit-tests work.
guettli Sep 23, 2025
34f93c4
use ...Once() instead of using ...Unset().
guettli Sep 24, 2025
03cd1ae
add timeout to every state.
guettli Sep 24, 2025
c5ff2a8
use `err` not `out.Err`, to avoid typos.
guettli Sep 24, 2025
9623927
unhappy path should use logger.Error()
guettli Sep 24, 2025
c0be565
syscall.ECONNREFUSED should be enough.
guettli Sep 24, 2025
96b85d0
use syscall.ECONNREFUSED, so that docs and LLM answers work.
guettli Sep 24, 2025
d8a1727
show logFile, if command timesout.
guettli Sep 24, 2025
4192b75
added docs.
guettli Sep 24, 2025
9b16a06
use Warn() not Warnf() if no string formatting gets done.
guettli Sep 24, 2025
cdc6981
boilerplate header for new python script.
guettli Sep 24, 2025
05fa486
make generate.
guettli Sep 24, 2025
e30dd0f
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Sep 24, 2025
cfa7810
fix typo.
guettli Sep 24, 2025
0470779
wait 10s in ImageURLCommandStateRunning, not 5s
guettli Sep 24, 2025
8c2bf5f
removed not needed logging.
guettli Sep 24, 2025
5bc698a
use actionFailed, not error.
guettli Sep 24, 2025
0bf9dca
check ImageURLCommand before using it.
guettli Sep 24, 2025
91b51a7
change interface to command: devices are needed ("sda sdb", if severa…
guettli Sep 24, 2025
a10f410
docs.
guettli Sep 24, 2025
48875b5
Hostname is needed, not Name of hbmh.
guettli Sep 25, 2025
40297cb
use ProvisionSucceededCondition
guettli Sep 25, 2025
65427c8
avoid conflict errors by waiting for write to be synced to the local …
guettli Sep 25, 2025
3d6d801
remove: retry reboot, if first has failed.
guettli Sep 26, 2025
43acf0a
removed extra check, which is unlikely to happen.
guettli Sep 26, 2025
1a0734d
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Sep 26, 2025
6b47487
removed condition
guettli Sep 26, 2025
94a50f4
Merge branch 'main' into tg/create-hcloud-machines-via-oci-node-image
guettli Sep 29, 2025
2f63f14
add Status.ExternalIDs
guettli Sep 29, 2025
fe162a9
make linter happy.
guettli Sep 29, 2025
a4de40f
use specific reasons, not string(hm.Status.BootState)
guettli Sep 29, 2025
4cdbf1e
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Sep 30, 2025
2531479
do not use string(ProvisioningState) for Reason of condition.
guettli Sep 30, 2025
19f4b03
round timeSinceReboot to seconds.
guettli Sep 30, 2025
3a533a1
use TimedOut, not Timedout
guettli Sep 30, 2025
254229e
changed Reason of condition.
guettli Sep 30, 2025
573b28e
removed condition which usually does not happen.
guettli Sep 30, 2025
38ee92e
wording.
guettli Sep 30, 2025
398eabe
removed RebootViaSSH timestamp.
guettli Sep 30, 2025
94eed28
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Sep 30, 2025
7e58a04
pull down changes from bm-PR.
guettli Sep 30, 2025
7c22e5d
pull down changes from bm-PR
guettli Sep 30, 2025
a31cd4b
adapt Mock.
guettli Sep 30, 2025
2733c5a
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Sep 30, 2025
edcf7b2
removed conditions.
guettli Sep 30, 2025
ea0f0b4
add comment.
guettli Sep 30, 2025
652a1a0
fix test: It("transitions to BootStateOperatingSystemRunning (imageURL)"
guettli Sep 30, 2025
1143fe4
remove comment.
guettli Sep 30, 2025
72f8508
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Sep 30, 2025
2392f2b
avoid duplicate text in condition.
guettli Oct 1, 2025
c929bd3
no requeue if image-url-command is missing.
guettli Oct 1, 2025
069cb32
extra handling of rate-limit exceeded.
guettli Oct 1, 2025
6d830d4
reduce RequeueAfter to 1min
guettli Oct 1, 2025
a10817f
added comment.
guettli Oct 1, 2025
f2e8dd8
fail if command is empty.
guettli Oct 1, 2025
054b858
shorter condition Reasons.
guettli Oct 1, 2025
ad01d55
added empty lines.
guettli Oct 1, 2025
958ae9e
check robot ssh in pre-flight-check.
guettli Oct 1, 2025
8293d4e
extracted getSSHPrivateKey(), use it in pre-flight-check.
guettli Oct 1, 2025
5f141b1
move pre-flight check to "initializing"
guettli Oct 1, 2025
69caa8e
add comments the hanlde funcs.
guettli Oct 1, 2025
f568dfc
Merge remote-tracking branch 'origin/main' into tg/create-hcloud-mach…
guettli Oct 1, 2025
41ae6ac
add comment.
guettli Oct 1, 2025
6cb3266
Merge branch 'main' into tg/create-hcloud-machines-via-oci-node-image
guettli Oct 1, 2025
42e9a48
Merge branch 'tg/create-hcloud-machines-via-oci-node-image' into tg/p…
guettli Oct 1, 2025
1ab7ff8
extraced IsLocalCacheUpToDate(), and write unit-test for it.
guettli Oct 1, 2025
274ad18
TimedOut
guettli Oct 1, 2025
eec07ae
more fixes.
guettli Oct 1, 2025
0d8cabf
added comment
guettli Oct 1, 2025
4b9e88b
typo
guettli Oct 1, 2025
cf6b9a3
Merge branch 'main' into tg/provision-bm-via-image-url-command
guettli Oct 6, 2025
d394cce
revert changes to hcloud/server/server.go (from merge request)
guettli Oct 6, 2025
6b4105f
fit() --> it().
guettli Oct 6, 2025
c0abd7d
feedback from PR review.
guettli Oct 9, 2025
8 changes: 8 additions & 0 deletions api/v1beta1/hetznerbaremetalmachine_types.go
@@ -187,6 +187,11 @@ type Image struct {
// URL defines the remote URL for downloading a tar, tar.gz, tar.bz, tar.bz2, tar.xz, tgz, tbz, txz image.
URL string `json:"url,omitempty"`

// UseCustomImageURLCommand makes the controller use the command provided by `--baremetal-image-url-command` instead of installimage.
// Docs: https://syself.com/docs/caph/developers/image-url-command
// +optional
UseCustomImageURLCommand bool `json:"useCustomImageURLCommand"`

// Name defines the archive name after download. This has to be a valid name for Installimage.
Name string `json:"name,omitempty"`
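
The new field carries the `+optional` marker but its JSON tag has no `omitempty`, unlike `url` and `name`. A minimal sketch of what that implies for serialization with `encoding/json`; the `image` struct below is a trimmed, hypothetical stand-in for the real `Image` type, and `marshalImage` is an illustrative helper that does not exist in the repository:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// image is a trimmed stand-in for the Image type in the diff (assumption:
// only the three fields visible in the hunk are modelled here).
type image struct {
	URL                      string `json:"url,omitempty"`
	UseCustomImageURLCommand bool   `json:"useCustomImageURLCommand"`
	Name                     string `json:"name,omitempty"`
}

// marshalImage returns the JSON encoding of img as a string.
func marshalImage(img image) string {
	b, err := json.Marshal(img)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// The zero value still emits the bool, because the tag lacks omitempty.
	fmt.Println(marshalImage(image{}))
	// → {"useCustomImageURLCommand":false}

	fmt.Println(marshalImage(image{
		URL:                      "oci://example.com/yourimage:v1",
		UseCustomImageURLCommand: true,
	}))
	// → {"url":"oci://example.com/yourimage:v1","useCustomImageURLCommand":true}
}
```

Whether always emitting `false` is intended here is not stated in the diff; adding `omitempty` would be the usual way to keep the field out of serialized objects that do not use it.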

@@ -197,6 +202,9 @@ type Image struct {
// GetDetails returns the path of the image and whether the image has to be downloaded.
func (image Image) GetDetails() (imagePath string, needsDownload bool, errorMessage string) {
// If image is set, then the URL is also set and we have to download a remote file
if image.UseCustomImageURLCommand {
return "", false, "internal error: image.UseCustomImageURLCommand is active. Method GetDetails() should be used for the traditional way (without image-url-command)."
}
switch {
case image.Name != "" && image.URL != "":
suffix, err := GetImageSuffix(image.URL)
@@ -396,6 +396,11 @@ spec:
a tar, tar.gz, tar.bz, tar.bz2, tar.xz, tgz, tbz, txz
image.
type: string
useCustomImageURLCommand:
description: |-
UseCustomImageURLCommand makes the controller use the command provided by `--baremetal-image-url-command` instead of installimage.
Docs: https://syself.com/docs/caph/developers/image-url-command
type: boolean
type: object
logicalVolumeDefinitions:
description: LVMDefinitions defines the logical volume definitions
@@ -155,6 +155,11 @@ spec:
description: URL defines the remote URL for downloading a
tar, tar.gz, tar.bz, tar.bz2, tar.xz, tgz, tbz, txz image.
type: string
useCustomImageURLCommand:
description: |-
UseCustomImageURLCommand makes the controller use the command provided by `--baremetal-image-url-command` instead of installimage.
Docs: https://syself.com/docs/caph/developers/image-url-command
type: boolean
type: object
logicalVolumeDefinitions:
description: LVMDefinitions defines the logical volume definitions
@@ -142,6 +142,11 @@ spec:
a tar, tar.gz, tar.bz, tar.bz2, tar.xz, tgz, tbz,
txz image.
type: string
useCustomImageURLCommand:
description: |-
UseCustomImageURLCommand makes the controller use the command provided by `--baremetal-image-url-command` instead of installimage.
Docs: https://syself.com/docs/caph/developers/image-url-command
type: boolean
type: object
logicalVolumeDefinitions:
description: LVMDefinitions defines the logical volume
57 changes: 57 additions & 0 deletions controllers/hetznerbaremetalhost_controller.go
@@ -23,9 +23,11 @@ import (
"reflect"
"time"

"github.com/google/go-cmp/cmp"
corev1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/klog/v2"
clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
"sigs.k8s.io/cluster-api/util"
@@ -47,6 +49,7 @@ import (
robotclient "github.com/syself/cluster-api-provider-hetzner/pkg/services/baremetal/client/robot"
sshclient "github.com/syself/cluster-api-provider-hetzner/pkg/services/baremetal/client/ssh"
"github.com/syself/cluster-api-provider-hetzner/pkg/services/baremetal/host"
"github.com/syself/cluster-api-provider-hetzner/pkg/utils"
)

// HetznerBareMetalHostReconciler reconciles a HetznerBareMetalHost object.
@@ -59,6 +62,7 @@ type HetznerBareMetalHostReconciler struct {
WatchFilterValue string
PreProvisionCommand string
SSHAfterInstallImage bool
ImageURLCommand string
}

//+kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=hetznerbaremetalhosts,verbs=get;list;watch;create;update;patch;delete
@@ -88,6 +92,58 @@ func (r *HetznerBareMetalHostReconciler) Reconcile(ctx context.Context, req ctrl
return reconcile.Result{}, err
}

// ----------------------------------------------------------------
// Start: avoid conflict errors. Wait until local cache is up-to-date
// Won't be needed once this was implemented:
// https://github.com/kubernetes-sigs/controller-runtime/issues/3320
initialHost := bmHost.DeepCopy()
defer func() {
// We can potentially optimize this further by ensuring that the cache is up to date only in
// the cases where an outdated cache would lead to problems. Currently, we ensure that the
// cache is up to date in all cases, i.e. for all possible changes to the
// HetznerBareMetalHost object.
if cmp.Equal(initialHost, bmHost) {
// Nothing has changed. No need to wait.
return
}
startReadOwnWrite := time.Now()

// The object changed. Wait until the new version is in the local cache

// Get the latest version from the apiserver.
apiserverHost := &infrav1.HetznerBareMetalHost{}

// Use uncached APIReader
err := r.APIReader.Get(ctx, client.ObjectKeyFromObject(bmHost), apiserverHost)
if err != nil {
reterr = errors.Join(reterr,
fmt.Errorf("failed to get HetznerBareMetalHost via uncached APIReader: %w", err))
return
}

apiserverRV := apiserverHost.ResourceVersion

err = wait.PollUntilContextTimeout(ctx, 100*time.Millisecond, 3*time.Second, true, func(ctx context.Context) (done bool, err error) {
// new resource, read from local cache
latestFromLocalCache := &infrav1.HetznerBareMetalHost{}
getErr := r.Get(ctx, client.ObjectKeyFromObject(apiserverHost), latestFromLocalCache)
if apierrors.IsNotFound(getErr) {
// the object was deleted. All is fine.
return true, nil
}
if getErr != nil {
return false, getErr
}
return utils.IsLocalCacheUpToDate(latestFromLocalCache.ResourceVersion, apiserverRV), nil
})
if err != nil {
log.Error(err, "cache sync failed after BootState change")
}
log.Info("Waited for update to reach local cache", "durationWaitForLocalCacheSync", time.Since(startReadOwnWrite).Round(time.Millisecond))
}()
// End: avoid conflict errors. Wait until local cache is up-to-date
// ----------------------------------------------------------------

initialProvisioningState := bmHost.Spec.Status.ProvisioningState
defer func() {
if initialProvisioningState != bmHost.Spec.Status.ProvisioningState {
@@ -203,6 +259,7 @@ func (r *HetznerBareMetalHostReconciler) Reconcile(ctx context.Context, req ctrl
RescueSSHSecret: rescueSSHSecret,
SecretManager: secretManager,
PreProvisionCommand: r.PreProvisionCommand,
ImageURLCommand: r.ImageURLCommand,
SSHAfterInstallImage: r.SSHAfterInstallImage,
})
if err != nil {
Expand Down
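
The deferred block in `Reconcile` relies on `utils.IsLocalCacheUpToDate`, whose body is not part of this diff. The following is a minimal sketch of how such a check could work, not the implementation in `pkg/utils`. It assumes resource versions are decimal integers that grow over time, which holds for etcd-backed API servers but is formally an opaque value in the Kubernetes API; the name `isLocalCacheUpToDate` and its fallback behaviour are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strconv"
)

// isLocalCacheUpToDate reports whether the resource version seen in the local
// cache is at least as new as the one read from the API server.
// Hypothetical stand-in for utils.IsLocalCacheUpToDate.
func isLocalCacheUpToDate(localRV, apiserverRV string) bool {
	// Identical strings are always considered up to date, even if not numeric.
	if localRV == apiserverRV {
		return true
	}
	local, err := strconv.ParseUint(localRV, 10, 64)
	if err != nil {
		// Not numeric: fall back to equality only, which already failed above.
		return false
	}
	remote, err := strconv.ParseUint(apiserverRV, 10, 64)
	if err != nil {
		return false
	}
	// The cache may already be ahead if another writer updated the object.
	return local >= remote
}

func main() {
	fmt.Println(isLocalCacheUpToDate("100", "101")) // cache is behind
	fmt.Println(isLocalCacheUpToDate("101", "101")) // cache caught up
	fmt.Println(isLocalCacheUpToDate("102", "101")) // cache is ahead
}
```

Returning `false` for unparsable values means the poll in the deferred function would simply run into its 3 second timeout and log an error, which matches the non-fatal handling shown in the hunk.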
24 changes: 16 additions & 8 deletions docs/caph/04-developers/06-image-url-command.md
@@ -5,21 +5,29 @@ sidebar: image-url-command
description: Documentation on the CAPH image-url-command
---

The `--hcloud-image-url-command` for the caph controller can be used to execute a custom command to
install the node image.
The `--hcloud-image-url-command` and `--baremetal-image-url-command` for the caph controller can be
used to execute a custom command to install the node image.

This provides you a flexible way to create nodes.

The script/binary will be copied into the Hetzner Rescue System and executed.
The script/binary will be copied into the rescue system and executed.

You need to enable two things:

* The caph binary must get argument. Example:
`--hcloud-image-url-command=/shared/image-url-command.sh`
* The hcloudmachine resource must have spec.imageURL set (usually via a hcloudmachinetemplate)
`--[hcloud|baremetal]-image-url-command=/shared/image-url-command.sh`
* for hcloud: The hcloudmachine resource must have spec.imageURL set (usually via a
hcloudmachinetemplate)
* for baremetal: The hetznerbaremetalmachine resource must set `useCustomImageURLCommand: true`.

The command will get the imageURL, bootstrap-data and machine-name of the corresponding
hcloudmachine as argument.
The command will get the imageURL, bootstrap-data, machine-name of the corresponding
machine and the root devices (separated by spaces) as arguments.

Example:

```bash
/root/image-url-command oci://example.com/yourimage:v1 /root/bootstrap.data my-md-bm-kh57r-5z2v8-zdfc9 'sda sdb'
```

It is up to the command to download from that URL and provision the disk accordingly. This command
must be accessible by the controller pod. You can use an initContainer to copy the command to a
@@ -36,7 +44,7 @@ A Kubernetes event will be created in both (success, failure) cases containing t
and stderr) of the script. If the script takes longer than 7 minutes, the controller cancels the
provisioning.
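
The argument contract described above can be sketched as a small Go program. This is an illustration only: `commandArgs` and `parseArgs` are hypothetical names, the download and disk-writing steps are left out, and the final `IMAGE_URL_DONE` line follows the earlier flag help text, which required the last output line to contain that marker.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// commandArgs holds the four positional arguments the controller passes.
type commandArgs struct {
	ImageURL          string
	BootstrapDataPath string
	MachineName       string
	Devices           []string
}

// parseArgs validates the positional arguments (without the program name).
func parseArgs(args []string) (commandArgs, error) {
	if len(args) != 4 {
		return commandArgs{}, fmt.Errorf("expected 4 arguments, got %d", len(args))
	}
	// Root devices arrive as one space-separated argument, e.g. "sda sdb".
	devices := strings.Fields(args[3])
	if len(devices) == 0 {
		return commandArgs{}, errors.New("no root devices given")
	}
	return commandArgs{
		ImageURL:          args[0],
		BootstrapDataPath: args[1],
		MachineName:       args[2],
		Devices:           devices,
	}, nil
}

func main() {
	// Uses the example invocation from the docs; a real command would read os.Args[1:].
	parsed, err := parseArgs([]string{
		"oci://example.com/yourimage:v1",
		"/root/bootstrap.data",
		"my-md-bm-kh57r-5z2v8-zdfc9",
		"sda sdb",
	})
	if err != nil {
		fmt.Println("invalid arguments:", err)
		return
	}
	// Downloading the image and provisioning the disks is deliberately omitted.
	fmt.Printf("would provision %s on %v from %s\n", parsed.MachineName, parsed.Devices, parsed.ImageURL)
	// Marker line expected as the last line of output (assumption, see lead-in).
	fmt.Println("IMAGE_URL_DONE")
}
```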

We measured these durations:
We measured these durations for hcloud:

| oldState | newState | avg(s) | min(s) | max(s) |
|----------|----------|-------:|-------:|-------:|
33 changes: 26 additions & 7 deletions main.go
@@ -85,7 +85,8 @@ var (
syncPeriod time.Duration
rateLimitWaitTime time.Duration
preProvisionCommand string
imageURLCommand string
hcloudImageURLCommand string
baremetalImageURLCommand string
skipWebhooks bool
sshAfterInstallImage bool
)
@@ -108,7 +109,8 @@ func main() {
fs.DurationVar(&rateLimitWaitTime, "rate-limit", 5*time.Minute, "The rate limiting for HCloud controller (e.g. 5m)")
fs.BoolVar(&hcloudclient.DebugAPICalls, "debug-hcloud-api-calls", false, "Debug all calls to the hcloud API.")
fs.StringVar(&preProvisionCommand, "pre-provision-command", "", "Command to run (in rescue-system) before installing the image on bare metal servers. You can use that to check if the machine is healthy before installing the image. If the exit value is non-zero, the machine is considered unhealthy. This command must be accessible by the controller pod. You can use an initContainer to copy the command to a shared emptyDir.")
fs.StringVar(&imageURLCommand, "hcloud-image-url-command", "", "Command to run (in rescue-system) to provision an hcloud machine. The command will get the imageURL, bootstrap-data and machine-name of the corresponding hcloudmachine as argument. It is up to the command to download from that URL and provision the disk accordingly. This command must be accessible by the controller pod. You can use an initContainer to copy the command to a shared emptyDir. The env var OCI_REGISTRY_AUTH_TOKEN from the caph process will be set for the command, too. The command must end with the last line containing IMAGE_URL_DONE. Otherwise the execution is considered to have failed. Docs: https://syself.com/docs/caph/developers/image-url-command")
fs.StringVar(&hcloudImageURLCommand, "hcloud-image-url-command", "", "Command to run (in rescue-system) to provision an hcloud machine. Docs: https://syself.com/docs/caph/developers/image-url-command")
fs.StringVar(&baremetalImageURLCommand, "baremetal-image-url-command", "", "Command to run (in rescue-system) to provision a baremetal machine. Docs: https://syself.com/docs/caph/developers/image-url-command")
fs.BoolVar(&skipWebhooks, "skip-webhooks", false, "Skip setting up of webhooks. Together with --leader-elect=false, you can use `go run main.go` to run CAPH in a cluster connected via KUBECONFIG. You should scale down the caph deployment to 0 before doing that. This is only for testing!")
fs.BoolVar(&sshAfterInstallImage, "baremetal-ssh-after-install-image", true, "Connect to the baremetal machine after install-image and ensure it is provisioned. Current default is true, but we might change that to false. Background: Users might not want the controller to be able to ssh onto the servers")

@@ -133,22 +135,38 @@ func main() {
}
}

// If ImageURLCommand is set, check if the file exists and validate the basename.
if imageURLCommand != "" {
baseName := filepath.Base(imageURLCommand)
// If hcloudImageURLCommand is set, check if the file exists and validate the basename.
if hcloudImageURLCommand != "" {
baseName := filepath.Base(hcloudImageURLCommand)
if !commandRegex.MatchString(baseName) {
msg := fmt.Sprintf("basename (%s) must match the regex %s", baseName, commandRegex.String())
setupLog.Error(errors.New(msg), "")
os.Exit(1)
}

_, err := os.Stat(imageURLCommand)
_, err := os.Stat(hcloudImageURLCommand)
if err != nil {
setupLog.Error(err, "hcloud-image-url-command not found")
os.Exit(1)
}
}

// If baremetalImageURLCommand is set, check if the file exists and validate the basename.
if baremetalImageURLCommand != "" {
baseName := filepath.Base(baremetalImageURLCommand)
if !commandRegex.MatchString(baseName) {
msg := fmt.Sprintf("basename (%s) must match the regex %s", baseName, commandRegex.String())
setupLog.Error(errors.New(msg), "")
os.Exit(1)
}

_, err := os.Stat(baremetalImageURLCommand)
if err != nil {
setupLog.Error(err, "baremetal-image-url-command not found")
os.Exit(1)
}
}

var watchNamespaces map[string]cache.Config
if watchNamespace != "" {
watchNamespaces = map[string]cache.Config{
@@ -215,7 +233,7 @@ func main() {
HCloudClientFactory: hcloudClientFactory,
SSHClientFactory: sshclient.NewFactory(),
WatchFilterValue: watchFilterValue,
ImageURLCommand: imageURLCommand,
ImageURLCommand: hcloudImageURLCommand,
}).SetupWithManager(ctx, mgr, controller.Options{MaxConcurrentReconciles: hcloudMachineConcurrency}); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "HCloudMachine")
os.Exit(1)
@@ -240,6 +258,7 @@ func main() {
RateLimitWaitTime: rateLimitWaitTime,
WatchFilterValue: watchFilterValue,
PreProvisionCommand: preProvisionCommand,
ImageURLCommand: baremetalImageURLCommand,
SSHAfterInstallImage: sshAfterInstallImage,
}).SetupWithManager(ctx, mgr, controller.Options{MaxConcurrentReconciles: hetznerBareMetalHostConcurrency}); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "HetznerBareMetalHost")
Expand Down
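
The two validation blocks added to `main.go` for `hcloudImageURLCommand` and `baremetalImageURLCommand` are identical apart from the flag name. A possible way to share that logic is sketched below. `validateCommandFlag` is a hypothetical helper, and `exampleCommandRegex` is an assumed pattern, since the real `commandRegex` is defined outside the hunks shown here:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
)

// exampleCommandRegex is an assumption for illustration; the real commandRegex
// in main.go is not visible in this diff.
var exampleCommandRegex = regexp.MustCompile(`^[a-z][a-z0-9_.-]+[a-z0-9]$`)

// validateCommandFlag checks that the basename of path matches re and that
// the file exists. An empty path means the flag was not set and is accepted.
func validateCommandFlag(flagName, path string, re *regexp.Regexp) error {
	if path == "" {
		return nil
	}
	baseName := filepath.Base(path)
	if !re.MatchString(baseName) {
		return fmt.Errorf("%s: basename (%s) must match the regex %s", flagName, baseName, re.String())
	}
	if _, err := os.Stat(path); err != nil {
		return fmt.Errorf("%s not found: %w", flagName, err)
	}
	return nil
}

func main() {
	// Example values only; in main.go these would come from the parsed flags.
	checks := []struct{ flagName, path string }{
		{"hcloud-image-url-command", ""},
		{"baremetal-image-url-command", "/shared/Bad Name!"},
	}
	for _, c := range checks {
		if err := validateCommandFlag(c.flagName, c.path, exampleCommandRegex); err != nil {
			fmt.Println("invalid flag:", err)
			continue
		}
		fmt.Println("ok:", c.flagName)
	}
}
```

In `main.go` the caller would log the returned error via `setupLog.Error` and call `os.Exit(1)`, as the existing blocks do.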
3 changes: 3 additions & 0 deletions pkg/scope/baremetalhost.go
@@ -48,6 +48,7 @@ type BareMetalHostScopeParams struct {
RescueSSHSecret *corev1.Secret
SecretManager *secretutil.SecretManager
PreProvisionCommand string
ImageURLCommand string
SSHAfterInstallImage bool
}

@@ -101,6 +102,7 @@ func NewBareMetalHostScope(params BareMetalHostScopeParams) (*BareMetalHostScope
cluster: params.Cluster,
hetznerCluster: params.HetznerCluster,
},
ImageURLCommand: params.ImageURLCommand,
}, nil
}

@@ -120,6 +122,7 @@ type BareMetalHostScope struct {
PreProvisionCommand string
SSHAfterInstallImage bool
WorkloadClusterClientFactory WorkloadClusterClientFactory
ImageURLCommand string
}

// Name returns the HetznerCluster name.
15 changes: 8 additions & 7 deletions pkg/scope/cluster.go
@@ -37,13 +37,14 @@ import (

// ClusterScopeParams defines the input parameters used to create a new scope.
type ClusterScopeParams struct {
Client client.Client
APIReader client.Reader
Logger logr.Logger
HetznerSecret *corev1.Secret
HCloudClient hcloudclient.Client
Cluster *clusterv1.Cluster
HetznerCluster *infrav1.HetznerCluster
Client client.Client
APIReader client.Reader
Logger logr.Logger
HetznerSecret *corev1.Secret
HCloudClient hcloudclient.Client
Cluster *clusterv1.Cluster
HetznerCluster *infrav1.HetznerCluster
ImageURLCommand string
}

// NewClusterScope creates a new Scope from the supplied parameters.