[v17] RFD 184: automatic updates, server-side logic (#52275)
* Implement immediate schedule support for automatic updates (#47920)
* Implement immediate schedule support
* expose edition, fips, and ensure ping endpoint answers
* fix after rebase
* fix cache tests
* introduce webclient.ReusableClient (#49296)
* Move autoupdate code in proxy to make more sense (#49484)
* Move autoupdate code in proxy to make more sense
* lint + godoc
* Start `autoupdate_agent_rollout` controller in auth service (#49101)
* run autoupdate_agent_rollout controller
* Recover from panics inside the controller
* Address tim's feedback
Co-authored-by: rosstimothy <[email protected]>
---------
Co-authored-by: rosstimothy <[email protected]>
* kube-agent-updater: add RFD-184 trigger and version getter (#49297)
* add proxy version getter and maintenance trigger
* add failover trigger and versionGetter
* lint
* Apply suggestions from code review
Co-authored-by: Marco Dinis <[email protected]>
* address marco's feedback
* licensing
---------
Co-authored-by: Marco Dinis <[email protected]>
* Rename lib/kubernetestoken to lib/kube/token (#49554)
* Rename lib/kubernetestoken to lib/kube/token
* Lint
* Make the proxy read from autoupdate_agent_rollout (#49380)
* Add autoupdate_agent_rollout support
* fix ping proxy tests
* address creack's feedback
* Address sclevine's feedback
Co-authored-by: Stephen Levine <[email protected]>
* fix panic in tests
---------
Co-authored-by: Stephen Levine <[email protected]>
* Fix flaky TestAutoUpdateAgentShouldUpdate (#49883)
* Fix flaky TestAutoUpdateAgentShouldUpdate
* Update lib/web/apiserver_ping_test.go
* Update lib/web/autoupdate_common_test.go
* autoupdate: reconcile rollout status and add strategy interface (#49735)
* autoupdate: reconcile rollout status and add strategy interface
* fix missing constants + add license
* lint
* fix proto field id
* Fix flaky TestAgentRolloutController (#49886)
* Fix flaky TestAgentRolloutController
* switch to real clock + increase Eventually timeout
* Make reconciliation period a parameter + add TELEPORT_UNSTABLE env var
* Update lib/service/service_test.go
Co-authored-by: Alan Parra <[email protected]>
* Apply suggestions from code review
Co-authored-by: Alan Parra <[email protected]>
* Remove env var
* lint
---------
Co-authored-by: Alan Parra <[email protected]>
* Compute global rollout state (#49945)
* Compute global rollout state
* Simplify + missing wrong proto message description
* lint
* simplify
* for edoardo
* fix compute status test
* autoupdate: implement time-based strategy (#49736)
This commit implements the time-based rollout strategy described in RFD 184. The autoupdate_agent_rollout controller will make the groups active based on their start days, start hour, and maintenance duration. Once the maintenance window is over, the group becomes DONE. In the DONE state, new agents will install the target version but existing agents will no longer be told to actively update.
* Use CMC as default config when set (#50039)
* autoupdate: Use CMC as default config when set
Part of: [RFD-184](#47126)
This commit implements backward compatibility when CMC is specified. After this PR, if the user has no `autoupdate_config` resource but a `cluster_maintenance_config` resource from RFD 109, we will use the CMC to generate the config (update hour and update days) and craft the `autoupdate_agent_rollout`.
* Update lib/autoupdate/rollout/client_test.go
Co-authored-by: Edoardo Spadolini <[email protected]>
* address feedback
* lint
---------
Co-authored-by: Edoardo Spadolini <[email protected]>
* Change autoupdate proto messages (#50234)
* Change autoupdate proto messages
This commit makes 3 changes:
- reflect the maintenance duration on the rollout in a new spec field
- add a rollout start time field in its status
- change wait_days into wait_hours
* int64 -> int32 for consistency with other fields
* Add autoupdate_config and autoupdate_agent_rollout validation (#50181)
This commit removes the restrictions of the autoupdate_agent_rollout and autoupdate_config schedules but adds groups validation. It also adds some optional server-side validation that should not be enforced at the resource level.
* autoupdate: implement halt-on-error strategy (#49737)
* autoupdate: implement halt-on-error strategy
* rewrite wait_days logic into wait_hours
* Apply suggestions from code review
Co-authored-by: Stephen Levine <[email protected]>
---------
Co-authored-by: Stephen Levine <[email protected]>
* add tctl create/get/edit support for autoupdate_agent_rollout (#50393)
* add tctl create/get/edit support for autoupdate_agent_rollout
* fix bad copy paste
* set rollout start date and don't start updating if rollout just changed (#50365)
This commit makes two changes:
- the controller now sets the rollout start time when resetting the rollout
- the controller will not start a group if the rollout changed during the maintenance window (checks if the rollout start time is in the window)
* Reduce clock usage + add time and period override in rollout controller (#50634)
* Enable strategies in the autoupdate rollout controller (#50635)
* autoupdate rollout: honour the maintenance window duration (#50745)
* autoupdate rollout: honour the maintenance window duration
* Update lib/autoupdate/rollout/reconciler.go
Co-authored-by: Bartosz Leper <[email protected]>
* Address feedback
* Update lib/autoupdate/rollout/strategy.go
---------
Co-authored-by: Bartosz Leper <[email protected]>
* Fix proto resource 153 marshalling for autoupdate_* resources (#50688)
* Fix proto resource 153 marshalling
* Update tool/tctl/common/collection_test.go
Co-authored-by: Alan Parra <[email protected]>
* Update tool/tctl/common/collection_test.go
Co-authored-by: Alan Parra <[email protected]>
* Address feedback
- Change from Resource153AdapterV2 to ProtoResource153Adapter
- fix test failures and unmarshal proto resources properly
- add a failing round-trip proto 153 test case
- bonus: fix the table test resource create that did not support running a single row
* Apply suggestions from code review
Co-authored-by: Alan Parra <[email protected]>
* lint
---------
Co-authored-by: Alan Parra <[email protected]>
* Add autoupdate controller metrics (#50807)
* Add autoupdate controller metrics
* Do not panic in case of error conflict
* kube-agent-update: Use the RFD-184 webapi proxy update protocol by default when possible (#50464)
* kube-agent-update: Use the RFD-184 webapi proxy update protocol by default when possible
* Update integrations/kube-agent-updater/cmd/teleport-kube-agent-updater/main.go
Co-authored-by: Tiago Silva <[email protected]>
* log update group
---------
Co-authored-by: Tiago Silva <[email protected]>
* Add 'tctl autoupdate agents status' (#51079)
* Ensure proxy version getter adds the leading 'v' (#51687)
* Always create debug socket and expose health endpoints (#51616)
* Always create debug socket and expose health endpoints
* Consolidate the diagnostic multiplexers in a single function
* Fix tests
* Apply suggestions from code review
Co-authored-by: Edoardo Spadolini <[email protected]>
---------
Co-authored-by: Edoardo Spadolini <[email protected]>
* Fix autoupdate rollout controller metrics (#51803)
* kube-agent-updater pre-release builds trust the staging repo + insecure validator private repo fix (#51815)
* Fix insecure resolver in private repos + trust pre-release builds
* fixup! Fix insecure resolver in private repos + trust pre-release builds
* Use new autoupdate APIs in discovery service (#51758)
* Remove name parameter from proxy version getter
* Use autoupdate_agent_rollout as a source of version in scripts and integrations
* Fix tests
* Handle gracefully absence of a proxy in kube discovery service
* Update lib/srv/discovery/kube_integration_watcher.go
Co-authored-by: Tiago Silva <[email protected]>
* Address marco's feedback
* Address marco's feedback pt.2
* Gracefully handle if we can't get autoupdate version
* fixup! Update lib/srv/discovery/kube_integration_watcher.go
---------
Co-authored-by: Tiago Silva <[email protected]>
* Autoupdate changelog entry in v17.3
* Fix tests after rebase, pt.1
* Update front preset fixtures since the preset role changed
* Add install script using teleport-update and oneoff.sh (#52155)
* Refactor node-join script to take safer options and reuse install option logic (#52196)
* Add install script using teleport-update and oneoff.sh
* Refactor node-join script to take safer options and reuse install option logic
* GoDoc + make functions private
* Address edoardo's feedback
* Allow prerelease Teleport to install official artifacts (#52444)
* Accept to install CE when running an AGPL build for backward compat
* Bump e to fix build (oneoff args change)
* Make node install scripts install Teleport via teleport-update (#52226)
* Make the node install script use teleport-update
* Apply suggestions from code review
Co-authored-by: Edoardo Spadolini <[email protected]>
* Fix curl args + address bash exec comments
---------
Co-authored-by: Edoardo Spadolini <[email protected]>
* Use install.sh in discovery's default installer (#52368)
* Use install.sh in discovery's default installer
* fixup! Use install.sh in discovery's default installer
* Address marco's feedback
* Update lib/auth/grpcserver.go
Co-authored-by: Marco Dinis <[email protected]>
* Update lib/srv/server/installer/defaultinstallers.go
* apply edoardo's feedback + write script to file
* Execute the downloaded shell script
* Add snapshot tests
* fixup! Add snapshot tests
---------
Co-authored-by: Marco Dinis <[email protected]>
* Fix error after rebase
* Fix test after rebase
---------
Co-authored-by: rosstimothy <[email protected]>
Co-authored-by: Marco Dinis <[email protected]>
Co-authored-by: Stephen Levine <[email protected]>
Co-authored-by: Alan Parra <[email protected]>
Co-authored-by: Edoardo Spadolini <[email protected]>
Co-authored-by: Bartosz Leper <[email protected]>
Co-authored-by: Tiago Silva <[email protected]>