-
Notifications
You must be signed in to change notification settings - Fork 8
Introduce Leader Election and use Deployments for agents #64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This pull request is now in conflict. Could you fix it @davidepasquero? 🙏 |
Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
@starbops did you manage to look at this PR? Should I explain better or group the commits together? |
@davidepasquero Thanks for the contribution. Please also create a tracking issue on harvester/harvester and link it to this PR. |
@harvester/network let's prioritize this review when starting working on 1.7. cc @harvester/pm @harvester/maintainer |
Hi @davidepasquero, Could you create the HEP on the harvester/harvester and reference it on your original issue harvester/harvester#8794? We should review the HEP first, then we could follow the HEP to check the PR. |
|
This change introduces leader election to the VM DHCP agent, mirroring the mechanism used in the controller. This ensures that only one instance of the agent is active at a time, preventing potential conflicts and ensuring stable operation in a clustered environment. The implementation utilizes the `github.com/rancher/wrangler/pkg/leader` package. A `noLeaderElection` flag has been added to the agent's command-line options to allow bypassing leader election for development or testing purposes. Signed-off-by: davidepasquero <[email protected]>
Refactors the Helm chart to deploy the VM DHCP agent as a separate Kubernetes Deployment, rather than being managed directly by the controller. This change aligns with standard Kubernetes practices and improves resilience and manageability. Key changes include: - Added a new Deployment manifest for the agent (`agent-deployment.yaml`). - Introduced dedicated RBAC resources (ServiceAccount, ClusterRole, ClusterRoleBinding) for the agent to enable leader election and dynamic IPPool discovery. - Updated `values.yaml` to provide configuration options for the new agent deployment and its associated resources. - Modified the controller's Deployment manifest to remove the direct management and launching of the agent. This chart refactoring is a prerequisite for upcoming Go code changes in the agent to implement IPPool watching and in the controller to remove agent process management. The agent will now rely on its own leader election mechanism, implemented in a previous change. Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
This commit removes the controller's responsibility for creating, managing, and monitoring dedicated agent pods for IPPools. This functionality is being shifted to a separate agent deployment. Changes include: - Removed DeployAgent and MonitorAgent logic from IPPool controller. - Deleted status.agentPodRef and AgentReady condition from IPPool types. - Removed CLI flags and configuration options related to agent images, namespaces, service accounts, and the no-dhcp flag for agents. - Cleaned up associated helper functions, structs, and unused code. Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
The agent was previously hardcoded to use the 'kube-system' namespace for its leader election lock. This caused permission errors as the agent's ServiceAccount was not authorized for that namespace. This commit changes the agent to use its own namespace (obtained via the Downward API through the POD_NAMESPACE environment variable) for leader election. The Helm chart has been updated to provide this environment variable to the agent deployment. Signed-off-by: davidepasquero <[email protected]>
This commit updates the labels in `agent-clusterrole.yaml` to use the common labels helper and override the component, making it consistent with the approach used in `agent-clusterrolebinding.yaml`. This is a speculative change unlikely to fix a persistent 'forbidden' error if RBAC rules were otherwise correctly defined and applied, but it improves chart consistency. Signed-off-by: davidepasquero <[email protected]>
The webhook was failing TLS verification because its certificate was not valid for the FQDN the API server was using (e.g., `webhook-name.namespace.svc`). This was due to the underlying `harvester/webhook` library not receiving the correct namespace during certificate generation, leading to a malformed SAN (e.g., `webhook-name..svc`). This commit adds the `--namespace {{ .Release.Namespace }}` argument to the webhook container startup, ensuring the certificate is generated with the correct SANs. Note: A potential issue was observed where the webhook deployment specifies a ServiceAccount name (`{{fullname}}-webhook-webhook`) that might not be explicitly created by the chart. This may need further investigation if webhook pods encounter permission issues or fail to start due to SA problems, though it's unrelated to the certificate SAN issue. Signed-off-by: davidepasquero <[email protected]>
This commit changes the metadata.name for the agent's ClusterRole and ClusterRoleBinding to use a simpler naming convention based directly on `{{ .Release.Name }}`. This is an attempt to eliminate potential issues from complex name templating for these RBAC resources, in response to persistent 'configmaps forbidden' errors for the agent. The ServiceAccount name itself remains unchanged as it correctly matches the user identified in the error logs. Signed-off-by: davidepasquero <[email protected]>
This commit refactors the system to allow the main vm-dhcp-controller to dynamically update the vm-dhcp-agent's Deployment based on IPPool resources. When an IPPool is detected, the controller will now: 1. Parse `IPPool.spec.networkName` to determine the target NetworkAttachmentDefinition (NAD). 2. Update the agent Deployment's pod template to include the `k8s.v1.cni.cncf.io/networks` annotation pointing to this NAD. 3. Update the `--nic` argument of the agent container to use the interface name associated with the Multus attachment (e.g., `net1`). Changes include: - Added RBAC permissions for the controller to get, list, watch, and patch Deployments in its namespace. - Modified `pkg/controller/ippool/controller.go` to implement the logic for fetching and updating the agent Deployment. - Updated `pkg/config/context.go` and `cmd/controller/run.go` to provide necessary KubeClient and namespace information to the controller logic. - Reverted previous static Multus configuration from `values.yaml` and the agent's Helm template, as this is now handled dynamically. The controller relies on an `AGENT_DEPLOYMENT_NAME` environment variable (to be set in its own deployment manifest) to accurately locate the agent deployment for updates. Signed-off-by: davidepasquero <[email protected]>
This commit addresses several Go build errors in `pkg/controller/ippool/controller.go`: - Removed unused imports for `k8s.io/api/apps/v1` (appsv1) and `encoding/json`. - Added the missing import for the `os` package. - Modified the `getAgentDeploymentName` function to not accept arguments, as its implementation uses `os.Getenv` and did not use the parameter. - Updated the call to `getAgentDeploymentName` accordingly. Signed-off-by: davidepasquero <[email protected]>
…ller This commit addresses several build errors in `pkg/agent/ippool/controller.go` and corrects logic within its `Update` method: - Added missing imports for `sync` and `github.com/harvester/vm-dhcp-controller/pkg/util`. - Removed an unused variable `currentLeasesInAllocator`. - Corrected the logic for checking lease existence using `c.dhcpAllocator.GetLease()`. - Ensured correct handling of `LeaseTime` type conversion for `dhcpAllocator.AddLease()`. - Resolved method redeclaration error by deleting the duplicate file `pkg/agent/ippool/ippool.go`. These changes aim to fix the build and improve the correctness of the DHCP lease cache synchronization logic within the agent. Signed-off-by: davidepasquero <[email protected]>
This commit addresses multiple build errors in the agent's local ippool controller (`pkg/agent/ippool/controller.go`) and refines its lease synchronization logic: - Corrected `LeaseTime` type handling when calling `dhcpAllocator.AddLease`, passing `specConf.LeaseTime` directly as it's assumed to be `*int` based on compiler errors. - Added missing `sync` and `util` package imports. - Removed unused variables. - Corrected the logic for checking lease existence before deletion. - Deleted the conflicting `pkg/agent/ippool/ippool.go` file that caused a method redeclaration error. These changes are intended to fix the build and ensure the agent's DHCP lease cache is correctly populated from the IPPool status, particularly after failovers or restarts. Signed-off-by: davidepasquero <[email protected]>
Modifies the IPPool controller and DHCP agent to support configuring the agent for multiple active IPPools. Controller changes: - `reconcileAgentDeployment` now gathers all active IPPools. - It generates a consolidated Multus annotation for all required NADs. - It passes a JSON array of `AgentNetConfig` (interface name, server IP, CIDR, IPPool reference, NAD name) to the agent via the `AGENT_NETWORK_CONFIGS` environment variable. - It passes a JSON array of IPPool references (`namespace/name`) via the `IPPOOL_REFS_JSON` environment variable. - Old single-pool arguments and env vars are removed from the agent deployment spec. - `OnChange` and `OnRemove` handlers trigger this global reconciliation. Agent changes: - AgentOptions now expect `AGENT_NETWORK_CONFIGS` and `IPPOOL_REFS_JSON` env vars. - CLI flags for single IPPool config (`--nic`, `--server-ip`, etc.) removed. - Agent parses `AGENT_NETWORK_CONFIGS` into a list of `AgentNetConfig`. - `configureInterfaces` method now iterates this list to set IPs on all specified interfaces. - `DHCPAllocator.Run` is now called with the list of `AgentNetConfig`. (Note: `DHCPAllocator` internals require further updates to fully support multi-interface DHCP serving). - Agent-side `ippool.EventHandler` is no longer actively used, as configuration is now primarily controller-driven at pod startup. Helm template changes: - Agent deployment template updated to remove old CLI args and `IPPOOL_REF` env var. - Added `AGENT_NETWORK_CONFIGS` and `IPPOOL_REFS_JSON` env vars with default empty JSON array values. This change allows a single agent deployment to serve DHCP for multiple IPPools, each on a dedicated, Multus-managed network interface. Signed-off-by: davidepasquero <[email protected]>
Removes the import of "k8s.io/apimachinery/pkg/types" from pkg/config/context.go as it was no longer used after refactoring agent options. This fixes the "imported and not used" build error. This commit is part of the larger feature to configure the agent for multiple IPPools. Signed-off-by: davidepasquero <[email protected]>
- Removes unused duplicate import alias for "k8s.io/apimachinery/pkg/api/errors". - Corrects a type mismatch error in `reconcileAgentDeployment` by removing an invalid `ipPool.Spec.IPv4Config == nil` comparison. The existing checks for essential fields within `IPv4Config` (ServerIP, CIDR) correctly validate the configuration's usability. These changes address build failures reported by the make process. Signed-off-by: davidepasquero <[email protected]>
- Replaces `k8serrors.IsNotFound` with `errors.IsNotFound` to align with the direct import of "k8s.io/apimachinery/pkg/api/errors". This fixes the "undefined: k8serrors" build error. - Verifies that `errors.IsNotFound` is used correctly for checking if Deployments or NetworkAttachmentDefinitions are not found. This commit addresses build failures related to error handling an imports in the IPPool controller. Signed-off-by: davidepasquero <[email protected]>
Temporarily modifies the call to `DHCPAllocator.Run` and `DryRun` in `pkg/agent/agent.go` to pass a single interface name (the first configured one) instead of the `[]AgentNetConfig` slice. This resolves the compilation error "cannot use a.netConfigs (variable of type []AgentNetConfig) as string value in argument". This is a temporary measure to allow the agent to compile. The `DHCPAllocator` component still requires substantial refactoring to correctly process the `[]AgentNetConfig` and provide DHCP services on multiple interfaces. This underlying work is a pending TODO. This commit allows the current set of changes for multi-IPPool agent configuration to build successfully. Signed-off-by: davidepasquero <[email protected]>
Removes the flag definitions for --ippool-ref, --nic, --server-ip, and --cidr from cmd/agent/root.go. These flags were made obsolete by the introduction of controller-driven configuration via environment variables (AGENT_NETWORK_CONFIGS, IPPOOL_REFS_JSON). This resolves the "undefined: ippoolRef" (and similar) build errors caused by referencing global variables that were removed along with the flags they supported. Signed-off-by: davidepasquero <[email protected]>
Removes the goroutine in the IPPool controller's Register function that called `reconcileAgentDeployment` directly on startup. This goroutine could race with cache initialization, leading to a nil pointer dereference when accessing `ippoolCache`. The initial agent deployment reconciliation will now occur organically as existing IPPool resources trigger OnChange events upon controller startup and cache synchronization. This ensures `reconcileAgentDeployment` is always called with a properly synced cache. Signed-off-by: davidepasquero <[email protected]>
Removes the goroutine in the IPPool controller's Register function that called `reconcileAgentDeployment` directly on startup. This goroutine could race with cache initialization, leading to a nil pointer dereference when accessing `ippoolCache`. The initial agent deployment reconciliation will now occur organically as existing IPPool resources trigger OnChange events upon controller startup and cache synchronization. This ensures `reconcileAgentDeployment` is always called with a properly synced cache. Signed-off-by: davidepasquero <[email protected]>
Prevent a nil pointer dereference in reconcileAgentDeployment by waiting for the IPPoolCache to sync before attempting to list IPPools. This is achieved by: 1. Adding k8s.io/client-go/tools/cache.WaitForCacheSync at the beginning of the reconcileAgentDeployment function. 2. Passing the correct parent context to reconcileAgentDeployment from its callers in wrappedOnChange and wrappedOnRemove handlers to ensure WaitForCacheSync behaves correctly with context cancellation. Signed-off-by: davidepasquero <[email protected]>
Corrects the compilation error by changing the cache sync check in reconcileAgentDeployment from h.ippoolController.HasSynced() to h.ippoolCache.HasSynced(). The IPPoolCache interface (generic.CacheInterface) provides the HasSynced method, ensuring the check is valid and correctly waits for the cache to be ready before proceeding with listing IPPools. Signed-off-by: davidepasquero <[email protected]>
Corrects the cache readiness check in reconcileAgentDeployment to use h.ippoolController.Informer().HasSynced(). This is the standard method to verify if the controller's underlying informer has completed its initial synchronization. This change aims to resolve previous compilation errors and the original nil pointer dereference by ensuring operations on the cache only occur after it's confirmed to be synced. Signed-off-by: davidepasquero <[email protected]>
Adds extensive logging within the reconcileAgentDeployment function to track context status, HasSynced results, and WaitForCacheSync behavior. This is intended to help diagnose a persistent runtime panic occurring during ippoolCache.List, even after attempts to ensure cache synchronization. The logs will provide more insight into the state of the cache and context immediately before the failing operation. Signed-off-by: davidepasquero <[email protected]>
This change modifies the DHCP agent and allocator to support running concurrent DHCP server instances, each corresponding to a different network configuration (IPPool) and listening on a dedicated network interface within the agent pod. Key changes: - DHCPAllocator in `pkg/dhcp/dhcp.go` now manages multiple server instances. Its `Run` and `DryRun` methods accept a list of network configurations. Lease management is now per IPPool. - The Agent in `pkg/agent/agent.go` passes all its configured network settings to the DHCPAllocator. - Cleanup logic has been updated to stop all managed DHCP servers. - The ippool controller's method of configuring the agent via AGENT_NETWORK_CONFIGS environment variable and Multus annotations is compatible with these enhancements. Signed-off-by: davidepasquero <[email protected]>
Moved misplaced import statements from AgentNetConfig struct definition to the main import block. This resolves compilation errors. This fix is applied on top of the previous changes for multi-interface DHCP agent functionality. Signed-off-by: davidepasquero <[email protected]>
The `reconcileAgentDeployment` function could panic due to a nil pointer dereference when calling `h.ippoolCache.List(...)`. This occurred because `h.ippoolCache` was initialized early during controller registration, at a point where its internal indexer might not yet have been fully initialized by the informer, even if the informer was later reported as synced. This commit modifies `reconcileAgentDeployment` to fetch a fresh IPPoolCache instance from `h.ippoolController.Cache()` immediately after the informer synchronization checks. This ensures that the cache instance used for listing IPPools is based on the controller's current state and has a properly initialized indexer, thus preventing the panic. Signed-off-by: davidepasquero <[email protected]>
Previous attempts to fix a nil pointer dereference in `reconcileAgentDeployment` by refreshing the IPPoolCache instance were not successful. The panic persisted, indicating that the cache's internal indexer was still nil even after informer sync checks. This commit revises the approach to list IPPools. Instead of relying on `IPPoolController.Cache().List()`, the `reconcileAgentDeployment` function now directly accesses the controller's informer store via `h.ippoolController.Informer().GetStore().List()`. The resulting list of objects is then iterated, with type assertions to `*networkv1.IPPool`. This more direct method bypasses the potentially problematic caching layer and relies on the informer's store, which should be consistent once the informer has synced. This is expected to resolve the nil pointer dereference when listing IPPools. Signed-off-by: davidepasquero <[email protected]>
Modified `DHCPAllocator.Run` in `pkg/dhcp/dhcp.go` to block and wait for context cancellation. Previously, it returned immediately after launching server goroutines, causing the agent's error group context to cancel too early, leading to the DHCP servers being stopped by the cleanup routine. With this change, `DHCPAllocator.Run` remains active as long as its input context is active, ensuring DHCP servers continue to run until the agent is intentionally terminated or the controlling context is cancelled. Signed-off-by: davidepasquero <[email protected]>
This commit addresses an issue where the DHCP agent would report "NO LEASE FOUND" for MAC addresses that had leases allocated in the IPPool CRD. The main changes include: 1. Re-enabled and adapted the ippool.EventHandler and ippool.Controller to monitor IPPool objects for changes. 2. The Agent now creates an EventHandler instance for each unique IPPoolRef it is configured to manage. 3. Corrected the API calls in ippool.Controller to DHCPAllocator's AddLease and DeleteLease methods to include the required `ipPoolRef` string. 4. The Agent now waits for all configured EventHandlers to complete their initial synchronization (signaled by `InitialSyncDone`) before fully starting the DHCP server logic. This ensures pre-existing leases are loaded. 5. Each EventHandler now manages its own local lease cache (`poolCache`). These changes ensure that the DHCPAllocator in the agent has an accurate and up-to-date view of the leases as defined in the IPPool status, both upon startup and during dynamic updates to the IPPool objects. Signed-off-by: davidepasquero <[email protected]>
Added missing import statements for ippool, cache, and types packages to resolve compilation errors. Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
Adds detailed logging for incoming and outgoing DHCP packets (Discover, Offer, Request, Ack) to help diagnose IP assignment issues. - Logs the full packet content at INFO level. - Includes IPPool reference in log messages for better context. Signed-off-by: davidepasquero <[email protected]>
…ange Previously, my IPPool controller would only signal that its initial synchronization was complete upon processing an UPDATE event for its target IPPool. This caused me to hang if the IPPool existed at startup (generating an ADD event) but wasn't immediately updated. This commit modifies my IPPool controller to also signal initial sync completion when processing an ADD event or when a DELETE event confirms the target IPPool is no longer present. Additionally, my IPPool event listener's informer is now scoped to watch only the specific IPPool it is responsible for, rather than all IPPools in the namespace. Event handlers in the listener are also updated to correctly queue ADD and DELETE events. Signed-off-by: davidepasquero <[email protected]>
The leader election callback in the agent was previously panicking if the agent encountered an error, including context.Canceled which is expected during graceful shutdown. This commit modifies the callback to check if the error is context.Canceled. If so, it logs an informational message and exits gracefully. Other unexpected errors will still cause a panic. Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]> Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]>
Signed-off-by: davidepasquero <[email protected]> Signed-off-by: davidepasquero <[email protected]>
…its-and-list-modified-files-j2mb8k docs: remove architecture image
a8d3a17
to
abfedff
Compare
Approval Proposal: Agent Refactoring for Scalability and Stability
Hi @starbops,
This set of changes introduces a fundamental architectural evolution for the vm-dhcp-controller, transforming it into a more robust, scalable, and resilient solution. The main update shifts the agent management model from one pod per IPPool to a unified Deployment, while also introducing a leader election mechanism to ensure high availability.
1. Unified Deployment Architecture for Maximum Scalability
The most significant change is the move away from the "one pod per pool" model. Now, a single Deployment manages all the agent pods required for the active networks.
IPPools
and injects the aggregated configuration into the agent Deployment via environment variables.(Key References:
pkg/controller/ippool/controller.go
,chart/templates/agent-deployment.yaml
,pkg/agent/agent.go
)2. High Availability and Stability
We have introduced a leader election mechanism for the agent pods and fixed critical bugs to improve system reliability.
Leader Election for Agents:
Critical Stability Fixes:
nil pointer panic
that occurred at controller startup due to a race condition has been corrected. Reconciliation now waits for cache synchronization to complete.(Key References:
cmd/agent/run.go
,pkg/agent/ippool/event.go
,pkg/agent/ippool/controller.go
)3. Improved Observability
To speed up debugging and problem diagnosis, logging has been enhanced.
(Key Reference:
pkg/dhcp/dhcp.go
)Conclusion
These changes represent a crucial step forward for the
vm-dhcp-controller
, turning it into a mature, scalable, and resilient solution ready for adoption in complex production environments. I ask for your approval to merge these important improvements. Thank you.