Conversation

@harshanarayana
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

Enables the k3d provider with the ability to export cluster logs. The changes required to enable this were added to k3d in k3d-io/k3d#1471.

With those changes merged, we can now add support for exporting the logs.

Which issue(s) this PR fixes:

NA

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added the ability to export cluster logs for the `k3d`-based cluster provider.

Additional documentation e.g., Usage docs, etc.:
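For illustration, a minimal usage sketch (not part of this PR): it assumes the k3d support package exposes a kind-style constructor and the standard provider lifecycle methods, so treat `k3d.NewCluster` and the exact signatures as placeholders.

package main

import (
	"context"
	"log"

	"sigs.k8s.io/e2e-framework/support/k3d"
)

func main() {
	ctx := context.Background()

	// Assumed constructor, mirroring kind.NewCluster from the kind provider.
	cluster := k3d.NewCluster("log-export-demo")
	if _, err := cluster.Create(ctx); err != nil {
		log.Fatal(err)
	}
	defer cluster.Destroy(ctx)

	// With this change, ExportLogs delegates to the k3d log export added in
	// k3d-io/k3d#1471 instead of only printing a warning.
	if err := cluster.ExportLogs(ctx, "/tmp/k3d-logs"); err != nil {
		log.Fatal(err)
	}
}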


@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Jan 17, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harshanarayana

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 17, 2025
@harshanarayana harshanarayana force-pushed the feature/enable/log-export-for-k3d-provider branch from 71cdeda to edabcee Compare January 17, 2025 10:47
@harshanarayana
Contributor Author

/assign @cpanato

PTAL when you can. @vladimirvivien @cpanato

Contributor Author

@harshanarayana harshanarayana left a comment


/hold until I finish validating this and add an example of how to use it

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 17, 2025
Member

@cpanato cpanato left a comment


thanks

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 17, 2025
}
return nil
} else {
log.Warning("ExportLogs not implemented for k3d. Please use regular kubectl like commands to extract the logs from the cluster")
Contributor


Do we know what that kubectl command would be to extract logs? Could we just use it as a fallback instead of just reporting a warning?

Contributor Author


@vladimirvivien Actually, that "kubectl like" wording is probably a mistake. Just kubectl won't be enough; we will also have to gather a few more logs. The problem is that we have to find a way to group them in a proper format, which can be tricky (and replicating that logic would be duplicate work).

If we want to add an export mechanism, we might be better off writing something that can do this for any provider that doesn't have log export support.

vcluster doesn't support it, and neither does kwok.

For kwok there is an ad hoc kubectl logs-style export, but it covers only kwok's own components.

How do we go about doing this?

Contributor Author


We also have to worry about fetching the journalctl logs, the dmesg output, and any other applicable logs. It might not be a bad idea to implement something generic that can be reused by any future provider that lacks log export capabilities; see the sketch below.
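For illustration only, a rough sketch of that kind of generic, node-level collection for a container-based provider like k3d, where each node is a Docker container (k3d names them k3d-<cluster>-server-0, k3d-<cluster>-agent-0, and so on). Whether journalctl or dmesg actually exist inside a given node image is an assumption that would have to be verified per provider.

package utils

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// CollectNodeDiagnostics runs a small set of diagnostic commands inside a node
// container via `docker exec` and writes their output under destDir. A command
// that is missing in the node image produces an error note instead of aborting
// the whole export.
func CollectNodeDiagnostics(nodeContainer, destDir string) error {
	if err := os.MkdirAll(destDir, 0o755); err != nil {
		return err
	}
	cmds := map[string][]string{
		"journal.log": {"journalctl", "--no-pager"},
		"dmesg.log":   {"dmesg"},
	}
	for file, args := range cmds {
		out, err := exec.Command("docker", append([]string{"exec", nodeContainer}, args...)...).CombinedOutput()
		if err != nil {
			out = append(out, []byte(fmt.Sprintf("\ncommand failed: %v\n", err))...)
		}
		if werr := os.WriteFile(filepath.Join(destDir, file), out, 0o644); werr != nil {
			return werr
		}
	}
	return nil
}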

Contributor Author


package utils

import (
	"bytes"
	"context"
	"fmt"
	"os"
	"path/filepath"

	v1 "k8s.io/api/core/v1"
	log "k8s.io/klog/v2"
	"sigs.k8s.io/e2e-framework/klient/k8s/resources"
	"sigs.k8s.io/e2e-framework/pkg/types"
)

// LogCollector gathers pod and container logs from a running cluster and writes
// them under a per-cluster debug directory.
type LogCollector struct {
	resourceFetcher *resources.Resources
	baseDir         string
}

// NewLogCollector creates the destination directory and a resource client for the
// given provider's cluster.
func NewLogCollector(provider types.E2EClusterProvider, clusterName, destination string) (*LogCollector, error) {
	baseDir := filepath.Join(destination, fmt.Sprintf("debug-logs-%s", clusterName))
	if err := os.MkdirAll(baseDir, os.FileMode(0755)); err != nil {
		log.ErrorS(err, "failed to create base dir required to collect the logs", "dir", destination)
		return nil, err
	}

	resourceFetcher, err := resources.New(provider.KubernetesRestConfig())
	if err != nil {
		log.ErrorS(err, "failed to create resource fetcher")
		return nil, err
	}

	return &LogCollector{
		resourceFetcher: resourceFetcher,
		baseDir:         baseDir,
	}, nil
}

// CollectLogs walks every namespace in the cluster and dumps the logs of all pods.
func (lc *LogCollector) CollectLogs(ctx context.Context) error {
	var namespaces v1.NamespaceList
	if err := lc.resourceFetcher.List(ctx, &namespaces); err != nil {
		log.ErrorS(err, "failed to list namespaces in the cluster")
		return err
	}

	for _, ns := range namespaces.Items {
		if err := lc.collectNamespaceLogs(ctx, ns.Name); err != nil {
			return err
		}
	}
	return nil
}

func (lc *LogCollector) collectNamespaceLogs(ctx context.Context, namespace string) error {
	log.V(3).InfoS("Collecting POD information for namespace", "namespace", namespace)
	var pods v1.PodList
	if err := lc.resourceFetcher.WithNamespace(namespace).List(ctx, &pods); err != nil {
		log.ErrorS(err, "failed to list pods in the namespace", "namespace", namespace)
		return err
	}

	for _, pod := range pods.Items {
		if err := lc.collectPodLogs(ctx, namespace, pod); err != nil {
			return err
		}
	}
	return nil
}

func (lc *LogCollector) collectPodLogs(ctx context.Context, namespace string, pod v1.Pod) error {
	uid := string(pod.GetUID())
	// Static/mirror pods carry their config hash in this annotation; prefer it so the
	// directory name lines up with the kubelet's /var/log/pods layout.
	if hash, ok := pod.GetAnnotations()["kubernetes.io/config.hash"]; ok {
		uid = hash
	}
	podBaseDir := filepath.Join(lc.baseDir, fmt.Sprintf("%s_%s_%s", namespace, pod.Name, uid))
	if err := os.MkdirAll(podBaseDir, os.FileMode(0755)); err != nil {
		return err
	}

	containers := append(pod.Spec.Containers, pod.Spec.InitContainers...)
	containerStatus := append(pod.Status.ContainerStatuses, pod.Status.InitContainerStatuses...)

	for _, container := range containers {
		if err := lc.collectContainerLogs(ctx, namespace, pod.Name, container, containerStatus, podBaseDir); err != nil {
			return err
		}
	}
	return nil
}

func (lc *LogCollector) collectContainerLogs(ctx context.Context, namespace, podName string, container v1.Container, containerStatus []v1.ContainerStatus, podBaseDir string) error {
	containerBaseDir := filepath.Join(podBaseDir, container.Name)
	if err := os.MkdirAll(containerBaseDir, os.FileMode(0755)); err != nil {
		return err
	}
	log.V(3).InfoS("Collecting logs for pod", "namespace", namespace, "pod", podName, "container", container.Name)

	var podLog bytes.Buffer
	if err := lc.resourceFetcher.GetPodLog(ctx, namespace, podName, container.Name, &podLog); err != nil {
		return err
	}

	restartCount := 0
	for _, cs := range containerStatus {
		if cs.Name == container.Name {
			restartCount = int(cs.RestartCount)
			break
		}
	}

	if err := os.WriteFile(filepath.Join(containerBaseDir, fmt.Sprintf("%d.log", restartCount)), podLog.Bytes(), os.FileMode(0644)); err != nil {
		return err
	}
	return nil
}
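As a usage sketch (a hypothetical wrapper, not part of the proposal above), a provider without native log export could fall back to it like this:

// Hypothetical fallback for a provider's ExportLogs implementation; clusterName
// and dest come from the surrounding method.
func exportLogsFallback(ctx context.Context, provider types.E2EClusterProvider, clusterName, dest string) error {
	collector, err := NewLogCollector(provider, clusterName, dest)
	if err != nil {
		return err
	}
	return collector.CollectLogs(ctx)
}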

@cpanato @vladimirvivien Would something like this help? We can also use the NodeLogQuery feature for fetching most of the node logs that we might want to collect.
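On the NodeLogQuery idea, a rough sketch of pulling service logs through the kubelet's node-proxy logs endpoint (/api/v1/nodes/<node>/proxy/logs/?query=<service>). It assumes the NodeLogQuery feature gate is enabled on the nodes; the helper itself is hypothetical and not part of this PR.

package utils

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// CollectNodeServiceLogs fetches the logs of a node service (for example
// "kubelet") via the NodeLogQuery endpoint and writes them to destDir.
func CollectNodeServiceLogs(ctx context.Context, cfg *rest.Config, nodeName, service, destDir string) error {
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return err
	}
	// Equivalent to: kubectl get --raw "/api/v1/nodes/<node>/proxy/logs/?query=<service>"
	data, err := clientset.CoreV1().RESTClient().
		Get().
		Resource("nodes").
		Name(nodeName).
		SubResource("proxy").
		Suffix("logs").
		Param("query", service).
		DoRaw(ctx)
	if err != nil {
		return err
	}
	if err := os.MkdirAll(destDir, 0o755); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(destDir, fmt.Sprintf("%s-%s.log", nodeName, service)), data, 0o644)
}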

Contributor Author

@harshanarayana harshanarayana May 10, 2025


@vladimirvivien PTAL, I would love to get this one merged.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 20, 2025
@ShwethaKumbla
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 21, 2025
@vladimirvivien
Contributor

@harshanarayana are you still working on this?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 8, 2025
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 7, 2025
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 7, 2025
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closed this PR.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@harshanarayana
Contributor Author

/reopen
/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 23, 2025