Frapschen
Contributor

What type of PR is this?
/kind feature

What this PR does / why we need it:
Set up tracing instrumentation and update the manifest of the inferencepool chart.

This is the initial PR for tracing, which simply sets up a global TracerProvider via an init function. Subsequent tracing-related PRs can then focus solely on adding spans:

ctx, span := tracer.Start(r.Context(), "hello-span")
defer span.End()

// do some work to track with hello-span

Which issue(s) this PR fixes:
issue: #1520

Does this PR introduce a user-facing change?:

Set up tracing instrumentation and update the manifest of the inferencepool chart.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 23, 2025

netlify bot commented Sep 23, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 81e1af7
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68db5e29934b000008bf6f5b
😎 Deploy Preview https://deploy-preview-1638--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 23, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 23, 2025
@k8s-ci-robot
Contributor

Hi @Frapschen. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 23, 2025
Contributor

@JeffLuoo JeffLuoo left a comment

Just a nit: there is a typo in the logs, where MeterProvider should be TracerProvider. Otherwise LGTM. I'll let @liu-cong, @nirrozenbaum, and @ahg-g take another look.

@ahg-g
Contributor

ahg-g commented Sep 25, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 25, 2025
@Frapschen
Contributor Author

The error is:

go: downloading github.com/x448/float16 v0.8.4
go: downloading golang.org/x/text v0.17.0
# golang.org/x/tools/internal/tokeninternal
../../../pkg/mod/golang.org/x/[email protected]/internal/tokeninternal/tokeninternal.go:64:9: invalid array length -delta * delta (constant -256 of type int64)
make: *** [Makefile:400: /home/prow/go/src/sigs.k8s.io/gateway-api-inference-extension/bin/controller-gen] Error 1
+ EXIT_VALUE=2
+ set +o xtrace
Cleaning up after docker in docker.
================================================================================
Waiting 30 seconds for pods stopped with terminationGracePeriod:30

Running make test works fine on my Mac.

@JeffLuoo
Contributor

/retest

Contributor

@liu-cong liu-cong left a comment

@JeffLuoo If telemetry.go looks good to you, can you LGTM it? I can LGTM the flag and Helm changes.

enabled: false
trace:
  enabled: false
  otelExporterEndpoint: "http://localhost:4317"
Contributor

Do we think it's "rare" for users to configure the endpoint and sampling params? If so, I suggest not including them here and letting those rare, advanced users configure them via the env var section directly.

Contributor Author

@Frapschen Frapschen Sep 30, 2025

@liu-cong Most of the time, users should set a real OTel collector address, so I think it's a very useful setting.

And for sampling, it is also a common setting for tracing configuration.

Contributor

+1 that the endpoint and the sampling rate are commonly configured fields.
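
For context, the two settings discussed here correspond to the standard OpenTelemetry environment variables OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_TRACES_SAMPLER, and OTEL_TRACES_SAMPLER_ARG. Below is a minimal sketch of resolving them with defaults. The traceConfig type, the resolveTraceConfig helper, and the sampler defaults are hypothetical and are not taken from this PR; only the endpoint default comes from the chart values quoted above.

```go
package main

import (
	"fmt"
	"os"
)

// traceConfig is a hypothetical holder for the settings discussed in this
// thread; the PR itself may structure this differently.
type traceConfig struct {
	Endpoint   string
	Sampler    string
	SamplerArg string
}

// resolveTraceConfig reads the standard OpenTelemetry environment variables
// through lookup and keeps the default whenever a variable is unset or empty.
func resolveTraceConfig(lookup func(string) (string, bool)) traceConfig {
	cfg := traceConfig{
		Endpoint:   "http://localhost:4317",    // default from the chart values above
		Sampler:    "parentbased_traceidratio", // assumed default
		SamplerArg: "0.1",                      // assumed default
	}
	override := func(key string, dst *string) {
		if v, ok := lookup(key); ok && v != "" {
			*dst = v
		}
	}
	override("OTEL_EXPORTER_OTLP_ENDPOINT", &cfg.Endpoint)
	override("OTEL_TRACES_SAMPLER", &cfg.Sampler)
	override("OTEL_TRACES_SAMPLER_ARG", &cfg.SamplerArg)
	return cfg
}

func main() {
	cfg := resolveTraceConfig(os.LookupEnv)
	fmt.Printf("endpoint=%s sampler=%s arg=%s\n", cfg.Endpoint, cfg.Sampler, cfg.SamplerArg)
}
```

Passing the lookup function as a parameter keeps the helper testable without touching the process environment. Whether the chart exposes these as first-class values or leaves them to the env var section is the open question in this thread.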

Contributor

@liu-cong liu-cong left a comment

/lgtm

traceExporter, err := initTraceExporter(ctx, logger)
if err != nil {
	loggerWrap.Handle(fmt.Errorf("%s: %v", "init trace exporter fail", err))
	return nil
Contributor

should this return an error?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 30, 2025
@liu-cong
Contributor

/lgtm cancel

I have a question about the error return and left a comment.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 30, 2025
@Frapschen Frapschen requested a review from liu-cong September 30, 2025 04:20
Contributor

@JeffLuoo JeffLuoo left a comment

Just a few nits from my side. Other than that it LGTM. @liu-cong please give it another look.

var traceExporter sdktrace.SpanExporter
traceExporter, err := stdouttrace.New(stdouttrace.WithPrettyPrint())
if err != nil {
	return nil, fmt.Errorf("fail to create stdouttrace exporter: %w", err)
Contributor

"fail" -> "failed", to align with other error messages in the same file.

if exporterType == "otlp" {
	traceExporter, err = otlptracegrpc.New(ctx, otlptracegrpc.WithInsecure())
	if err != nil {
		return nil, fmt.Errorf("fail to create otlp-grcp exporter: %w", err)
Contributor

ditto: use failed

loggerWrap.Handle(fmt.Errorf("%s: %v", "failed to shutdown TraceProvider", err))
}

logger.Info("trace provider shutting down")
Contributor

Can you also use logging.DEFAULT for the info message?


traceExporter, err := initTraceExporter(ctx, logger)
if err != nil {
	loggerWrap.Handle(fmt.Errorf("%s: %v", "init trace exporter fail", err))
Contributor

nit: init trace exporter failed


sampler = sdktrace.ParentBased(sdktrace.TraceIDRatioBased(fraction))
} else {
loggerWrap.Handle(fmt.Errorf("un supported sampler type: %s, fallback to parentbased_traceidratio with 0.1 Ratio", samplerType))
Contributor

nit: Unsupported without space.
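
As an aside, the fallback behaviour in this branch can be sketched on its own. The sketch assumes the sampler type and ratio arrive as strings, for example from OTEL_TRACES_SAMPLER and OTEL_TRACES_SAMPLER_ARG. The samplerChoice type and chooseSampler helper are hypothetical, and the range check on the ratio is an added assumption. Only the fallback to parentbased_traceidratio with a 0.1 ratio is taken from the log message above.

```go
package main

import (
	"fmt"
	"strconv"
)

const defaultRatio = 0.1 // fallback ratio named in the log message above

// samplerChoice is a hypothetical, library-free description of the sampler
// that would be built.
type samplerChoice struct {
	Name  string
	Ratio float64
}

// chooseSampler returns the requested sampler when it is supported and its
// ratio parses into [0, 1]. Otherwise it returns the fallback together with
// an error describing why, so the caller can log it.
func chooseSampler(samplerType, arg string) (samplerChoice, error) {
	fallback := samplerChoice{Name: "parentbased_traceidratio", Ratio: defaultRatio}
	if samplerType != "parentbased_traceidratio" {
		return fallback, fmt.Errorf(
			"unsupported sampler type: %s, falling back to parentbased_traceidratio with ratio %g",
			samplerType, defaultRatio)
	}
	ratio, err := strconv.ParseFloat(arg, 64)
	// The negated range check also rejects NaN, which ParseFloat accepts.
	if err != nil || !(ratio >= 0 && ratio <= 1) {
		return fallback, fmt.Errorf("invalid sampler ratio %q, falling back to %g", arg, defaultRatio)
	}
	return samplerChoice{Name: samplerType, Ratio: ratio}, nil
}

func main() {
	choice, err := chooseSampler("always_on", "1")
	if err != nil {
		fmt.Println("warning:", err)
	}
	fmt.Printf("sampler=%s ratio=%g\n", choice.Name, choice.Ratio)
}
```

Returning both the fallback and the error lets the caller keep tracing enabled while still surfacing the misconfiguration in the logs.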

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Frapschen, JeffLuoo
Once this PR has been reviewed and has the lgtm label, please assign danehans for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.