[elixir] Performance powered by OTel #538

Open
1 task
Tracked by #4
sl0thentr0py opened this issue Mar 10, 2023 · 20 comments

@sl0thentr0py
Member

sl0thentr0py commented Mar 10, 2023

@krainboltgreene so there are several things that would need to be done to get performance tracing working with sentry-elixir. So far this SDK has been maintained by the community, so it is not on par with what we call our Unified API, which provides common abstractions across all Sentry SDKs.

I will make 2 lists below, one of the bare minimum that would lead to quick opentelemetry based performance and the other more 'ideal' list where we make the SDK feature compatible in terms of performance with other SDKs.

Required

Optional

  • The Elixir SDK is also missing the Hub (for concurrency) and Scope abstractions, but technically they can be ignored if a quick path to OpenTelemetry support is desired.

I will make a new issue out of this to track it in sentry-elixir. Unfortunately, this is a non-trivial amount of development work, so I can't give you very clear cut instructions on how you can contribute, but feel free to try stuff out and make a PR if you're interested and we can collaborate.

If there is sufficient interest from the community, we can also potentially prioritize me working on this as well next quarter.

Originally posted by @sl0thentr0py in getsentry/sentry#40712 (reply in thread)

Other notes

@tsloughter

What does "Hub" for concurrency in Elixir look like? It looks to be a context store? Like how in OpenTelemetry we use the process dictionary?

Does this mean Sentry does not intend to adopt the OpenTelemetry API?

Also, what does "opentelemetry based performance" mean here? Based on what I read of the SpanProcessor model, it doesn't read like you'd be relying on the Otel SDK for span operations, so if you found performance better with OpenTelemetry, I wouldn't be sure that's the case when combined with the SpanProcessor. But I don't yet fully understand what it is doing, so I may be wrong there :)

@sl0thentr0py
Member Author

What does "Hub" for concurrency in Elixir look like? It looks to be a context store? Like how in OpenTelemetry we use the process dictionary?

something like that yea, basically the Hub needs to be cloned per concurrency unit, so we'll need to spec out what that looks like for elixir.
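To illustrate what "cloned per concurrency unit" could mean, here is a minimal sketch. It is not the Sentry Hub API: the `Hub` class, its `tags` field, and `run_with_cloned_hub` are hypothetical names, and Python threads stand in for whatever the concurrency unit ends up being in Elixir (presumably a process, but that is an assumption).

```python
import copy
import threading
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Hub:
    """Hypothetical context store holding per-unit scope data."""
    tags: Dict[str, str] = field(default_factory=dict)

    def clone(self) -> "Hub":
        # Deep-copy so changes in the clone never leak back to the parent.
        return Hub(tags=copy.deepcopy(self.tags))


def run_with_cloned_hub(parent: Hub, name: str, results: Dict[str, Dict[str, str]]) -> None:
    # Each concurrency unit works on its own clone of the parent hub.
    hub = parent.clone()
    hub.tags["worker"] = name
    results[name] = hub.tags


main_hub = Hub(tags={"release": "1.0"})
results: Dict[str, Dict[str, str]] = {}
threads = [
    threading.Thread(target=run_with_cloned_hub, args=(main_hub, name, results))
    for name in ("a", "b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker sees the parent's data at clone time, while the parent hub stays untouched by what the workers add.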

Does this mean Sentry does not intend to adopt the OpenTelemetry API?

not directly, we have our own Tracing model and Ingestion so we will only support OpenTelemetry indirectly via the SpanProcessor pathway.

it doesn't read like you'd be relying on the Otel SDK for span operations

we'd rely on the Otel SDK for instrumenting and recording spans but we need to convert them to the Sentry model to be able to ingest and store them on our side, this is what the SpanProcessor would do.

cc @smeubank for high-level product design discussion ^

@tsloughter

Now I see, so the SpanProcessor updates a global store of SpanId->SentrySpan and then OnEnd will update that based on the finished OpenTelemetry Span.
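A rough sketch of that flow, under stated assumptions: `OtelSpan`, `SentrySpan`, and `SketchSpanProcessor` are hypothetical stand-ins written for this illustration, not the OpenTelemetry or Sentry types, and the field names are guesses at the minimum each model would carry.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class OtelSpan:
    """Stand-in for an OpenTelemetry span (hypothetical shape)."""
    span_id: str
    trace_id: str
    parent_span_id: Optional[str]
    name: str
    start_ns: int
    end_ns: Optional[int] = None


@dataclass
class SentrySpan:
    """Stand-in for the Sentry span model (hypothetical fields)."""
    span_id: str
    trace_id: str
    parent_span_id: Optional[str]
    description: str
    start_timestamp: float
    timestamp: Optional[float] = None


class SketchSpanProcessor:
    def __init__(self) -> None:
        # Store of span_id -> SentrySpan for spans still in flight.
        self.store: Dict[str, SentrySpan] = {}
        # Converted spans that are ready to be sent.
        self.finished: List[SentrySpan] = []

    def on_start(self, span: OtelSpan) -> None:
        # Create the parallel Sentry span as soon as the OTel span starts.
        self.store[span.span_id] = SentrySpan(
            span_id=span.span_id,
            trace_id=span.trace_id,
            parent_span_id=span.parent_span_id,
            description=span.name,
            start_timestamp=span.start_ns / 1e9,
        )

    def on_end(self, span: OtelSpan) -> None:
        # Update the stored span from the finished OTel span and release it.
        sentry_span = self.store.pop(span.span_id, None)
        if sentry_span is None or span.end_ns is None:
            return
        sentry_span.timestamp = span.end_ns / 1e9
        self.finished.append(sentry_span)


processor = SketchSpanProcessor()
otel_span = OtelSpan("a1", "t1", None, "GET /", start_ns=1_000_000_000)
processor.on_start(otel_span)
otel_span.end_ns = 2_000_000_000
processor.on_end(otel_span)
```

Removing the entry in `on_end` keeps the store from growing with finished spans; a real processor would also need to handle spans that never end.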

@thbar

thbar commented Jun 19, 2023

If there is sufficient interest from the community, we can also potentially prioritize me working on this as well next quarter.

I'm not sure how to provide feedback on that, but we (at https://transport.data.gouv.fr) would be very interested to see performance monitoring supported for Elixir.

@josevalim
Contributor

For what it's worth, the Ruby one seems to be done via OpenTelemetry: https://docs.sentry.io/platforms/ruby/performance/instrumentation/opentelemetry/ ?

@tsloughter

@josevalim I think it has both. And the OpenTelemetry option is awkward to implement -- uses a span processor to basically do a parallel tracking of spans. I don't know that there will be another option other than the processor though as long as Sentry requires the implementation to create Transactions.

@hkrutzer

There is also a JS implementation using OpenTelemetry, indeed using a span processor.

@sl0thentr0py
Member Author

as long as Sentry requires the implementation to create Transactions.

@tsloughter @josevalim we have an ongoing project to move away from our Transaction model gradually on the ingestion side, will keep this thread updated when we ship something production ready.

@jwaldrip

Any update on this?

@whatyouhide
Collaborator

@jwaldrip no, and we'll post updates if there are any, no worries!

@sl0thentr0py
Member Author

I will actually start writing a spec for it this week!

@sl0thentr0py
Member Author

oki, current status of sentry ingestion of otlp traces follows!

Spec

Business concerns

  • this will be shipped experimentally and as an alpha feature for starters, we will stabilize the feature and pricing somewhere at the end of Q2
  • elixir and node will be used as testing grounds for this new ingestion capability

Elixir SDK implications

  • elixir will not ship the old Transaction model like other SDKs at all, it will directly leverage OTLP and OpenTelemetry instrumentation
  • we will still want the Sentry SDK to be installed to set up the DSN / ingestion endpoint / trace exporter
  • the Sentry SDK setup will also take care of configuring sane defaults for the OpenTelemetry SDK for an 'out of the box' experience
  • also other things like sampling / trace propagation TBD - can be ignored for an MVP

I will start playing around with OpenTelemetry SDKs and exporters this week and update once ingestion works end-end.

whatyouhide added a commit that referenced this issue Feb 3, 2024
This was not released yet. See #538 (comment).
@sl0thentr0py
Member Author

sl0thentr0py commented Feb 12, 2024

update:
I have now added protobuf ingestion support because the elixir/erlang opentelemetry exporters only had http/protobuf support and not json.
getsentry/relay#3044

I will now test ingestion, since this is in production.

@whatyouhide
Copy link
Collaborator

@sl0thentr0py lol this is fantastic news, I had already started working on the JSON export support 😄

@smeubank smeubank changed the title Investigate quickest path to otel + performance tracing [elixir] Performance powered by OTel Feb 20, 2024
@whatyouhide whatyouhide changed the title [elixir] Performance powered by OTel Performance powered by OTel Feb 24, 2024
@smeubank smeubank changed the title Performance powered by OTel [elixir] Performance powered by OTel Mar 6, 2024
@sl0thentr0py sl0thentr0py assigned solnic and unassigned sl0thentr0py Sep 2, 2024
@jwaldrip

jwaldrip commented Dec 4, 2024

Any progress on this? We would love to trace things back to our API. :-)

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 4, 2024
@solnic
Collaborator

solnic commented Dec 5, 2024

@jwaldrip hey Jason, yes! We're working on wrapping up #784 where traces are working via an OTel span processor, we're close! 😄

@jwaldrip

Just checking in here. Our team is about to go down the distributed tracing route and we would love to use Sentry. But we really can't afford to wait any longer for the feature to land.

@solnic
Collaborator

solnic commented Mar 13, 2025

@jwaldrip hey Jason, I'm sorry it's taking so long. There are many moving pieces so it was a challenge to coordinate work. Coincidentally you asked about it on the same day when I got back to working on #853 PR! There's a rough high-level TODO there in the description. I'm currently in the process of revisiting updated opentelemetry deps and making sure that things work as expected. We're going to decide tomorrow how exactly we want to ship it, which parts will go first (most likely Phoenix/Ecto would be a high prio) and how we want to package it nicely so that it's a solid Sentry "works OOTB" experience with no friction 😄 You can watch progress on #853 for the time being. I'll give another update tomorrow with more details!

@smeubank
Member

Sentry Elixir SDK with OpenTelemetry Support

This issue outlines the prioritized product feature requirements for the Sentry Elixir SDK to support OpenTelemetry under the hood. Breaking down the work into priority levels will help us ship iteratively and deliver value incrementally.

P0 (Critical)

  • OpenTelemetry Span Processor Support

    • Implement a Sentry span processor [develop spec]
    • Support capturing OTel spans and converting them to Sentry span model
  • Trace Propagation

    • Support W3C (?? baggage) Trace Context propagation [develop spec]
    • Connect traces between services (incoming and outgoing)
    • Maintain context across service boundaries
    • Ensure distributed tracing works with other OpenTelemetry-instrumented services
    • Tracing: W3C Trace Context Support team-sdks#41
      • not sure latest status on the W3C part tbh, but implementation detail
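For reference, the W3C Trace Context `traceparent` header has the form `version-traceid-parentid-flags`, in lowercase hex. The sketch below parses an incoming header and builds the outgoing one for a child span. It handles only version `00`, ignores `tracestate` and baggage, and the names `TraceContext`, `parse_traceparent`, and `build_traceparent` are hypothetical, not part of any SDK.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceContext(NamedTuple):
    trace_id: str
    parent_span_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse an incoming traceparent header; return None if it is invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Simplification: this sketch accepts only version 00.
    if version != "00":
        return None
    # All-zero trace and span IDs are invalid per the spec.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return TraceContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def build_traceparent(ctx: TraceContext, span_id: str) -> str:
    """Build the outgoing header: same trace ID, current span as parent."""
    flags = "01" if ctx.sampled else "00"
    return f"00-{ctx.trace_id}-{span_id}-{flags}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
ctx = parse_traceparent(incoming)
outgoing = build_traceparent(ctx, "b7ad6b7169203331")
```

Keeping the trace ID and sampled flag while swapping in the current span ID is what connects the incoming and outgoing sides of a service.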

P1 (Must Have)

  • Tracing with Other Signals
    • Connect traces with errors (primary integration point)
    • Link exceptions to the current span/transaction
    • Support breadcrumbs derived from spans
    • Crons might be the only other signal form, for elixir?

P2 (Should have)

  • Performance Insights
    • Implement support for Sentry Performance Modules
    • Add span attributes for database operations
    • Add span attributes for HTTP client/server operations
    • Support for parameterized transactions/routes [browser develop spec did not find a BE focused one right away]

Implementation Considerations

  • Provide clear adoption path for existing Sentry Elixir SDK users, and existing OTel SDK users
  • Document integration patterns with popular Elixir frameworks (Phoenix, etc.)
  • Provide examples for manual and automatic instrumentation
  • Update Elixir Product Onboarding - project creation

@solnic
Collaborator

solnic commented Mar 14, 2025

Alright I closed the other two PRs where discovery work took place but they were too big to review/merge. I've added #874 as a sub-issue here and started working on it under #875 which simply adds SpanProcessor and will be a much smaller and focused change. It covers the first item from the P0 requirements.
