-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Refactor open telemetry plugin #328
base: main
Are you sure you want to change the base?
Conversation
@@ -0,0 +1,67 @@ | |||
// This must be run first. Node uses patching on implementations to inject telemetry so must run before the packages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would not live inside the plugin as it is server setup for otel.
The way otel is setup is by patching of libraries such as fetch, for this reason it needs to run outside of the plugin and be one imported before most packages.
api.context.setGlobalContextManager(contextManager); | ||
|
||
const exporter = new OTLPTraceExporter({ | ||
url: "todo:configuration", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would need to drive this from configuration. I have done this within our gateway implementation but that has a different approach to the hive-gateway to config so Im not sure of the best approach.
@@ -0,0 +1,45 @@ | |||
import {ASTNode, DocumentNode, Kind, visit} from "graphql/index"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: test this
], | ||
logging: false, | ||
}); | ||
it('query should add spans', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: more extensive testing.
Im not sure on the best approach to testing this as a gateway, i.e. testing it calls a subgraph.
Query: { | ||
hello: () => 'World', | ||
ping: () => { | ||
expect(api.context.active()).not.toEqual(api.ROOT_CONTEXT); // proves that the context is propagated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to figure out a better way to test this.
Summary
This is a refactor to make the opentelemetry plugin in the hive gateway conform closer to the opentelemetry api spec specifically around how context is propagated within the node sdk.
The goal is to allow the gateway to take advantage of the instrumentations available in the otel libraries, for example http, dns, grpc etc. The big benefit of this is it allows custom plugins to benefit from the instrumentations which is a requirement that we have, where we create a grpc request within the context phase of GraphQL.
I put some effort into removing potentially sensitive data from traces, for example error messages. Im not sure on the best approach perhaps an option could be passed in to support this if its a requirement for others.
In otel for node it uses async local storage to wrap functions to propagate the context, this PR introduces the wrapping of the different phases of GraphQL so that any span has the correct trace parent. This can be seen when using http instrumentation, because the execute phase is wrapped using api.with the subsequent fetch will have the correct trace parent.
Questions to answer.
TODO: