-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenTelemetry instrumentation for RBAC #771
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disclaimer: I am not very familiar with the rbac service and opentelemetry.
Have we done any performance testing around these changes? Is this in the hot path? I would think so, as we appear to be installing a web server middleware.
If something goes wrong, how can we quickly/easily disable this? Is it as simple as setting OTLP_ENABLED=false? Is there an impact on performance with OTLP_ENABLED=false?
If something goes really really wrong, we need to be able to roll this change back out. I would like to see this change rebased / squashed into a single commit to make it easier to revert this change if needed.
I think we need to add the config env vars to the deploy/rbac-clowdapp.yml?
This change seems to be logging/sending the access-policy (and page?) to the telemetry server. How big are the access-policies?
@@ -0,0 +1,395 @@ | |||
# Copyright The OpenTelemetry Authors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file appears to come from here, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return self.get_paginated_response(page) | ||
return Response({"data": access_policy}) | ||
|
||
tracer = trace.get_tracer(__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ok to use the trace/tracer object when tracing is disabled (OTLP_ENABLED=false)? I assume that is fine and the trace object just becomes a NullObject in that case.
@@ -16,3 +16,12 @@ API_PATH_PREFIX=/api/rbac | |||
|
|||
DEVELOPMENT=True | |||
# DJANGO_DEBUG=True | |||
|
|||
OTLP_ENABLED=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit-picky (i can't help it...i'm sorry!) I kind of like being overly verbose on naming these types of variables. Maybe something like OPEN_TELEMETRY_ENABLED? My brain wires up "OTLP" to Online Transaction Processing (OLTP vs OTLP). :)
|
||
tracer = trace.get_tracer(__name__) | ||
with tracer.start_as_current_span("access-get"): | ||
span = trace.get_current_span() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to get the span from the context manager in the with
clause instead of calling span = trace.get_current_span()
? Something like this:
with tracer.start_as_current_span("access-get") as span:
span.add_event("access_policy", {"policy": '-none-' if access_policy is None else access_policy}) | ||
if access_policy is None: | ||
# Wrap in a span of its own | ||
with tracer.start_as_current_span("access-policy-none"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more clear to name this span "access-policy-cache-miss" (if i understand the code correctly).
Like above it might be possible to get a reference to the child span as part of the with
clause?
/retest |
/rebuild |
@pilhuhn is this PR something we still want to get merged in? jw if its stale and can/should be closed or if some one should pick it back up |
@dagbay-rh you should probably talk to @gmcculloug. I personally think it needs to be updated and merged |
@lphiri is this process moving forward platform wide? @pilhuhn I would want to hold off on this until I knew what direction the platform was moving on the Open Telemetry front. |
Link(s) to Jira
Description of Intent of Change(s)
Support Distributed Tracing by instrumenting RBAC with OpenTelemetry instrumentation
Local Testing
How can the feature be exercised?
pip install
# Temporary patch until the Django2 instrumentation supports ASGI (or we move to Django 3+)
cp patch/otel_middleware.py ${APP_ROOT}/lib64/python3.8/site-packages/opentelemetry/instrumentation/django/middleware/
OTLP_ENABLED=True
, run rbac and see traces in JaegerDockerfile has the patching -- this is temporarily needed until we either move to Django3+ or open-telemetry/opentelemetry-python-contrib#1416 is fixed.
Checklist
Secure Coding Practices Checklist Link
Secure Coding Practices Checklist