Add Google ADK sample agent A365 observability wiring (PoC) #326
Draft: evanmitchellgithub wants to merge 6 commits into main from users/evanmitchell/google-adk-observability
Changes from 3 commits
Commits
2adcd72 Add Google ADK sample agent A365 observability wiring (PoC) (evanmitchellmicrosoft)
e08b7b5 Remove temporary OTLP egress diagnostics and bump vulnerable deps (evanmitchellmicrosoft)
bc5ac06 Pin cryptography==48.0.1 (fixes GHSA-537c-gmf6-5ccf, stays <49 for py… (evanmitchellmicrosoft)
e629b13 Address review: thread-safe token cache with TTL eviction; add licens… (evanmitchellmicrosoft)
f5770ed Remove _freeze.py dev-only requirements generator from sample (evanmitchellmicrosoft)
c27b914 docs: document Cloud Run deploy script and A365 observability exporte… (evanmitchellmicrosoft)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # Cloud Run source upload allowlist. | ||
| # Ignore everything, then re-include only what the container needs at build/run. | ||
| # This forces the pip buildpack (requirements.txt) and avoids uploading the | ||
| # uv.lock (stale, missing mcp), the local .venv, secrets (.env), and a365 config. | ||
| * | ||
| | ||
| !*.py | ||
| !Procfile | ||
| !requirements.txt | ||
| !.python-version | ||
| !ToolingManifest.json | ||
| | ||
| # Re-exclude helper/local-only python files matched by !*.py above. | ||
| _freeze.py |
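
The allowlist above (ignore everything, re-include specific files, then re-exclude one helper) depends on the last matching rule winning, as in gitignore-style files. A minimal Python sketch of that evaluation order follows. It assumes top-level filenames only and uses `fnmatch` as a stand-in for the real matcher; `RULES` and `is_uploaded` are hypothetical names for illustration, not how gcloud itself is implemented.

```python
from fnmatch import fnmatchcase

# Rules in file order; a leading "!" re-includes a previously ignored path.
RULES = [
    "*",
    "!*.py",
    "!Procfile",
    "!requirements.txt",
    "!.python-version",
    "!ToolingManifest.json",
    "_freeze.py",
]


def is_uploaded(filename: str, rules=RULES) -> bool:
    """Return True if the last rule matching ``filename`` re-includes it."""
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if fnmatchcase(filename, pattern):
            # Later rules override earlier ones.
            ignored = not negated
    return not ignored
```

Under these simplified rules, `main.py` and `Procfile` would be uploaded, while `.env`, `uv.lock`, and `_freeze.py` would not.
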
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| 3.13 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| web: python main.py |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| import importlib.metadata as m | ||
| | ||
| skip = {"sample-google-adk", "pip", "setuptools", "wheel"} | ||
| # Windows-only packages present in the local venv that have no Linux build. | ||
| skip |= {"pywin32", "pywin32-ctypes", "pypiwin32", "pywinpty", "winsdk", "windows-curses"} | ||
| lines = [] | ||
| for d in m.distributions(): | ||
|     name = d.metadata["Name"] | ||
|     if not name or name.lower() in skip: | ||
|         continue | ||
|     lines.append(f"{name}=={d.version}") | ||
| lines = sorted(set(lines), key=str.lower) | ||
| with open("requirements.txt", "w", encoding="utf-8") as f: | ||
|     f.write("\n".join(lines) + "\n") | ||
| print("count", len(lines)) | ||
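
The generator above compares `name.lower()` against the skip set, so a distribution whose metadata spells its name with underscores or dots (for example `pywin32_ctypes`) would not be skipped. The sketch below shows the same freeze logic as a pure function with PEP 503-style name normalization. `normalize` and `freeze_lines` are hypothetical helpers for illustration and are not part of the PR.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name as PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def freeze_lines(dists, skip):
    """Build sorted ``name==version`` pins from (name, version) pairs.

    ``dists`` is any iterable of (name, version) tuples; ``skip`` holds
    names to omit, compared after normalization.
    """
    skip_normalized = {normalize(s) for s in skip}
    lines = {
        f"{name}=={version}"
        for name, version in dists
        if name and normalize(name) not in skip_normalized
    }
    return sorted(lines, key=str.lower)
```

It could be fed from the installed environment with `freeze_lines(((d.metadata["Name"], d.version) for d in importlib.metadata.distributions()), skip)`.
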
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| # Deploy the Google ADK A365 sample agent to GCP Cloud Run. | ||
| # | ||
| # Reads non-secret + secret env from the local .env (gitignored), applies the | ||
| # production overrides needed for the A365 observability exporter, and deploys | ||
| # via Cloud Run source buildpacks (Procfile -> `python main.py`). | ||
| # | ||
| # Cloud Run automatically injects PORT and K_SERVICE; main.py reads both, so the | ||
| # JWT middleware + production host binding engage with no extra config. | ||
| # | ||
| # Usage: | ||
| #   .\deploy-cloudrun.ps1 -ProjectId <gcp-project-id> [-Region us-central1] [-ServiceName gcp-a365-agent] | ||
| | ||
| param( | ||
|     [Parameter(Mandatory = $true)] [string] $ProjectId, | ||
|     [string] $Region = "us-central1", | ||
|     [string] $ServiceName = "gcp-a365-agent" | ||
| ) | ||
| | ||
| $ErrorActionPreference = "Stop" | ||
| Set-Location $PSScriptRoot | ||
| | ||
| if (-not (Test-Path ".env")) { throw ".env not found in $PSScriptRoot" } | ||
| | ||
| # Production overrides applied on top of .env. PORT is intentionally omitted | ||
| # (Cloud Run sets it). AUTH_HANDLER_NAME=AGENTIC turns on agentic token exchange. | ||
| $overrides = [ordered]@{ | ||
|     "AUTH_HANDLER_NAME" = "AGENTIC" | ||
|     "ENABLE_OBSERVABILITY" = "true" | ||
|     "ENABLE_A365_OBSERVABILITY_EXPORTER" = "true" | ||
|     "PYTHON_ENVIRONMENT" = "production" | ||
| } | ||
| | ||
| # Parse .env into an ordered map (skip comments, blanks, and PORT). | ||
| $envMap = [ordered]@{} | ||
| foreach ($line in Get-Content ".env") { | ||
|     $trimmed = $line.Trim() | ||
|     if ($trimmed -eq "" -or $trimmed.StartsWith("#")) { continue } | ||
|     $idx = $trimmed.IndexOf("=") | ||
|     if ($idx -lt 1) { continue } | ||
|     $key = $trimmed.Substring(0, $idx).Trim() | ||
|     $val = $trimmed.Substring($idx + 1).Trim() | ||
|     if ($key -eq "PORT") { continue } | ||
|     $envMap[$key] = $val | ||
| } | ||
| foreach ($k in $overrides.Keys) { $envMap[$k] = $overrides[$k] } | ||
| | ||
| # Build the env-vars string using a custom delimiter (^##^) so values containing | ||
| # commas, slashes, colons, etc. are passed verbatim to gcloud. | ||
| $pairs = @() | ||
| foreach ($k in $envMap.Keys) { $pairs += "$k=$($envMap[$k])" } | ||
| $envArg = "^##^" + ($pairs -join "##") | ||
| | ||
| Write-Host "Deploying '$ServiceName' to project '$ProjectId' ($Region) with $($envMap.Count) env vars..." -ForegroundColor Cyan | ||
| | ||
| # --no-cpu-throttling (CPU always allocated) is REQUIRED: the OTel BatchSpanProcessor | ||
| # exports genAI spans on a background thread AFTER the turn returns. With default CPU | ||
| # throttling, that thread wakes on a frozen CPU and its TLS read stalls -> the gateway | ||
| # drops the connection (SSL UNEXPECTED_EOF_WHILE_READING) and spans are lost. | ||
| gcloud run deploy $ServiceName ` | ||
|     --source . ` | ||
|     --project $ProjectId ` | ||
|     --region $Region ` | ||
|     --platform managed ` | ||
|     --allow-unauthenticated ` | ||
|     --no-cpu-throttling ` | ||
|     --set-env-vars $envArg | ||
| | ||
| if ($LASTEXITCODE -ne 0) { throw "gcloud run deploy failed (exit $LASTEXITCODE)" } | ||
| | ||
| $url = gcloud run services describe $ServiceName --project $ProjectId --region $Region --format "value(status.url)" | ||
| Write-Host "" | ||
| Write-Host "Deployed. Service URL: $url" -ForegroundColor Green | ||
| Write-Host "Messaging endpoint: $url/api/messages" -ForegroundColor Green | ||
| Write-Host "" | ||
| Write-Host "Next: set messagingEndpoint in a365.config.json to the above, then run 'a365 setup all'." -ForegroundColor Yellow |
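
The `.env` parsing and the delimiter-joined `--set-env-vars` value are the core of the script above. A rough Python equivalent is sketched here so the behaviour can be checked in isolation. `parse_env_lines` and `build_env_arg` are hypothetical names, and the check that rejects values containing the delimiter is an added safeguard that the script itself does not perform.

```python
def parse_env_lines(lines, skip_keys=("PORT",)):
    """Parse KEY=VALUE lines, skipping blanks, comments, and skip_keys."""
    env = {}
    for line in lines:
        trimmed = line.strip()
        if not trimmed or trimmed.startswith("#"):
            continue
        idx = trimmed.find("=")
        if idx < 1:  # no "=" at all, or an empty key
            continue
        key = trimmed[:idx].strip()
        value = trimmed[idx + 1:].strip()
        # PowerShell's -eq is case-insensitive, so compare in upper case.
        if key.upper() in skip_keys:
            continue
        env[key] = value
    return env


def build_env_arg(env, overrides, delimiter="##"):
    """Apply overrides on top of env and join pairs with a custom delimiter."""
    merged = {**env, **overrides}
    for key, value in merged.items():
        # Assumption: a value containing the delimiter would be split wrongly.
        if delimiter in value:
            raise ValueError(f"{key} contains the delimiter {delimiter!r}")
    pairs = [f"{key}={value}" for key, value in merged.items()]
    return f"^{delimiter}^" + delimiter.join(pairs)
```

As in the script, values are passed through verbatim, so any quotes around a value in `.env` would end up in the deployed variable.
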
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| # Copyright (c) Microsoft Corporation. | ||
| # Licensed under the MIT License. | ||
| | ||
| """ | ||
| Span operation-name remap for the Google ADK sample agent. | ||
| | ||
| Google ADK inference spans do not carry an operation name that Microsoft | ||
| Agent 365 / Maven ingestion accepts. With the optional | ||
| ``opentelemetry-instrumentation-google-genai`` package installed, the LLM call | ||
| is tagged ``gen_ai.operation.name = "generate_content"`` (the Google GenAI | ||
| semantic convention); without it, ADK's own ``call_llm`` span sets no | ||
| ``gen_ai.operation.name`` at all. Ingestion only accepts four operation names | ||
| (``invoke_agent``, ``execute_tool``, ``chat`` and ``output_messages``) and | ||
| drops every other span before fan-out. As a result the ADK inference span | ||
| (model, token usage, finish reason) never reaches Maven. | ||
| | ||
| This module rewrites the operation name to ``chat`` on export using the | ||
| A365 observability SDK's public enricher hook (``register_span_enricher``), | ||
| so the inference span maps onto Maven's InferenceCall table. The original | ||
| span is never mutated; an :class:`EnrichedReadableSpan` overlay is returned | ||
| with the single attribute overridden. | ||
| | ||
| No changes to the A365 SDK or to Maven are required. | ||
| """ | ||
| | ||
| import logging | ||
| | ||
| from opentelemetry.sdk.trace import ReadableSpan | ||
| | ||
| from microsoft_agents_a365.observability.core.constants import ( | ||
|     CHAT_OPERATION_NAME, | ||
|     EXECUTE_TOOL_OPERATION_NAME, | ||
|     GEN_AI_OPERATION_NAME_KEY, | ||
|     INVOKE_AGENT_OPERATION_NAME, | ||
|     OUTPUT_MESSAGES_OPERATION_NAME, | ||
| ) | ||
| from microsoft_agents_a365.observability.core.exporters.enriched_span import ( | ||
|     EnrichedReadableSpan, | ||
| ) | ||
| from microsoft_agents_a365.observability.core.exporters.enriching_span_processor import ( | ||
|     get_span_enricher, | ||
|     register_span_enricher, | ||
|     unregister_span_enricher, | ||
| ) | ||
| | ||
| logger = logging.getLogger(__name__) | ||
| | ||
| # The Google GenAI semantic-convention operation name (emitted only when the | ||
| # optional ``opentelemetry-instrumentation-google-genai`` package is installed). | ||
| _SOURCE_OPERATION_NAME = "generate_content" | ||
| | ||
| # Attribute set by ADK's ``trace_call_llm`` on the inference span. ADK does NOT | ||
| # set ``gen_ai.operation.name`` on this span, so without a remap it is dropped by | ||
| # the Agent 365 exporter (which only keeps invoke_agent/execute_tool/chat/ | ||
| # output_messages). Presence of this attribute identifies an inference call. | ||
| _GEN_AI_REQUEST_MODEL_KEY = "gen_ai.request.model" | ||
| | ||
| # Operation names the Agent 365 exporter already considers eligible. A span that | ||
| # already carries one of these must never be relabelled. | ||
| _RECOGNIZED_OPERATION_NAMES = frozenset( | ||
|     { | ||
|         INVOKE_AGENT_OPERATION_NAME, | ||
|         EXECUTE_TOOL_OPERATION_NAME, | ||
|         OUTPUT_MESSAGES_OPERATION_NAME, | ||
|         CHAT_OPERATION_NAME, | ||
|     } | ||
| ) | ||
| | ||
| | ||
| def _remap_generate_content_to_chat(span: ReadableSpan) -> ReadableSpan: | ||
|     """Map an ADK / Google GenAI inference span onto the ``chat`` operation. | ||
| | ||
|     Two shapes are handled: | ||
| | ||
|     1. ``gen_ai.operation.name == "generate_content"``, emitted by the optional | ||
|        ``opentelemetry-instrumentation-google-genai`` package. | ||
|     2. ADK's own ``call_llm`` span, which sets ``gen_ai.request.model`` but no | ||
|        ``gen_ai.operation.name`` at all. This is the default for google-adk | ||
|        without the genai instrumentation package, and is the case in this | ||
|        sample. | ||
| | ||
|     Any span that already carries a recognized operation name (invoke_agent, | ||
|     execute_tool, chat, output_messages) is returned unchanged. | ||
|     """ | ||
|     attributes = span.attributes or {} | ||
|     operation_name = attributes.get(GEN_AI_OPERATION_NAME_KEY) | ||
| | ||
|     # Never relabel a span that already has an eligible operation name. | ||
|     if operation_name in _RECOGNIZED_OPERATION_NAMES: | ||
|         return span | ||
| | ||
|     is_genai_generate_content = operation_name == _SOURCE_OPERATION_NAME | ||
|     is_adk_inference_span = ( | ||
|         operation_name is None | ||
|         and attributes.get(_GEN_AI_REQUEST_MODEL_KEY) is not None | ||
|     ) | ||
|     if not (is_genai_generate_content or is_adk_inference_span): | ||
|         return span | ||
| | ||
|     return EnrichedReadableSpan( | ||
|         span, | ||
|         extra_attributes={GEN_AI_OPERATION_NAME_KEY: CHAT_OPERATION_NAME}, | ||
|     ) | ||
| | ||
| | ||
| def register_generate_content_remap() -> None: | ||
|     """Register the ``generate_content`` -> ``chat`` enricher with the SDK. | ||
| | ||
|     The SDK allows a single enricher at a time. If another enricher is already | ||
|     registered (e.g. a platform instrumentor), this composes with it: the | ||
|     existing enricher runs first, then the remap is applied to its result. | ||
|     Safe to call once during application startup, after ``configure()``. | ||
|     """ | ||
|     existing = get_span_enricher() | ||
| | ||
|     if existing is None: | ||
|         enricher = _remap_generate_content_to_chat | ||
|     else: | ||
|         def enricher(span: ReadableSpan) -> ReadableSpan: | ||
|             return _remap_generate_content_to_chat(existing(span)) | ||
| | ||
|         # Replace the existing single-slot enricher with the composed one. | ||
|         unregister_span_enricher() | ||
| | ||
|     register_span_enricher(enricher) | ||
|     logger.info( | ||
|         "Registered span enricher: %s -> %s remap", | ||
|         _SOURCE_OPERATION_NAME, | ||
|         CHAT_OPERATION_NAME, | ||
|     ) |
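
The decision logic and the single-slot composition above can be exercised without the A365 SDK by working on plain attribute dicts. The following is a minimal sketch under stated assumptions: the literal key `"gen_ai.operation.name"` is assumed to be the value of `GEN_AI_OPERATION_NAME_KEY`, the four operation names are taken from the docstring, and `remap_to_chat` and `compose` are hypothetical stand-ins that return a new dict where the module returns an `EnrichedReadableSpan` overlay.

```python
from typing import Callable, Mapping, Optional

# Assumed value of GEN_AI_OPERATION_NAME_KEY in the SDK constants module.
OPERATION_NAME_KEY = "gen_ai.operation.name"
REQUEST_MODEL_KEY = "gen_ai.request.model"
RECOGNIZED = frozenset({"invoke_agent", "execute_tool", "chat", "output_messages"})

Attributes = Mapping[str, object]
Enricher = Callable[[Attributes], Attributes]


def remap_to_chat(attributes: Attributes) -> Attributes:
    """Return attributes with the operation name set to "chat" when eligible."""
    operation_name = attributes.get(OPERATION_NAME_KEY)
    if operation_name in RECOGNIZED:
        return attributes
    is_generate_content = operation_name == "generate_content"
    is_adk_inference = (
        operation_name is None and attributes.get(REQUEST_MODEL_KEY) is not None
    )
    if not (is_generate_content or is_adk_inference):
        return attributes
    # Build an overlay instead of mutating the input mapping.
    return {**attributes, OPERATION_NAME_KEY: "chat"}


def compose(existing: Optional[Enricher], remap: Enricher = remap_to_chat) -> Enricher:
    """Run an existing enricher first, then apply the remap to its result."""
    if existing is None:
        return remap
    return lambda attributes: remap(existing(attributes))
```

Spans with an unrelated operation name, or with neither an operation name nor a request model, pass through untouched, which matches the early returns in the module.
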