Skip to content

feat(tasks) Add taskbroker-client metrics for tasks#118052

Merged
markstory merged 8 commits into
masterfrom
feat-taskworker-metrics
Jun 19, 2026
Merged

feat(tasks) Add taskbroker-client metrics for tasks#118052
markstory merged 8 commits into
masterfrom
feat-taskworker-metrics

Conversation

@markstory

Copy link
Copy Markdown
Member

We want to standardize metrics prefixes and tags for taskworkers. Having standardized metrics makes building dashboards and alerting easier for platform teams, and makes documenting metrics for others simpler.

I've put usage of the new metrics backend behind an env var so that I can incrementally enable and validate the new metrics.

Refs STREAM-1174

We want to standardize metrics prefixes and tags for taskworkers. Having
standardized metrics makes building dashboards and alerting easier for
platform teams, and makes documenting metrics for others simpler.

I've put usage of the new metrics backend behind an env var so that
I can incrementally enable and validate the new metrics.

Refs STREAM-1174
@markstory markstory requested review from a team as code owners June 18, 2026 19:20
@linear-code

linear-code Bot commented Jun 18, 2026

Copy link
Copy Markdown

STREAM-1174

@github-actions github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label Jun 18, 2026
Comment thread src/sentry/taskworker/runtime.py Outdated
Comment thread src/sentry/taskworker/runtime.py Outdated
Comment thread src/sentry/taskworker/runtime.py Outdated
metrics_class = DatadogMetrics(
application="sentry",
statsd_host=os.getenv("HOST_IP", "127.0.0.1"),
statsd_port=os.getenv("SENTRY_STATSD_PORT", 8126),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we try to funnel this through our existing django settings? having the envvar be read in many places is not great

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. I'm not thrilled about the extra env var for the port either. Sadly our production metrics are all configured through getsentry and not env vars. I'll rework this to work with the production & self-hosted django settings.

Comment thread src/sentry/taskworker/runtime.py Outdated
statsd_port=port,
sample_rate=settings.SENTRY_METRICS_SAMPLE_RATE,
enable_prefixed_metrics=True,
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Env enabled, metrics silently unchanged

Medium Severity

With USE_TASKWORKER_METRICS=1, if _extract_metrics_config returns a missing host or port (empty options, UDS-only StatsD, dummy backend), the code keeps SentryMetricsBackend and emits no warning. Rollout validation can look enabled while taskworker still uses the old metrics shape.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 6fc270b. Configure here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have options defined in all the environments, and if extracting configuration fails we'll get a log message.

Comment thread src/sentry/taskworker/runtime.py
metric_options = settings.SENTRY_METRICS_OPTIONS["primary_backend_args"]

# Some backends use `host` and others use `statsd_host`
host = metric_options.get("statsd_host", None) or metric_options.get("host", None)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I regret asking for this. but i think architecturally it's probably still better?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a bit gnarly, but better than having more environment variables.

Comment thread src/sentry/taskworker/runtime.py
Comment thread src/sentry/taskworker/runtime.py
Comment thread src/sentry/taskworker/runtime.py
Comment thread src/sentry/taskworker/runtime.py Outdated

@cursor cursor Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 785bf98. Configure here.


# Some backends use `host` and others use `statsd_host`
host = metric_options.get("statsd_host", None) or metric_options.get("host", None)
raw_port = metric_options.get("statsd_port", None) or metric_options.get("port", None)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Host fallback misroutes StatsD

Medium Severity

When USE_TASKWORKER_METRICS is enabled, _extract_metrics_config falls back from statsd_host to host. In DatadogMetricsBackend, host is the metric reporting hostname, not the DogStatsD agent. If statsd_host is omitted but host and statsd_port are set, taskworker DatadogMetrics can send to the wrong address while Sentry’s main metrics backend still uses the default agent.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 785bf98. Configure here.

@markstory markstory merged commit 843db44 into master Jun 19, 2026
64 checks passed
@markstory markstory deleted the feat-taskworker-metrics branch June 19, 2026 15:24
@ceorourke

ceorourke commented Jun 22, 2026

Copy link
Copy Markdown
Member

fyi I am seeing failures when trying to run any migration commands locally after this was merged, e.g. from sentry django migrate sentry 1234

  File "/Users/ceorourke/code/sentry/src/sentry/eventstream/__init__.py", line 5, in <module>
    from .base import EventStream
  File "/Users/ceorourke/code/sentry/src/sentry/eventstream/base.py", line 9, in <module>
    from sentry.tasks.post_process import post_process_group
  File "/Users/ceorourke/code/sentry/src/sentry/tasks/post_process.py", line 28, in <module>
    from sentry.taskworker.namespaces import ingest_errors_postprocess_tasks
  File "/Users/ceorourke/code/sentry/src/sentry/taskworker/namespaces.py", line 3, in <module>
    from sentry.taskworker.runtime import app
  File "/Users/ceorourke/code/sentry/src/sentry/taskworker/runtime.py", line 7, in <module>
    from taskbroker_client.metrics import DatadogMetrics, MetricsBackend
ImportError: cannot import name 'DatadogMetrics' from 'taskbroker_client.metrics' (/Users/ceorourke/code/sentry/.venv/lib/python3.13/site-packages/taskbroker_client/metrics.py)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants