feat(reports): Cache per-project weekly report metrics #116739

Merged
amy-chen23 merged 21 commits into master from amyc/cache-weekly-report-metrics
Jun 10, 2026

amy-chen23 commented Jun 2, 2026

Contributor

Resolves ID-1589.

Goal: Cache per-project total errors and total transactions during weekly report generation so the frontend can display week-over-week percentage change without re-querying Snuba.
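As an illustration of the week-over-week figure mentioned above, here is a minimal sketch of the percentage calculation. The function name and the choice to return `None` when there is no usable baseline are assumptions for illustration, not code from this PR.

```python
from typing import Optional


def week_over_week_change(current: int, previous: Optional[int]) -> Optional[float]:
    """Return the percentage change from previous to current, or None if undefined."""
    # With no cached value for last week, or a zero baseline, the change is undefined.
    if previous is None or previous == 0:
        return None
    return (current - previous) * 100.0 / previous


print(week_over_week_change(150, 120))  # 25.0
print(week_over_week_change(80, 100))   # -20.0
print(week_over_week_change(50, None))  # None
```

Returning `None` keeps "no baseline" distinct from "no change", which a caller would likely want to render differently.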

Redis Cache Layer

  • Caches total errors and total transactions by Org Id, Project Id (no timestamp)
  • 10 day TTL
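The cache layer described above could look roughly like the sketch below. It uses an in-memory stand-in (`FakeRedis`) instead of a real Redis client so that it runs on its own. The `KEY_PREFIX` value, the helper names, and the stand-in class are assumptions for illustration. Only the key shape (org ID and project ID, no timestamp), the `"e"`/`"t"` payload fields, and the 10-day TTL come from the PR.

```python
import json
from typing import Optional

KEY_PREFIX = "weekly_report_metrics"  # hypothetical value; the PR does not show it
CACHE_TTL_SEC = 10 * 24 * 60 * 60  # 10 days, as in the PR


class FakeRedis:
    """In-memory stand-in for a Redis client, just enough for this sketch."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}
        self.now = 0.0  # manual clock, in seconds

    def setex(self, key: str, ttl_sec: int, value: str) -> None:
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, self.now + ttl_sec)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None or entry[1] <= self.now:
            return None  # missing or expired
        return entry[0]


def make_cache_key(org_id: int, project_id: int) -> str:
    # No timestamp: each project has a single "latest value" entry.
    return f"{KEY_PREFIX}:{org_id}:{project_id}"


def write_metrics(client, org_id, project_id, errors, transactions) -> None:
    payload = json.dumps({"e": errors, "t": transactions})
    client.setex(make_cache_key(org_id, project_id), CACHE_TTL_SEC, payload)


def read_metrics(client, org_id, project_id) -> Optional[dict]:
    raw = client.get(make_cache_key(org_id, project_id))
    return None if raw is None else json.loads(raw)


client = FakeRedis()
write_metrics(client, 1, 42, errors=120, transactions=3400)
print(read_metrics(client, 1, 42))  # {'e': 120, 't': 3400}

client.now += CACHE_TTL_SEC  # advance the clock past the TTL
print(read_metrics(client, 1, 42))  # None
```

Because the key carries no timestamp, a later write for the same project overwrites the earlier entry rather than adding a second one.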

linear-code Bot commented Jun 2, 2026

ID-1589

github-actions Bot added the "Scope: Backend" label Jun 2, 2026
github-actions Bot added the "Scope: Frontend" label Jun 2, 2026

github-actions Bot commented Jun 2, 2026

Contributor

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent, they must be separated into two pull requests and made forward- or backward-compatible, so that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.

Comment thread src/sentry/api/endpoints/organization_weekly_report_metrics.py Outdated

Contributor Author

only frontend change -- auto-added by santry bot

amy-chen23 closed this Jun 8, 2026
amy-chen23 reopened this Jun 8, 2026
amy-chen23 changed the title from "feat(reports): Cache per-project weekly report metrics and expose via API" to "feat(reports): Cache per-project weekly report metrics" Jun 8, 2026
amy-chen23 requested a review from a team June 8, 2026 21:37
amy-chen23 removed the "Scope: Frontend" label Jun 8, 2026
@JoshFerge

Member

just out of curiosity, what's the napkin math on how much cache space this will use?

amy-chen23 commented Jun 8, 2026

Contributor Author

> just out of curiosity, what's the napkin math on how much cache space this will use?

good question! there are around 140-150 bytes per entry. Not sure if the number of projects is public 👀, but since the TTL is 14 days and the report runs weekly, there could be up to 2 * num_of_projects entries at any given time.

so cache space would be roughly 140 * 2 * num_of_projects bytes. It's not trivial; the alternative is storing this in a DB, but that's more overhead/setup

amy-chen23 marked this pull request as ready for review June 8, 2026 21:53
amy-chen23 requested review from a team as code owners June 8, 2026 21:54
Comment thread src/sentry/tasks/summaries/weekly_reports.py Outdated
shashjar commented Jun 8, 2026

Member

we currently have ~3M projects in US, so that works out to <1 gig of cache usage
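The napkin math in this exchange can be written out as a short calculation. The figures (140-150 bytes per entry, up to 2 entries per project, ~3M projects) come from the comments above. The function name is illustrative only.

```python
def estimate_cache_bytes(bytes_per_entry: int, entries_per_project: int, num_projects: int) -> int:
    """Rough upper bound on cache usage, ignoring Redis per-key overhead."""
    return bytes_per_entry * entries_per_project * num_projects


low = estimate_cache_bytes(140, 2, 3_000_000)
high = estimate_cache_bytes(150, 2, 3_000_000)

print(low)         # 840000000
print(high)        # 900000000
print(low / 1e9)   # 0.84
```

Both ends of the range stay under 1 GB, which matches the estimate given in the thread.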

logger.exception(
    "weekly_reports.cache_project_metrics.failed",
    extra={"organization_id": organization_id, "timestamp": timestamp},
)

Contributor Author

we might not need this logger?

Member

yeah think we can remove it

@@ -218,15 +242,17 @@ def prepare_organization_report(
lifecycle.record_halt(WeeklyReportHaltReason.EMPTY_REPORT)
return

# Cache per-project metrics for reuse across report batches
_cache_project_metrics(ctx, organization_id, timestamp)

Member

up to you, but maybe this should just call a top-level function in src/sentry/tasks/summaries/weekly_report_cache.py directly

Contributor Author

agree -- moved this logic into src/sentry/tasks/summaries/weekly_report_cache.py

from sentry.utils import json, metrics, redis
from sentry.utils.dates import floor_to_utc_day, to_datetime

CACHE_TTL = timedelta(days=14)

Member

can we make this 10 days to prevent overlap of cache usage across 3 weeks?

Member

also, you can define this as an int in number of seconds, e.g. 10 * 24 * 60 * 60, so that you don't have to do that conversion below
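To illustrate the suggestion above, the snippet below compares the two ways of expressing a 10-day TTL. The constant names mirror those that appear in this PR. The comparison itself is only a sketch.

```python
from datetime import timedelta

# timedelta form: needs a conversion wherever the cache expects whole seconds.
CACHE_TTL = timedelta(days=10)
ttl_from_timedelta = int(CACHE_TTL.total_seconds())

# Plain int form: usable directly as a TTL in seconds.
CACHE_TTL_SEC = 10 * 24 * 60 * 60  # 10 days

print(ttl_from_timedelta)  # 864000
print(CACHE_TTL_SEC)       # 864000
```

Both forms give the same number of seconds. The int form just removes the conversion step at the call site.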



def _make_cache_key(org_id: int, project_id: int, timestamp: float) -> str:
    return f"{KEY_PREFIX}:{org_id}:{project_id}:{timestamp}"

Member

including a timestamp in the cache key doesn't seem necessary, conceptually we should just be caching the latest value we computed for the org ID/project ID

@@ -218,15 +242,17 @@ def prepare_organization_report(
lifecycle.record_halt(WeeklyReportHaltReason.EMPTY_REPORT)
return

# Cache per-project metrics for reuse across report batches

Member

can we put this behind an org feature flag to begin with?

cursor Bot left a comment

Contributor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 59edfc4.

project_metrics[project_id] = {
    "e": project_ctx.accepted_error_count,
    "t": project_ctx.accepted_transaction_count,
}

Contributor

Stale metrics after empty week

Medium Severity

cache_project_metrics skips Redis writes when check_if_project_is_empty() is true, so a zero-activity week leaves the prior week’s totals in cache. Week-over-week reads then treat old counts as the previous period instead of zero, and a cache hit avoids the intended Snuba fallback on miss.


Contributor Author

10 day TTL avoids this. Since the report runs weekly, any stale entry expires before the next read
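The timing argument in this reply can be checked with a small calculation. The sketch assumes report runs exactly 7 days apart and an entry that disappears once its TTL has fully elapsed. The helper names are hypothetical.

```python
DAY_SEC = 24 * 60 * 60
REPORT_INTERVAL_SEC = 7 * DAY_SEC  # assumption: runs are exactly one week apart


def is_live(written_at: int, read_at: int, ttl_sec: int) -> bool:
    """True if an entry written at written_at is still readable at read_at."""
    return read_at < written_at + ttl_sec


def runs_that_see_entry(ttl_sec: int, runs: int = 4) -> list[int]:
    """Later weekly runs (1 = next week, 2 = two weeks on, ...) that can still read an entry written at run 0."""
    return [n for n in range(1, runs + 1) if is_live(0, n * REPORT_INTERVAL_SEC, ttl_sec)]


print(runs_that_see_entry(10 * DAY_SEC))  # [1]
print(runs_that_see_entry(15 * DAY_SEC))  # [1, 2]
```

With a 10-day TTL, an entry is visible only to the following week's run, where it is the correct previous-week value. If that run skips its write, the entry has expired before the run after it, so that read is a miss rather than a stale hit. A TTL longer than two weeks would let the old value survive into the third week.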

Comment thread src/sentry/tasks/summaries/weekly_reports.py Outdated


def cache_project_metrics(ctx: OrganizationReportContext, organization_id: int) -> None:
    project_metrics: dict[int, dict[str, int]] = {}

Contributor Author

TODO: add feature flag

Contributor Author

NVM -- not adding a feature flag here. We'll add the feature flag in the PR that actually uses the cached metrics

Comment thread src/sentry/tasks/summaries/weekly_report_cache.py Outdated
…hat weekly_reports_cache is a pure cache layer
Comment thread src/sentry/tasks/summaries/weekly_reports.py Outdated
amy-chen23 requested a review from shashjar June 10, 2026 16:41

shashjar left a comment

Member

two non-blocking comments, nice job

raw = results[i]
if raw is None:
    metrics.incr("weekly_report.cache.miss")
else:

Member

should we record cache hits as well? another way you could do this would be to have a single metric called weekly_report.cache_read and then a tag on the metric is set to either "miss" or "hit" based on the result, and then DD allows you to break down those timeseries

Member

this can be part of your next PR btw, doesn't have to block this one
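The single-metric approach suggested in this thread might look like the sketch below. The `incr` function here is a local stand-in for a metrics client so the example runs on its own, and the tag name `result` is an assumption. Only the metric name `weekly_report.cache_read` and the "hit"/"miss" values come from the comment.

```python
from collections import Counter
from typing import Optional

recorded: Counter = Counter()


def incr(key: str, tags: Optional[dict] = None) -> None:
    """Stand-in for a metrics client: counts each (key, tags) combination."""
    tag_items = tuple(sorted((tags or {}).items()))
    recorded[(key, tag_items)] += 1


def record_cache_reads(results: list) -> None:
    # One metric name for every read; the tag carries the outcome.
    for raw in results:
        outcome = "miss" if raw is None else "hit"
        incr("weekly_report.cache_read", tags={"result": outcome})


record_cache_reads(['{"e": 1, "t": 2}', None, '{"e": 5, "t": 9}'])
print(recorded[("weekly_report.cache_read", (("result", "hit"),))])   # 2
print(recorded[("weekly_report.cache_read", (("result", "miss"),))])  # 1
```

Keeping hits and misses under one metric name makes it straightforward to compute a hit rate by grouping on the tag.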


from sentry.utils import json, metrics, redis

CACHE_TTL = timedelta(days=10)

Member

can just set this as an int, e.g. 10 * 24 * 60 * 60

CACHE_TTL_SEC = 10 * 24 * 60 * 60  # 10 days

amy-chen23 merged commit b658c3d into master Jun 10, 2026
64 checks passed
amy-chen23 deleted the amyc/cache-weekly-report-metrics branch June 10, 2026 18:31
amy-chen23 added a commit that referenced this pull request Jun 10, 2026
Adds cache hit metric to weekly report cache

More details on caching found in this [merged PR](#116739)
amy-chen23 added a commit that referenced this pull request Jun 10, 2026
Resolves
[ID-1589](https://linear.app/getsentry/issue/ID-1589/cache-total-project-errors-and-total-project-transactions-for-weekly).

Goal: Cache per-project total errors and total transactions during
weekly report generation so the frontend can display week-over-week
percentage change without re-querying Snuba.

**Redis Cache Layer**
- Caches total errors and total transactions by Org Id, Project Id (no
timestamp)
- 10 day TTL

---------

Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
Labels

Scope: Backend Automatically applied to PRs that change backend components

3 participants