fix(ACI): Error when updating non-Anomaly Detection to Anomaly Detection#117947
1 issue
sentry-backend-bugs: Found 1 issue (1 medium)
Medium
Uncaught DetectorException (500) when converting a detector to anomaly detection without an ANOMALY_DETECTION condition in the payload - `src/sentry/incidents/metric_issue_detector.py:44`
In update_anomaly_detection (metric_issue_detector.py), when a detector is converted to DYNAMIC, the new in-memory DataCondition is only built when the payload actually contains an ANOMALY_DETECTION condition. MetricIssueConditionGroupValidator.validate_conditions only requires either an OK-result condition or an ANOMALY_DETECTION condition, so a detection_type=DYNAMIC payload such as [GREATER(HIGH), MEDIUM] or [GREATER(HIGH), OK] passes validation with no ANOMALY_DETECTION condition. In that case data_condition is None, so update_detector_data falls back to _fetch_anomaly_data_condition(detector, 'update'), which reads the detector's previous (still-persisted, e.g. static) conditions. That DataCondition.objects.get(condition_result__in=[HIGH, MEDIUM]) raises DataCondition.DoesNotExist or MultipleObjectsReturned, which is re-raised as DetectorException. DetectorException subclasses Exception (not ValidationError), so neither the except (TimeoutError, MaxRetryError, ParseError, ValidationError) block in update_detector_data nor the one in update_anomaly_detection catches it, and it propagates to the API layer as an unhandled 500. This is the same class of failure the PR set out to fix, but the fix is incomplete for the no-anomaly-condition subcase.
Also found at:
src/sentry/incidents/metric_issue_detector.py:437src/sentry/seer/anomaly_detection/store_data_workflow_engine.py:95-98
⏱ 5m 38s · 593.2k in / 49.1k out · $1.78
Annotations
Check warning on line 44 in src/sentry/incidents/metric_issue_detector.py
sentry-warden / warden: sentry-backend-bugs
Uncaught DetectorException (500) when converting a detector to anomaly detection without an ANOMALY_DETECTION condition in the payload
In `update_anomaly_detection` (metric_issue_detector.py), when a detector is converted to `DYNAMIC`, the new in-memory `DataCondition` is only built when the payload actually contains an `ANOMALY_DETECTION` condition. `MetricIssueConditionGroupValidator.validate_conditions` only requires *either* an OK-result condition *or* an ANOMALY_DETECTION condition, so a `detection_type=DYNAMIC` payload such as `[GREATER(HIGH), MEDIUM]` or `[GREATER(HIGH), OK]` passes validation with no ANOMALY_DETECTION condition. In that case `data_condition` is `None`, so `update_detector_data` falls back to `_fetch_anomaly_data_condition(detector, 'update')`, which reads the detector's *previous* (still-persisted, e.g. static) conditions. That `DataCondition.objects.get(condition_result__in=[HIGH, MEDIUM])` raises `DataCondition.DoesNotExist` or `MultipleObjectsReturned`, which is re-raised as `DetectorException`. `DetectorException` subclasses `Exception` (not `ValidationError`), so neither the `except (TimeoutError, MaxRetryError, ParseError, ValidationError)` block in `update_detector_data` nor the one in `update_anomaly_detection` catches it, and it propagates to the API layer as an unhandled 500. This is the same class of failure the PR set out to fix, but the fix is incomplete for the no-anomaly-condition subcase.
Check warning on line 437 in src/sentry/incidents/metric_issue_detector.py
sentry-warden / warden: sentry-backend-bugs
[ZMC-F3B] Uncaught DetectorException (500) when converting a detector to anomaly detection without an ANOMALY_DETECTION condition in the payload (additional location)
In `update_anomaly_detection` (metric_issue_detector.py), when a detector is converted to `DYNAMIC`, the new in-memory `DataCondition` is only built when the payload actually contains an `ANOMALY_DETECTION` condition. `MetricIssueConditionGroupValidator.validate_conditions` only requires *either* an OK-result condition *or* an ANOMALY_DETECTION condition, so a `detection_type=DYNAMIC` payload such as `[GREATER(HIGH), MEDIUM]` or `[GREATER(HIGH), OK]` passes validation with no ANOMALY_DETECTION condition. In that case `data_condition` is `None`, so `update_detector_data` falls back to `_fetch_anomaly_data_condition(detector, 'update')`, which reads the detector's *previous* (still-persisted, e.g. static) conditions. That `DataCondition.objects.get(condition_result__in=[HIGH, MEDIUM])` raises `DataCondition.DoesNotExist` or `MultipleObjectsReturned`, which is re-raised as `DetectorException`. `DetectorException` subclasses `Exception` (not `ValidationError`), so neither the `except (TimeoutError, MaxRetryError, ParseError, ValidationError)` block in `update_detector_data` nor the one in `update_anomaly_detection` catches it, and it propagates to the API layer as an unhandled 500. This is the same class of failure the PR set out to fix, but the fix is incomplete for the no-anomaly-condition subcase.
Check warning on line 98 in src/sentry/seer/anomaly_detection/store_data_workflow_engine.py
sentry-warden / warden: sentry-backend-bugs
[ZMC-F3B] Uncaught DetectorException (500) when converting a detector to anomaly detection without an ANOMALY_DETECTION condition in the payload (additional location)
In `update_anomaly_detection` (metric_issue_detector.py), when a detector is converted to `DYNAMIC`, the new in-memory `DataCondition` is only built when the payload actually contains an `ANOMALY_DETECTION` condition. `MetricIssueConditionGroupValidator.validate_conditions` only requires *either* an OK-result condition *or* an ANOMALY_DETECTION condition, so a `detection_type=DYNAMIC` payload such as `[GREATER(HIGH), MEDIUM]` or `[GREATER(HIGH), OK]` passes validation with no ANOMALY_DETECTION condition. In that case `data_condition` is `None`, so `update_detector_data` falls back to `_fetch_anomaly_data_condition(detector, 'update')`, which reads the detector's *previous* (still-persisted, e.g. static) conditions. That `DataCondition.objects.get(condition_result__in=[HIGH, MEDIUM])` raises `DataCondition.DoesNotExist` or `MultipleObjectsReturned`, which is re-raised as `DetectorException`. `DetectorException` subclasses `Exception` (not `ValidationError`), so neither the `except (TimeoutError, MaxRetryError, ParseError, ValidationError)` block in `update_detector_data` nor the one in `update_anomaly_detection` catches it, and it propagates to the API layer as an unhandled 500. This is the same class of failure the PR set out to fix, but the fix is incomplete for the no-anomaly-condition subcase.