Fix the feedbacks of Titus
baturayo committed Jan 18, 2024
1 parent 26fc1bb commit 692cfa5
Showing 2 changed files with 5 additions and 2 deletions.
7 changes: 5 additions & 2 deletions soda-cl/anomaly-detection.md
@@ -225,6 +225,9 @@ The `window_length` parameter sets the number of historical measurements that So

When Soda collects measurements more often than the automatically-detected or specified frequency, the `aggregation_function` parameter defines how Soda aggregates the data within each window. For example, if your frequency is hourly, your aggregation function is `last`, and Soda collected two measurements for the same hour, Soda uses the most recent measurement for that hour to gauge anomalies.

+ See the example below for a demonstration of how Soda aggregates the training data using the configurations.
+
+ ![underfitting-coverage](/assets/images/ad-training-dataset-aggregation.png){:height="700px" width="700px"}
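
The `last` aggregation described above can be sketched in plain Python. This is only an illustration under stated assumptions, not Soda's implementation: the helper name `aggregate_last_per_hour` and the sample measurements are hypothetical, and the frequency is assumed to be hourly.

```python
from datetime import datetime


def aggregate_last_per_hour(measurements):
    """Keep only the most recent measurement in each hourly window.

    `measurements` is a list of (timestamp, value) pairs.
    Hypothetical helper for illustration; not part of Soda's API.
    """
    latest = {}
    # Process in chronological order so later measurements overwrite earlier ones.
    for timestamp, value in sorted(measurements):
        window = timestamp.replace(minute=0, second=0, microsecond=0)
        latest[window] = value
    return sorted(latest.items())


# Two measurements fall within the 09:00 window; only the later one is kept.
measurements = [
    (datetime(2024, 1, 18, 9, 5), 100.0),
    (datetime(2024, 1, 18, 9, 50), 120.0),
    (datetime(2024, 1, 18, 10, 15), 110.0),
]
aggregated = aggregate_last_per_hour(measurements)
```

Here `aggregated` holds one value per hour, so the model would train on 120.0 for the 09:00 window rather than on both measurements.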

## Add optional model configurations

@@ -257,7 +260,7 @@ checks for dim_customer:

For each of these values, Soda has adjusted the values of a few of the model's hyperparameters to tailor its sensitivity to anomalies, particularly the `changepoint_prior_scale` and `seasonality_prior_scale` hyperparameters.

- `coverage` refers to the concept of <a href="" target="_blank">coverage</a> and represents the percentage of actual measurements within the model's predicted confidence intervals. For example, if a model forecasts sales between 10-20 units on certain days and 90 out of 100 actual sales figures fall within this range, the coverage is 90%. When predictions fall outside these intervals, as in the remaining 10 cases, the model flags them as anomalies. This coverage-optimized profile is more tolerant of minor noise in the data that can lead to falsely-identified anomalies. However, it might underfit the data if there is a fluctuating pattern.
+ `coverage` refers to the concept of <a href="" target="_blank">coverage</a> and represents the percentage of actual measurements that fall within the model's predicted confidence intervals. For example, if a model forecasts sales between 10-20 units on certain days and 90 out of 100 actual sales figures fall within this range, the coverage is 90%. This coverage-optimized profile is more tolerant of minor noise in the data that can lead to falsely-identified anomalies, because it uses wider confidence intervals to cover as many measurements as possible. However, the model might underfit the data if there is a fluctuating pattern.
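
The coverage figure in the sales example can be reproduced with a few lines of Python. This is a minimal sketch: the function name `coverage_rate` and the sample numbers are assumptions made for illustration and do not come from Soda's code.

```python
def coverage_rate(actuals, lower_bounds, upper_bounds):
    """Return the fraction of actual values inside their predicted interval.

    Hypothetical helper for illustration; not part of Soda's API.
    """
    inside = sum(
        1
        for actual, lower, upper in zip(actuals, lower_bounds, upper_bounds)
        if lower <= actual <= upper
    )
    return inside / len(actuals)


# 100 days with a forecast interval of 10-20 units each day (made-up data):
# 90 days of sales land inside the interval, 10 days land outside it.
actuals = [15] * 90 + [25] * 10
lower_bounds = [10] * 100
upper_bounds = [20] * 100

rate = coverage_rate(actuals, lower_bounds, upper_bounds)
print(f"coverage: {rate:.0%}")  # prints "coverage: 90%"
```

Widening the intervals raises this rate, which is the trade-off the `coverage` profile makes.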

For reference, the following lists the hyperparameters that Soda has set for the `coverage` profile.
```python
@@ -381,7 +384,7 @@ The `parameter_grid` hyperparameter is a dictionary that lists hyperparameters a

The execution time for dynamic hyperparameter tuning varies based on several factors including the number of hyperparameters and the number of folds. For example, the default hyperparameter grid has 16 combinations since `changepoint_prior_scale` and `seasonality_prior_scale` have four values each. Consider using a small number of hyperparameters to avoid long execution times. By default, the model processes each fold in parallel. If you use multiple cores, you can set the `parallel` parameter to `True` to speed up the execution time.
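
To see where the 16 combinations come from, the grid can be expanded with `itertools.product`. The sketch below is an assumption-laden illustration: the candidate values and the fold count are placeholders rather than Soda's defaults, and it assumes that every combination is evaluated once per fold, as in ordinary cross-validation.

```python
from itertools import product

# Placeholder values for illustration only; these are NOT Soda's defaults.
parameter_grid = {
    "changepoint_prior_scale": [0.001, 0.01, 0.1, 0.5],
    "seasonality_prior_scale": [0.01, 0.1, 1.0, 10.0],
}

# Expand the grid into one dictionary per hyperparameter combination.
names = list(parameter_grid)
combinations = [
    dict(zip(names, values)) for values in product(*parameter_grid.values())
]

n_folds = 5  # hypothetical number of cross-validation folds
total_fits = len(combinations) * n_folds

print(len(combinations))  # 4 values x 4 values = 16 combinations
print(total_fits)  # 16 combinations x 5 folds = 80 model fits
```

Each extra hyperparameter multiplies the number of combinations, so the grid grows quickly; that is why a small grid keeps execution time manageable.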

- Use the following tables to estimate the execution time for checks with dynamic hypertuning.
+ Use the following tables to estimate the execution time for checks with dynamic hyperparameter tuning.

| Model Name | MacBook Pro |
| Model Identifier | MacBookPro18,3 |