
Commit 20f5595

Merge pull request #217 from cuebook/master

GitBook: [#63] Non Rollup & Lambda docs

2 parents: c63d8c7 + 7a91553

39 files changed (+143, −106 lines)

Binary assets changed (`.gitbook/assets/`):

- AddConnection (1).png (76.1 KB)
- AddConnection.png (37.3 KB)
- Anomalies.png (134 KB)
- AnomalyCard_Daily.png (121 KB)
- AnomalyDeviation.png (86.8 KB)
- Dataset_SQL.png (125 KB)
- MinAvgValue.png (11.6 KB)
- MinContribution.png (11.9 KB)
- Overview.gif (149 KB)
- Overview_Anomaly.png (84.6 KB)
- Overview_RCA (1).png (106 KB)
- Overview_RCA.png (106 KB)
- RCA_Analyze.png (103 KB)
- RCA_Logs.png (39.1 KB)
- RCA_Result.png (110 KB)
- TopN.png (10.7 KB)
- cueObserve.png (18.1 KB)
- new.png (4.1 KB)

(The commit changes further image assets whose filenames were not captured above.)

README.md (+38 −43)

@@ -1,78 +1,73 @@
-<p align="center">
-<a href="https://cueobserve.cuebook.ai" target="_blank">
-<img alt="CueObserve Logo" width="300" src="docs/images/cueObserve.png">
-</a>
-</p>
-<p align="center">
-<a href="https://codeclimate.com/github/cuebook/CueObserve/maintainability"><img src="https://api.codeclimate.com/v1/badges/a70e071b59d5dbc38846/maintainability" /></a>
-<a href="https://codeclimate.com/github/cuebook/CueObserve/test_coverage"><img src="https://api.codeclimate.com/v1/badges/a70e071b59d5dbc38846/test_coverage" /></a>
-<a href="https://github.com/cuebook/cueobserve/actions/workflows/pr_checks.yml">
-<img src="https://github.com/cuebook/cueobserve/actions/workflows/pr_checks.yml/badge.svg" alt="Test Coverage">
-</a>
-<a href="https://github.com/cuebook/cueobserve/blob/main/LICENSE.md">
-<img src="https://img.shields.io/github/license/cuebook/cueobserve" alt="License">
-</a>
-</p>
-<br>
+# Overview
+
+[![CueObserve Logo](.gitbook/assets/cueObserve.png)](https://cueobserve.cuebook.ai)
+
+[![](https://api.codeclimate.com/v1/badges/a70e071b59d5dbc38846/maintainability)](https://codeclimate.com/github/cuebook/CueObserve/maintainability) [![](https://api.codeclimate.com/v1/badges/a70e071b59d5dbc38846/test\_coverage)](https://codeclimate.com/github/cuebook/CueObserve/test\_coverage) [![Test Coverage](https://github.com/cuebook/cueobserve/actions/workflows/pr\_checks.yml/badge.svg)](https://github.com/cuebook/cueobserve/actions/workflows/pr\_checks.yml) [![License](https://img.shields.io/github/license/cuebook/cueobserve)](https://github.com/cuebook/cueobserve/blob/main/LICENSE.md)
 
 CueObserve helps you monitor your metrics. Know when, where, and why a metric isn't right.
 
 CueObserve uses **timeseries Anomaly detection** to find **where** and **when** a metric isn't right. It then offers **one-click Root Cause analysis** so that you know **why** a metric isn't right.
 
 CueObserve works with data in your SQL data warehouses and databases. It currently supports Snowflake, BigQuery, Redshift, Druid, Postgres, MySQL, SQL Server and ClickHouse.
 
+![CueObserve Anomaly](<.gitbook/assets/Overview\_Anomaly (1).png>) ![CueObserve RCA](<.gitbook/assets/Overview\_RCA (1).png>)
 
-![CueObserve Anomaly](docs/images/Overview_Anomaly.png)
-![CueObserve RCA](docs/images/Overview_RCA.png)
+### Getting Started
 
-
-## Getting Started
 Install via Docker
 
 ```
 wget https://raw.githubusercontent.com/cuebook/CueObserve/latest_release/docker-compose.yml -q -O cueobserve-docker-compose.yml
 docker-compose -f cueobserve-docker-compose.yml up -d
 ```
-Now visit [http://localhost:3000](http://localhost:3000) in your browser.
 
-## Demo Video
-<a href="http://www.youtube.com/watch?feature=player_embedded&v=VZvgNa65GQU" target="_blank">
-<img src="http://img.youtube.com/vi/VZvgNa65GQU/hqdefault.jpg" alt="Watch CueObserve video"/>
-</a>
+Now visit [http://localhost:3000](http://localhost:3000) in your browser.
+
+### Demo Video
+
+[![Watch CueObserve video](http://img.youtube.com/vi/VZvgNa65GQU/hqdefault.jpg)](http://www.youtube.com/watch?feature=player\_embedded\&v=VZvgNa65GQU)
+
+### How it works
 
-## How it works
 You write a SQL GROUP BY query, map its columns as dimensions and measures, and save it as a virtual Dataset.
 
-![Dataset SQL](docs/images/Dataset_SQL_cropped.png)
+![Dataset SQL](<.gitbook/assets/Dataset\_SQL\_cropped (1).png>)
 
-![Dataset Schema Map](docs/images/Dataset_Mapping_cropped.png)
+![Dataset Schema Map](<.gitbook/assets/Dataset\_Mapping\_cropped (1).png>)
 
 You then define one or more anomaly detection jobs on the dataset.
 
-![Anomaly Definition](docs/images/AnomalyDefinitions.png)
+![Anomaly Definition](<.gitbook/assets/AnomalyDefinitions (1).png>)
 
 When an anomaly detection job runs, CueObserve does the following:
+
 1. Executes the SQL GROUP BY query on your data warehouse and stores the result as a Pandas dataframe.
 2. Generates one or more timeseries from the dataframe, as defined in your anomaly detection job.
 3. Generates a forecast for each timeseries using [Prophet](https://github.com/facebook/prophet).
 4. Creates a visual card for each timeseries. Marks the card as an anomaly if the last data point is anomalous.
 
-## Features
-- Automated SQL to timeseries transformation.
-- Run anomaly detection on the aggregate metric or split it by any dimension. Limit the split to significant dimension values.
-- Use Prophet or simple mathematical rules to detect anomalies.
-- In-built Scheduler. CueObserve uses Celery as the executor and celery-beat as the scheduler.
-- Slack alerts when anomalies are detected.
-- Monitoring. Slack alert when a job fails. CueObserve maintains detailed logs.
+### Features
+
+* Automated SQL to timeseries transformation.
+* Run anomaly detection on the aggregate metric or split it by any dimension. Limit the split to significant dimension values.
+* Use Prophet or simple mathematical rules to detect anomalies.
+* In-built Scheduler. CueObserve uses Celery as the executor and celery-beat as the scheduler.
+* Slack alerts when anomalies are detected.
+* Monitoring. Slack alert when a job fails. CueObserve maintains detailed logs.
 
-### Limitations
-- Currently supports Prophet for timeseries forecasting.
-- Not being built for real-time anomaly detection on streaming data.
+#### Limitations
 
-## Support
-For general help using CueObserve, read the [documentation](https://cueobserve.cuebook.ai/), or go to [Github Discussions](https://github.com/cuebook/cueobserve/discussions).
+* Currently supports Prophet for timeseries forecasting.
+* Not being built for real-time anomaly detection on streaming data.
+
+### Support
+
+For general help using CueObserve, read the [documentation](https://cueobserve.cuebook.ai), or go to [Github Discussions](https://github.com/cuebook/cueobserve/discussions).
 
 To report a bug or request a feature, open an [issue](https://github.com/cuebook/cueobserve/issues).
 
-## Contributing
-We'd love contributions to CueObserve. Before you contribute, please first discuss the change you wish to make via an [issue](https://github.com/cuebook/cueobserve/issues) or a [discussion](https://github.com/cuebook/cueobserve/discussions). Contributors are expected to adhere to our [code of conduct](https://github.com/cuebook/cueobserve/blob/main/CODE_OF_CONDUCT.md).
+### Contributing
+
+We'd love contributions to CueObserve. Before you contribute, please first discuss the change you wish to make via an [issue](https://github.com/cuebook/cueobserve/issues) or a [discussion](https://github.com/cuebook/cueobserve/discussions). Contributors are expected to adhere to our [code of conduct](https://github.com/cuebook/cueobserve/blob/main/CODE\_OF\_CONDUCT.md).
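The README's four "How it works" steps map to a short pipeline. Below is a minimal Python sketch of that flow, not CueObserve's actual code: the `Date`, `State`, and `Orders` column names are hypothetical, and pandas plus Prophet are used only because the steps name them.

```python
# A minimal sketch of the four detection steps, assuming the GROUP BY
# result is already a dataframe with datetime Date, State, and Orders.
import pandas as pd
from prophet import Prophet  # pip install prophet

def detect_anomaly(df: pd.DataFrame, state: str) -> bool:
    # Step 2: generate one timeseries from the GROUP BY dataframe.
    ts = (
        df.loc[df["State"] == state, ["Date", "Orders"]]
        .rename(columns={"Date": "ds", "Orders": "y"})
        .sort_values("ds")
    )
    # Step 3: forecast the timeseries with Prophet.
    model = Prophet(interval_width=0.95)
    model.fit(ts)
    forecast = model.predict(ts[["ds"]])
    # Step 4: flag the card if the last actual point falls outside
    # the forecast's confidence range.
    last_actual = ts["y"].iloc[-1]
    band = forecast.iloc[-1]
    return not (band["yhat_lower"] <= last_actual <= band["yhat_upper"])
```

Splitting by a dimension amounts to repeating this once per dimension value.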

anomalies.md (+3 −3)

@@ -2,14 +2,14 @@
 
 Anomalies screen lists all published anomalies. Click on a row to view its anomaly card.
 
-![](.gitbook/assets/anomalies.png)
+![](.gitbook/assets/Anomalies.png)
 
 Daily anomalies automatically unpublish if there's no anomaly for the next 5 days. Hourly anomalies unpublish after 1 day.
 
 ## Anomaly Cards
 
 Anomaly cards follow a template. If you want, you can modify the templates.
 
-![Hourly Anomaly card](.gitbook/assets/anomalycard_hourly_cropped.png)
+![Hourly Anomaly card](.gitbook/assets/AnomalyCard\_Hourly\_cropped.png)
 
-![Daily Anomaly card](.gitbook/assets/anomalycard_daily_cropped.png)
+![Daily Anomaly card](.gitbook/assets/AnomalyCard\_Daily\_cropped.png)
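The unpublish windows this page describes (5 days for daily anomalies, 1 day for hourly) reduce to a one-line check. A sketch with hypothetical names, not CueObserve's actual fields:

```python
# Sketch of the auto-unpublish rule above; names are hypothetical.
from datetime import datetime, timedelta

def should_unpublish(last_anomalous_point: datetime, granularity: str) -> bool:
    # Daily anomalies get a 5-day grace window, hourly anomalies 1 day.
    grace = timedelta(days=5) if granularity == "day" else timedelta(days=1)
    return datetime.utcnow() - last_anomalous_point > grace
```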

anomaly-definitions.md (+7 −7)

@@ -2,14 +2,14 @@
 
 You can define one or more anomaly detection jobs on a dataset. The anomaly detection job can monitor a measure at an aggregate level or split the measure by a dimension.
 
-To define an anomaly job, you
+To define an anomaly job, you&#x20;
 
 1. Select a dataset
 2. Select a measure from the dataset
 3. Select a dimension to split the measure _(optional)_
 4. Select an anomaly rule
 
-![](.gitbook/assets/anomalydefinitions.png)
+![](.gitbook/assets/AnomalyDefinitions.png)
 
 ## Split Measure by Dimension
 

@@ -19,7 +19,7 @@ To split a measure by a dimension, select the dimension and then limit the numbe
 
 Choose the optional **High/Low** to detect only one type of anomalies. Choose **High** for an increase in measure or **Low** for a drop in measure.
 
-![](.gitbook/assets/anomalydefinition_cuel.gif)
+![](.gitbook/assets/AnomalyDefinition\_CueL.gif)
 
 ### Limit Dimension Values
 

@@ -31,21 +31,21 @@
 
 Say you want to monitor Orders measure. But you want to monitor it for your top 10 states only. You would then define anomaly something like below:
 
-![](.gitbook/assets/topn.png)
+![](.gitbook/assets/TopN.png)
 
 #### Min % Contribution
 
 Minimum % Contribution limits the number of dimension values based on the dimension value's contribution to the measure.
 
 Say you want to monitor Orders measure for every state that contributed at least 2% to the total Orders, your anomaly definition would look something like below:
 
-![](.gitbook/assets/mincontribution.png)
+![](.gitbook/assets/MinContribution.png)
 
 #### Min Avg Value
 
 Minimum Average Value limits the number of dimension values based on the measure's average value.
 
-![](.gitbook/assets/minavgvalue.png)
+![](.gitbook/assets/MinAvgValue.png)
 
 In the example above, only states where _average(Orders) >= 10_ will be selected. If your granularity is daily, this means daily average orders. If your granularity is hourly, this means hourly average orders.
 

@@ -64,7 +64,7 @@ This algorithm uses the open-source [Prophet](https://github.com/facebook/prophe
 
 The metric's percentage deviation (_45% in the image below_) is calculated with respect to the threshold of the forecast's confidence range.
 
-![](.gitbook/assets/anomalydeviation.png)
+![](.gitbook/assets/AnomalyDeviation.png)
 
 ### Percentage Change
 
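The three "Limit Dimension Values" options this page documents are all filters over the grouped measure. A pandas sketch using the doc's Orders-by-State example; the column names are hypothetical and this is not CueObserve's implementation:

```python
# Hedged sketch of Top N, Min % Contribution, and Min Avg Value.
import pandas as pd

def select_dimension_values(df: pd.DataFrame, top_n=None,
                            min_pct=None, min_avg=None) -> list:
    totals = df.groupby("State")["Orders"].sum()
    if top_n is not None:
        # Top N: keep the N values contributing most to the measure.
        return list(totals.nlargest(top_n).index)
    if min_pct is not None:
        # Min % Contribution: keep values with >= min_pct of the total.
        share = totals / totals.sum() * 100
        return list(share[share >= min_pct].index)
    if min_avg is not None:
        # Min Avg Value: keep values whose average per time bucket
        # (daily or hourly, per the dataset granularity) is >= min_avg.
        averages = df.groupby("State")["Orders"].mean()
        return list(averages[averages >= min_avg].index)
    return list(totals.index)
```

For the page's last example, `select_dimension_values(df, min_avg=10)` would keep only states where average(Orders) >= 10.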

datasets.md (+13 −2)

@@ -2,15 +2,26 @@
 
 Datasets are similar to aggregated SQL VIEWS of your data. When you run an anomaly detection job, the associated dataset's SQL query is run and the results are stored as a Pandas dataframe in memory.
 
-![](.gitbook/assets/dataset_sql.png)
+![](.gitbook/assets/Dataset\_SQL.png)
 
 You write a SQL GROUP BY query with aggregate functions to roll-up your data. You then map the columns as dimensions or measures.
 
-![](.gitbook/assets/dataset_mapping_cropped.png)
+![](.gitbook/assets/Dataset\_Mapping\_cropped.png)
 
 1. Dataset must have only one timestamp column. This timestamp column is used to generate timeseries data for anomaly detection.
 2. Dataset must have at least one aggregate column. CueObserve currently supports only COUNT or SUM as aggregate functions. Aggregate columns must be mapped as measures.
 3. Dataset can have one or more dimension columns (optional).
+4. Dataset can be classified as a non-rollup dataset, details are provided below.
+
+### **Non-Rollup Datasets**
+
+A dataset can be created as a non-rollup dataset using a switch to inform the system that it does not need to roll up aggregate the data during the pre-processing of the data.
+
+![Non Roll-up switch](.gitbook/assets/new.png)
+
+By default, all datasets are "rolled up" i.e. metric data points are aggregated(summed up) on the timestamp buckets for a specific dimension value.
+
+But for metrics like percentage etc. such aggregation might not be relevant, so one can specify to the system that it is a non-rollup dataset. Currently we support only single dimension on Non-rollup datasets to avoid duplicate timestamp values after pre-processing.
 
 ## SQL GROUP BY Query
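The rollup behaviour this new section describes, and what the non-rollup switch turns off, can be sketched as one pre-processing step. The column names are hypothetical and the logic is inferred from the prose above, not taken from CueObserve's source:

```python
# Sketch of the rollup vs. non-rollup pre-processing difference.
import pandas as pd

def preprocess(df: pd.DataFrame, non_rollup: bool) -> pd.DataFrame:
    if non_rollup:
        # Non-rollup: keep values (e.g. percentages) as-is. With a single
        # dimension, (Date, State) pairs stay unique, so no duplicate
        # timestamps appear in the generated timeseries.
        return df
    # Default rollup: sum the measure per timestamp bucket
    # for each dimension value.
    return df.groupby(["Date", "State"], as_index=False)["Orders"].sum()
```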

development.md (+8 −8)

@@ -12,7 +12,7 @@ description: >-
 CueObserve has multi-service architecture, with services as mentioned:
 
 1. `Frontend` single-page application written on [ReactJS](https://reactjs.org). It's code can be found in `ui` folder and runs on [http://localhost:3000/](https://reactjs.org).
-2. `API` is based on [Django](https://www.djangoproject.com) (python framework) & uses REST API. It is the main service, responsible for connections, authentication and anomaly.
+2. `API` is based on [Django](https://www.djangoproject.com) (python framework) & uses REST API. It is the main service, responsible for connections, authentication and anomaly.&#x20;
 3. `Alerts` micro-service, currently responsible for sending alerting/notifications only to slack. It's code is in `alerts-api` folder and runs on [localhost:8100](http://localhost:8100).
 4. [Celery](https://docs.celeryproject.org) to execute the tasks asynchronously. Tasks like anomaly detection are handled by Celery.
 5. [Celery beat](https://docs.celeryproject.org/en/stable/userguide/periodic-tasks.html) scheduler to trigger the scheduled tasks.

@@ -25,14 +25,14 @@ Get the code by cloning our open source [github repo](https://github.com/cuebook
 ```
 git clone https://github.com/cuebook/CueObserve.git
 cd CueObserve
-docker-compose -f docker-compose-dev.yml --env-file .env up --build
+docker-compose -f docker-compose-dev.yml --env-file .env.dev up --build
 ```
 
 `docker-compose`'s build command will pull several components and install them on local, so this will take a few minutes to complete.
 
 ### Backend Development
 
-The code for the backend is in `/api` directory. As mentioned in the overview it is based on Django framework.
+The code for the backend is in `/api` directory. As mentioned in the overview it is based on Django framework.&#x20;
 
 #### Configure environment variables
 

@@ -57,17 +57,17 @@ export DJANGO_SUPERUSER_EMAIL="[email protected]"
 export `=False
 ```
 
-Change the values based on your running PostgreSQL instance. If you do not wish to use PostgreSQL as your database for development, comment lines 4-8 and CueObserve will create a SQLite database file at the location `api/db/db.sqlite3`.
+Change the values based on your running PostgreSQL instance. If you do not wish to use PostgreSQL as your database for development, comment lines 4-8 and CueObserve will create a SQLite database file at the location `api/db/db.sqlite3`.&#x20;
 
-The backend server can be accessed on [http://localhost:8000/](https://www.djangoproject.com).
+The backend server can be accessed on [http://localhost:8000/](https://www.djangoproject.com).&#x20;
 
-#### Celery Development
+#### Celery Development&#x20;
 
-CueObserve uses Celery for executing asynchronous tasks like anomaly detection. There are three components needed to run an asynchronous task, i.e. Redis, Celery and Celery Beat. Redis is used as the message queue by Celery, so before starting Celery services, Redis server should be running. Celery Beat is used as the scheduler and is responsible to trigger the scheduled tasks. Celery workers are used to execute the tasks.
+CueObserve uses Celery for executing asynchronous tasks like anomaly detection. There are three components needed to run an asynchronous task, i.e. Redis, Celery and Celery Beat. Redis is used as the message queue by Celery, so before starting Celery services, Redis server should be running. Celery Beat is used as the scheduler and is responsible to trigger the scheduled tasks. Celery workers are used to execute the tasks.&#x20;
 
 ### Testing
 
-At the moment, we have test cases only for the backend service, test cases for UI are in our roadmap.
+At the moment, we have test cases only for the backend service, test cases for UI are in our roadmap.&#x20;
 
 Backend for API and services is tested using [PyTest](https://docs.pytest.org/en/6.2.x/). To run test cases `exec` into cueo-backend and run command
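The Redis/Celery/celery-beat split that development.md describes is the standard Celery wiring. A minimal sketch with hypothetical app, task, and schedule names, not CueObserve's actual modules:

```python
# Minimal Celery wiring matching the description above: Redis as the
# message broker, workers execute tasks, celery-beat triggers schedules.
from celery import Celery
from celery.schedules import crontab

app = Celery("cueobserve", broker="redis://localhost:6379/0")

@app.task(name="tasks.run_anomaly_detection")
def run_anomaly_detection(anomaly_definition_id: int):
    """Execute the dataset SQL, build timeseries, forecast, store cards."""
    ...

# celery-beat schedule: trigger the job at the top of every hour.
app.conf.beat_schedule = {
    "hourly-anomaly-job": {
        "task": "tasks.run_anomaly_detection",
        "schedule": crontab(minute=0),
        "args": (1,),
    },
}
```

With a Redis server running, `celery -A <module> worker` starts the executors and `celery -A <module> beat` starts the scheduler.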

getting-started.md (+8 −20)

@@ -3,35 +3,23 @@
 ## Install via Docker-Compose
 
 ```
-wget https://raw.githubusercontent.com/cuebook/CueObserve/latest_release/docker-compose.yml -q -O cueobserve-docker-compose.yml
-docker-compose -f cueobserve-docker-compose.yml up -d
+mkdir -p ~/cuebook
+wget https://raw.githubusercontent.com/cuebook/CueObserve/latest_release/docker-compose-prod.yml -q -O ~/cuebook/docker-compose-prod.yml
+wget https://raw.githubusercontent.com/cuebook/CueObserve/latest_release/.env -q -O ~/cuebook/.env
+cd ~/cuebook
 ```
 
-**Development Mode:**
-
-```
-docker-compose -f docker-compose-dev.yml up -d
-```
-
-**OR Production Mode:**
-
-```
-docker-compose up -d
-```
-
-**OR** Install via Docker **(Deprecated Method)**
-
 ```
-docker run -p 3000:3000 cuebook/cueobserve
+docker-compose -f docker-compose-prod.yml --env-file .env up -d
 ```
 
-Now visit [localhost:3000](http://localhost:3000) in your browser.
+Now visit [localhost:3000](http://localhost:3000) in your browser.&#x20;
 
 ## Add Connection
 
 Go to the Connections screen to create a connection.
 
-![](<.gitbook/assets/addconnection (1).png>)
+![](<.gitbook/assets/AddConnection (1).png>)
 
 ## Add Dataset
 

@@ -43,6 +31,6 @@ Create an anomaly detection job on your dataset. See [Anomaly Definitions](anoma
 
 Once you have created an anomaly job, click on the \`Run\` icon button to trigger the anomaly job. It might take a few seconds for the job to execute.
 
-![](.gitbook/assets/anomalydefinitions.png)
+![](.gitbook/assets/AnomalyDefinitions.png)
 
 Once the job is successful, go to the Anomalies screen to view your anomalies.
