Merge pull request #214 from cuebook/master
Merging updated gitbook documentation
vincue authored Dec 23, 2021
2 parents eb995e4 + ba28e1e commit a305178
Showing 11 changed files with 81 additions and 168 deletions.
1 change: 0 additions & 1 deletion anomalies.md
@@ -13,4 +13,3 @@ Anomaly cards follow a template. If you want, you can modify the templates.
![Hourly Anomaly card](.gitbook/assets/anomalycard_hourly_cropped.png)

![Daily Anomaly card](.gitbook/assets/anomalycard_daily_cropped.png)

14 changes: 6 additions & 8 deletions anomaly-definitions.md
@@ -6,14 +6,14 @@ To define an anomaly job, you

1. Select a dataset
2. Select a measure from the dataset
3. Select a dimension to split the measure _\(optional\)_
3. Select a dimension to split the measure _(optional)_
4. Select an anomaly rule

![](.gitbook/assets/anomalydefinitions.png)

## Split Measure by Dimension

`Measure` \[`Dimension` `Limit` \] \[`High/Low`\]
`Measure` \[`Dimension` `Limit` ] \[`High/Low`]

To split a measure by a dimension, select the dimension and then limit the number of unique dimension values you want to split into.
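
For illustration only, the split could be sketched in Python roughly as below. This is a hedged example, not CueObserve's implementation; the DataFrame, column names and variable names are assumed. With `Limit 2` and `High`, the two states with the highest total Orders would each get their own timeseries.

```python
# Hedged sketch of "split measure by dimension" (illustrative, not CueObserve's code).
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-12-01"] * 3 + ["2021-12-02"] * 3),
    "state": ["CA", "NY", "TX"] * 2,      # dimension
    "Orders": [30, 20, 5, 25, 22, 7],     # measure
})

limit, direction = 2, "High"              # Limit and High/Low from the definition
totals = df.groupby("state")["Orders"].sum()
top_values = totals.nlargest(limit) if direction == "High" else totals.nsmallest(limit)

# One timeseries per selected dimension value.
split = {
    state: df[df["state"] == state].set_index("timestamp")["Orders"]
    for state in top_values.index
}
print(list(split))  # ['CA', 'NY']
```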

@@ -47,7 +47,7 @@ Minimum Average Value limits the number of dimension values based on the measure

![](.gitbook/assets/minavgvalue.png)

In the example above, only states where _average\(Orders\) >= 10_ will be selected. If your granularity is daily, this means daily average orders. If your granularity is hourly, this means hourly average orders.
In the example above, only states where _average(Orders) >= 10_ will be selected. If your granularity is daily, this means daily average orders. If your granularity is hourly, this means hourly average orders.
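
A minimal sketch of that filter in pandas, assuming the rolled-up data is available as a DataFrame (the column names and threshold are illustrative, not CueObserve's actual code):

```python
# Hedged sketch of the "Minimum Average Value" filter (illustrative only).
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-12-01", "2021-12-01", "2021-12-02", "2021-12-02"]),
    "state": ["CA", "NY", "CA", "NY"],
    "Orders": [12, 4, 18, 6],
})

min_avg_value = 10  # Minimum Average Value from the anomaly definition

# Average the measure per dimension value over the timeseries ...
avg_per_state = df.groupby("state")["Orders"].mean()
# ... and keep only the dimension values that clear the threshold.
selected = avg_per_state[avg_per_state >= min_avg_value].index
filtered = df[df["state"].isin(selected)]
print(filtered)  # only CA rows remain (average 15 >= 10)
```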

## Anomaly Detection Algorithms

@@ -60,9 +60,9 @@ CueObserve offers the following algorithms for anomaly detection.

### Prophet

This algorithm uses the open-source [Prophet](https://github.com/facebook/prophet) procedure to generate a forecast for the timeseries. It then compares the actual value with the forecasted value. If the actual value is outside the forecast's confidence range \(_grey band in the image below_\), it marks the actual value as an anomalous data point.
This algorithm uses the open-source [Prophet](https://github.com/facebook/prophet) procedure to generate a forecast for the timeseries. It then compares the actual value with the forecasted value. If the actual value is outside the forecast's confidence range (_grey band in the image below_), it marks the actual value as an anomalous data point.

The metric's percentage deviation \(_45% in the image below_\) is calculated with respect to the threshold of the forecast's confidence range.
The metric's percentage deviation (_45% in the image below_) is calculated with respect to the threshold of the forecast's confidence range.

![](.gitbook/assets/anomalydeviation.png)
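
A rough sketch of this check using the Prophet package is shown below. It is an assumption-laden illustration: the `ds`/`y` column names follow Prophet's convention, `interval_width` stands in for the confidence band, and the deviation formula is assumed rather than taken from CueObserve's source.

```python
# Hedged sketch of a Prophet-based anomaly check (not CueObserve's actual code).
import pandas as pd
from prophet import Prophet

df = pd.DataFrame({
    "ds": pd.date_range("2021-11-01", periods=60, freq="D"),  # timestamps
    "y": range(60),                                           # measure values
})

model = Prophet(interval_width=0.95)    # width of the grey confidence band
model.fit(df)
forecast = model.predict(df[["ds"]])    # yhat, yhat_lower, yhat_upper per point

merged = df.merge(forecast[["ds", "yhat_lower", "yhat_upper"]], on="ds")
latest = merged.iloc[-1]

# A value outside the band is treated as anomalous; the percentage deviation is
# measured against the band edge it crossed (formula assumed for illustration).
if latest["y"] > latest["yhat_upper"]:
    deviation = 100 * (latest["y"] - latest["yhat_upper"]) / latest["yhat_upper"]
elif latest["y"] < latest["yhat_lower"]:
    deviation = 100 * (latest["yhat_lower"] - latest["y"]) / latest["yhat_lower"]
else:
    deviation = 0.0
print(f"deviation vs. confidence band: {deviation:.0f}%")
```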

@@ -84,7 +84,5 @@ _Anomaly when Value greater than `X`_

_Anomaly when Value not between `X` and `Y`_

\_\_


__

1 change: 0 additions & 1 deletion anomaly-detection.md
@@ -29,4 +29,3 @@ Next CueObserve combines the actual data with the forecasted data from Prophet a
CueObserve saves the actual data with the bands and the forecast in its database. If the latest anomalous data point is not older than a certain time threshold, CueObserve publishes it as an anomaly and saves the dimension value and its contribution. The aforementioned time threshold depends on the granularity. It is 5 days if the granularity is daily and 1 day if the granularity is hourly.
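
A hedged sketch of that recency check (the function name and the use of UTC are assumptions made for illustration):

```python
# Illustrative sketch: publish an anomaly only if its latest point is recent enough.
from datetime import datetime, timedelta

def is_publishable(latest_anomalous_ts: datetime, granularity: str) -> bool:
    threshold = timedelta(days=5) if granularity == "daily" else timedelta(days=1)
    return datetime.utcnow() - latest_anomalous_ts <= threshold

print(is_publishable(datetime.utcnow() - timedelta(days=2), "daily"))   # True
print(is_publishable(datetime.utcnow() - timedelta(days=2), "hourly"))  # False
```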

Finally, CueObserve stores all the individual results of the process along with the metadata in a format for easy visual representation in the UI.

3 changes: 1 addition & 2 deletions datasets.md
@@ -10,7 +10,7 @@ You write a SQL GROUP BY query with aggregate functions to roll-up your data. Yo

1. Dataset must have only one timestamp column. This timestamp column is used to generate timeseries data for anomaly detection.
2. Dataset must have at least one aggregate column. CueObserve currently supports only COUNT or SUM as aggregate functions. Aggregate columns must be mapped as measures.
3. Dataset can have one or more dimension columns \(optional\).
3. Dataset can have one or more dimension columns (optional).

## SQL GROUP BY Query

@@ -30,4 +30,3 @@ ORDER BY 1
```

Since the last time bucket might be partial, CueObserve ignores the last time bucket when generating timeseries.
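
For illustration, the roll-up and the partial-bucket handling might look like this in pandas (a sketch with assumed column names, not the actual implementation):

```python
# Hedged sketch: roll raw rows up into a daily timeseries and drop the last,
# possibly partial, time bucket.
import pandas as pd

raw = pd.DataFrame({
    "CreatedAt": pd.to_datetime(
        ["2021-12-01 03:00", "2021-12-01 18:00", "2021-12-02 09:00", "2021-12-03 01:00"]
    ),
    "Orders": [1, 1, 1, 1],
})

timeseries = (
    raw.set_index("CreatedAt")["Orders"]
       .resample("D").sum()   # GROUP BY day, SUM(Orders)
       .iloc[:-1]             # ignore the last bucket: it may be partial
)
print(timeseries)
```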

140 changes: 18 additions & 122 deletions development.md
@@ -9,64 +9,34 @@ description: >-

### Overview

CueObserve has 5 basic components:
CueObserve has a multi-service architecture, with the following services:

1. Frontend single-page application written on [ReactJS](https://reactjs.org/).
2. Backend based on [Django](https://www.djangoproject.com/) \(python framework\), which is responsible for the communication with the frontend application via REST APIs.
3. [Celery](https://docs.celeryproject.org/) to execute the tasks asynchronously. Tasks like anomaly detection are handled by Celery.
4. [Celery beat](https://docs.celeryproject.org/en/stable/userguide/periodic-tasks.html) scheduler to trigger the scheduled tasks.
5. [Redis](https://redis.io/documentation) to handle the task queue of Celery.
1. `Frontend` is a single-page application written in [ReactJS](https://reactjs.org). Its code lives in the `ui` folder and runs on [http://localhost:3000/](http://localhost:3000/).
2. `API` is based on [Django](https://www.djangoproject.com) (a Python framework) and exposes REST APIs. It is the main service, responsible for connections, authentication and anomalies.
3. `Alerts` is a microservice, currently responsible for sending alerts/notifications to Slack only. Its code is in the `alerts-api` folder and runs on [localhost:8100](http://localhost:8100).
4. [Celery](https://docs.celeryproject.org) to execute the tasks asynchronously. Tasks like anomaly detection are handled by Celery.
5. [Celery beat](https://docs.celeryproject.org/en/stable/userguide/periodic-tasks.html) scheduler to trigger the scheduled tasks.
6. [Redis](https://redis.io/documentation) to handle the task queue of Celery.

### Getting code
### Getting code & starting development servers

Get the code by cloning our open source [github repo](https://github.com/cuebook/cueobserve)

```text
```
git clone https://github.com/cuebook/CueObserve.git
cd CueObserve
docker-compose -f docker-compose-dev.yml --env-file .env up --build
```

### Frontend Development

The code for frontend is in `/ui` directory. CueObserve uses `npm` as the package manager.

**Prerequisites:**

1. Node >= 12
2. npm >= 6

```bash
cd ui
npm install # install dependencies
npm start # start development server
```

This starts the frontend server on [http://localhost:3000/](https://reactjs.org/)
The `docker-compose` build command pulls several components and installs them locally, so it will take a few minutes to complete.

### Backend Development

The code for the backend is in `/api` directory. As mentioned in the overview it is based on Django framework.

**Prerequisite:**

1. Python 3.7
2. PostgreSQL Server running locally or on server \(Optional\)

#### Setup Virtual Environment & Install Dependencies

Setting up a virtual environment is necessary to have your python libraries for this project stored separately so that there is no conflict with other projects.

```bash
cd api
python3 -m virtualenv myenv # Create Python3 virtual environment
source myenv/bin/activate # Activate virtual environment

pip install -r requirements.txt # Install project dependencies
```

#### Configure environment variables

The environment variables required to run the backend server can be found in `api/.env.dev`. The file looks like below:
Configure the environment variables as needed for the backend server:

```bash
export ENVIRONMENT=dev
@@ -84,97 +84,23 @@ export DJANGO_SUPERUSER_PASSWORD="admin"
export DJANGO_SUPERUSER_EMAIL="[email protected]"

## AUTHENTICATION
export IS_AUTHENTICATION_REQUIRED=False
export IS_AUTHENTICATION_REQUIRED=False
```

Change the values based on your running PostgreSQL instance. If you do not wish to use PostgreSQL as your database for development, comment lines 4-8 and CueObserve will create a SQLite database file at the location `api/db/db.sqlite3`.

After changing the values, source the file to initialize all the environment variables.

```text
source .env.dev
```

Then run the following commands to migrate the schema to your database and load static data required by CueObserve:

```bash
python manage.py migrate # Migrate db schema
python manage.py loaddata seeddata/*.json # Load seed data in database
```

After the above steps are completed successfully, we can start our backend server by running:

```text
python manage.py runserver
```

This starts the backend server on [http://localhost:8000/](https://reactjs.org/).
The backend server can be accessed at [http://localhost:8000/](http://localhost:8000/).

#### Celery Development

CueObserve uses Celery for executing asynchronous tasks like anomaly detection. There are three components needed to run an asynchronous task, i.e. Redis, Celery and Celery Beat. Redis is used as the message queue by Celery, so before starting Celery services, Redis server should be running. Celery Beat is used as the scheduler and is responsible to trigger the scheduled tasks. Celery workers are used to execute the tasks.

**Starting Redis Server**

Redis server can be easily started by its official docker image.

```bash
docker run -dp 6379:6379 redis # Run redis docker on port 6379
```

#### Start Celery Beat

To start celery beat service, activate the virtual environment created for the backend server and then source the .env.dev file to export all required environment variables.

```bash
cd api
source myenv/bin/activate # Activate virtual environment
source .env.dev # Export environment variables.
celery -A app beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler --detach # Run celery beat service
```

#### Start Celery

To start the celery service, its same as backend or celery beat, first activate the virual env created and then source .env.dev file to export all required environment variables. Celery service doesn't reloads on code changes so we have to install some additional libraries to make it happen.

```text
cd api
source myenv/bin/activate # Activate virtual environment
source .env.dev # Export environment variables
pip install watchdog pyyaml argh # Additional libraries to reload celery on code changes
watchmedo auto-restart -- celery -A app worker -l info --purge # Run celery
```

After these three services are running, you can trigger a task or wait for a scheduled task to run.

### Building Docker Image

To build the docker image, run the following command in root directory:

```text
docker build -t <YOUR_TAG_NAME> .
```

To run the built image exposed on port 3000:

```text
docker run -dp 3000:3000 <YOUR_TAG_NAME>
```
CueObserve uses Celery to execute asynchronous tasks like anomaly detection. Three components are needed to run an asynchronous task: Redis, Celery and Celery Beat. Redis is used as the message queue by Celery, so the Redis server should be running before the Celery services are started. Celery Beat is the scheduler, responsible for triggering scheduled tasks. Celery workers execute the tasks.
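
A minimal sketch of how such a Celery task and its Beat schedule could be wired up, assuming Redis on its default port; the task name and schedule below are illustrative, not CueObserve's actual definitions.

```python
# Hedged sketch of the Celery wiring described above (illustrative only).
from celery import Celery

app = Celery("app", broker="redis://localhost:6379/0")  # Redis as the task queue

@app.task
def run_anomaly_detection(anomaly_definition_id: int) -> None:
    # Placeholder body; a real task would fetch data and run detection.
    print(f"detecting anomalies for definition {anomaly_definition_id}")

# Celery Beat entry that triggers the task on a schedule (a plain interval in
# seconds here; crontab-style schedules are also possible).
app.conf.beat_schedule = {
    "hourly-anomaly-detection": {
        "task": run_anomaly_detection.name,
        "schedule": 3600.0,
        "args": (1,),
    }
}
```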

### Testing

At the moment, we have test cases only for the backend service; test cases for the UI are on our roadmap.

The backend test environment is light and doesn't depend on services like Redis, Celery or Celery Beat; they are mocked instead. The backend API and services are tested using [PyTest](https://docs.pytest.org/en/6.2.x/).

To run the test cases virtual environment should be activated and then source .env.dev file to export all required environment variables.
The backend API and services are tested using [PyTest](https://docs.pytest.org/en/6.2.x/). To run the test cases, `exec` into the `cueo-backend` container and run:

```text
cd api
source myenv/bin/activate # Activate virtual environment
source .env.dev # Export environment variables
pytest # Run tests
```

pytest
```
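
For reference, a PyTest test in this spirit might look like the sketch below; the function under test and the test names are made up for illustration and are not part of the CueObserve test suite.

```python
# Hedged sketch of a PyTest-style test (illustrative, not from the repo).
import pytest

def percentage_deviation(value: float, band_edge: float) -> float:
    # Toy stand-in for a service function under test.
    if band_edge == 0:
        raise ValueError("band edge must be non-zero")
    return 100 * (value - band_edge) / band_edge

def test_deviation_above_band():
    assert percentage_deviation(145, 100) == pytest.approx(45.0)

def test_zero_band_edge_raises():
    with pytest.raises(ValueError):
        percentage_deviation(10, 0)
```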
16 changes: 10 additions & 6 deletions getting-started.md
@@ -2,21 +2,26 @@

## Install via Docker-Compose

```
wget https://raw.githubusercontent.com/cuebook/CueObserve/latest_release/docker-compose.yml -q -O cueobserve-docker-compose.yml
docker-compose -f cueobserve-docker-compose.yml up -d
```

**Development Mode:**

```text
```
docker-compose -f docker-compose-dev.yml up -d
```

**OR Production Mode:**

```text
```
docker-compose up -d
```

**OR** Install via Docker **\(Deprecated Method\)**
**OR** Install via Docker **(Deprecated Method)**

```text
```
docker run -p 3000:3000 cuebook/cueobserve
```

@@ -26,7 +31,7 @@ Now visit [localhost:3000](http://localhost:3000) in your browser.

Go to the Connections screen to create a connection.

![](.gitbook/assets/addconnection%20%281%29.png)
![](<.gitbook/assets/addconnection (1).png>)

## Add Dataset

@@ -41,4 +46,3 @@ Once you have created an anomaly job, click on the \`Run\` icon button to trigger
![](.gitbook/assets/anomalydefinitions.png)

Once the job is successful, go to the Anomalies screen to view your anomalies.

38 changes: 16 additions & 22 deletions installation.md
@@ -2,72 +2,66 @@

## Install via Docker

```text
docker run -p 3000:3000 cuebook/cueobserve
```
wget https://raw.githubusercontent.com/cuebook/CueObserve/latest_release/docker-compose.yml -q -O cueobserve-docker-compose.yml
docker-compose -f cueobserve-docker-compose.yml up -d
```

Now visit [localhost:3000](http://localhost:3000) in your browser.

By default, CueObserve uses sqlite as its database \(not recommended for production use, please refer below to use Postgres as the database for CueObserve\). If you want data to persist across runs, specify a local folder location \(as below\) where db.sqlite3 file can be stored.

```text
docker run -v <local folder location>:/code/db -p 3000:3000 cuebook/cueobserve
```

## Use Postgres as the application database

SQLite is the default storage database for CueObserve. However, it might not be suitable for production. To use Postgres instead, do the following:

Create a `.env` file with given variables:

```text
```
POSTGRES_DB_SCHEMA=cueobserve
POSTGRES_DB_USERNAME=postgres
POSTGRES_DB_PASSWORD=postgres
POSTGRES_DB_HOST=localhost
POSTGRES_DB_PORT=5432
```

```text
docker run --env-file .env -dp 3000:3000 cuebook/cueobserve
```

In case your Postgres is hosted locally, pass the flag `--network="host"` to connect docker to the localhost of the machine.
wget https://raw.githubusercontent.com/cuebook/CueObserve/latest_release/docker-compose.yml -q -O cueobserve-docker-compose.yml
docker-compose --env-file .env -f cueobserve-docker-compose.yml up -d
```

## Authentication

CueObserve comes with built-in authentication \(powered by Django\). By default authentication is disabled, to enable authentication create a `.env` file with the given variables or add these variables in the already created `.env` file with Postgres credentials.
CueObserve comes with built-in authentication (powered by Django). By default, authentication is disabled; to enable it, create a `.env` file with the variables given below, or add these variables to the `.env` file already created with the Postgres credentials.

```text
```
DJANGO_SUPERUSER_USERNAME=<USER_NAME>
DJANGO_SUPERUSER_PASSWORD=<PASSWORD>
DJANGO_SUPERUSER_EMAIL=<[email protected]>
IS_AUTHENTICATION_REQUIRED=True
```

```text
docker run --env-file .env -dp 3000:3000 cuebook/cueobserve
```
wget https://raw.githubusercontent.com/cuebook/CueObserve/latest_release/docker-compose.yml -q -O cueobserve-docker-compose.yml
docker-compose --env-file .env -f cueobserve-docker-compose.yml up -d
```

If authentication is enabled you can access the [Django Admin](https://docs.djangoproject.com/en/3.2/ref/contrib/admin/) console to do the database operations with a nice UI. To access Django Admin go to [http://localhost:3000/admin](http://localhost:3000/admin) and enter the username and password provided in the `.env` file.

## Email Notification

CueObserve comes with built-in email alert notification system\(powered by Django\). By default email notifications are disabled, to enable notifications create a `.env` file with the given variables or add these variables in the already created `.env` file.
CueObserve comes with a built-in email alert notification system (powered by Django). By default, email notifications are disabled; to enable them, create a `.env` file with the variables given below, or add these variables to the already created `.env` file.

```text
```
EMAIL_HOST="smtp.gmail.com"
EMAIL_HOST_USER=<[email protected]>
EMAIL_HOST_PASSWORD=<YOUR_EMAIL_PASSWORD>
```

Allow less secure apps: ON for your given EMAIL\_HOST\_USER email Id, click on [enable access to less secure app](https://myaccount.google.com/lesssecureapps?pli=1&rapt=AEjHL4N7wse3vhCsvRv-aWy8kKeEGDZS2YDbW1SfTL17HVhtemi7zZW5gzbZSBnrNgknL_gPBDn3xVo0qUj-W6NuaYTSU7agQQ)
Turn "Allow less secure apps" ON for the given EMAIL_HOST_USER email ID: click on [enable access to less secure apps](https://myaccount.google.com/lesssecureapps?pli=1\&rapt=AEjHL4N7wse3vhCsvRv-aWy8kKeEGDZS2YDbW1SfTL17HVhtemi7zZW5gzbZSBnrNgknL_gPBDn3xVo0qUj-W6NuaYTSU7agQQ)

To unlock the captcha for your Gmail account, click on [Unlock Captcha](https://accounts.google.com/b/0/UnlockCaptcha)



## Infra Requirements

The minimum infrastructure requirement for CueObserve is _1 GB RAM/ 1 CPU_. If Multiple CPUs\(cores\) are provided, they can be utilized by tasks like Anomaly Detection & Root Cause Analysis for faster processing.

The minimum infrastructure requirement for CueObserve is _1 GB RAM / 1 CPU_. If multiple CPUs (cores) are provided, they can be utilized by tasks like Anomaly Detection & Root Cause Analysis for faster processing.