Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

elk analyser #66

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
104 changes: 104 additions & 0 deletions elk-analyser/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
# Introduction

This tool is aimed to help Yugabye Support team to investigate customer's server log in a offline environment.

To achieve such goal, this tool utilize Filebeat to automatically upload the log files to a pre-built ELK cluster and enrich the log messages with pre-defined pipelines so that support engineer can easily read and search the log.

# ELK Setup

Currently the tool only supports local docker ELK setup, please refer to the [official ELK docker setup guide](https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html) for reference.

Please also edit the `.env` and `docker-compose.yml` to meet your environment.

# First pipeline for tserver logs

Once the ELK cluster is up and running, Go to `Kibana UI` > `Dev Tools` and create the pipeline for tserver/master log pattern as below (Replace the pipeline name as your wish):

```
PUT _ingest/pipeline/<tserver_log>
{
"description": "testing grok processor",
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"(?<loglevel>[IWEF])(?<timestamp>%{MONTHNUM}%{MONTHDAY} %{TIME})%{SPACE}%{NUMBER:threadid} %{NOTSPACE:filename}:%{NUMBER:fileline}%{NOTSPACE} %{GREEDYDATA:msg}"
]
}
},
{
"date": {
"field": "timestamp",
"formats": [
"MMdd HH:mm:ss.SSSSSS"
],
"target_field": "@timestamp"
}
},
{
"grok": {
"field": "log.file.path",
"patterns": [
"^%{GREEDYDATA}/case/%{WORD:case_id}/%{GREEDYDATA}"
],
"on_failure": [
{
"append": {
"field": "case_id",
"value": [
"0"
]
}
}
]
}
},
{
"remove": {
"field": [
"timestamp",
"log",
"agent",
"ecs",
"host"
]
}
}
]
}
```

Application log pipeline is still WIP and will be updated once done.

# Filebeat Setup

Download and install Filebeat in your local environment following the [official guide](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-installation-configuration.html).

Edit the `filebeat.yml.sample` file in the same folder, modify the `filebeat.inputs` and `output.elasticsearch` to match your environment.

Once done, rename the config file to `filebeat.yml` and run `filebeat -e -c filebeat.yml` to start the log consumption.
Note that depends on your platform the command can be slightly different, refer to the [official filebeat setup and run guide](https://www.elastic.co/guide/en/beats/filebeat/current/setting-up-and-running.html) for more details.

# Investigation

Go to `Kibana UI` > `Discover` and create a new View using `filebeat*` as `Index Pattern` and `@timestamp` as `Timestamp field`.

Now you can see the logs in this view.

# Rules & Alerts

It is often not clear what kind of issue is happening in customer's environment when we first start to investigate the issue.
With the help of ELK's rules and alert function, it is possible to automatically triage the logs with predefined rules and generate suggestions based on existing KBs.

To enable the rules & alerts, it is required to create an encryption key for kibana first.

Refer to the official [alert setup guide](https://www.elastic.co/guide/en/kibana/current/alerting-setup.html#alerting-prerequisites) and also .kibana.yml for more details.

# WIP

The next step of this tools is to create more pipelines to support master/application log pattern and also combine with Lincoln so that any tserver/master/application log uploaded to the Zendesk can be automatically parsed and kept in a centralized place.

Another goal of this tools is to create more alerts so that it will automatically generate notifications and suggest possible solutions (KB articles) based on parsed logs.

Also, the ELK setup section should also support standalone VM or K8S based installation.
231 changes: 231 additions & 0 deletions elk-analyser/docker-compose.yml.sample
Original file line number Diff line number Diff line change
@@ -0,0 +1,231 @@
version: "2.2"

services:
setup:
image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
volumes:
- certs:/usr/share/elasticsearch/config/certs
user: "0"
command: >
bash -c '
if [ x${ELASTIC_PASSWORD} == x ]; then
echo "Set the ELASTIC_PASSWORD environment variable in the .env file";
exit 1;
elif [ x${KIBANA_PASSWORD} == x ]; then
echo "Set the KIBANA_PASSWORD environment variable in the .env file";
exit 1;
fi;
if [ ! -f config/certs/ca.zip ]; then
echo "Creating CA";
bin/elasticsearch-certutil ca --silent --pem -out config/certs/ca.zip;
unzip config/certs/ca.zip -d config/certs;
fi;
if [ ! -f config/certs/certs.zip ]; then
echo "Creating certs";
echo -ne \
"instances:\n"\
" - name: es01\n"\
" dns:\n"\
" - es01\n"\
" - localhost\n"\
" ip:\n"\
" - 127.0.0.1\n"\
" - name: es02\n"\
" dns:\n"\
" - es02\n"\
" - localhost\n"\
" ip:\n"\
" - 127.0.0.1\n"\
" - name: es03\n"\
" dns:\n"\
" - es03\n"\
" - localhost\n"\
" ip:\n"\
" - 127.0.0.1\n"\
> config/certs/instances.yml;
bin/elasticsearch-certutil cert --silent --pem -out config/certs/certs.zip --in config/certs/instances.yml --ca-cert config/certs/ca/ca.crt --ca-key config/certs/ca/ca.key;
unzip config/certs/certs.zip -d config/certs;
fi;
echo "Setting file permissions"
chown -R root:root config/certs;
find . -type d -exec chmod 750 \{\} \;;
find . -type f -exec chmod 640 \{\} \;;
echo "Waiting for Elasticsearch availability";
until curl -s --cacert config/certs/ca/ca.crt https://es01:9200 | grep -q "missing authentication credentials"; do sleep 30; done;
echo "Setting kibana_system password";
until curl -s -X POST --cacert config/certs/ca/ca.crt -u "elastic:${ELASTIC_PASSWORD}" -H "Content-Type: application/json" https://es01:9200/_security/user/kibana_system/_password -d "{\"password\":\"${KIBANA_PASSWORD}\"}" | grep -q "^{}"; do sleep 10; done;
echo "All done!";
'
healthcheck:
test: ["CMD-SHELL", "[ -f config/certs/es01/es01.crt ]"]
interval: 1s
timeout: 5s
retries: 120

es01:
depends_on:
setup:
condition: service_healthy
image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
volumes:
- certs:/usr/share/elasticsearch/config/certs
- esdata01:/usr/share/elasticsearch/data
ports:
- ${ES_PORT}:9200
environment:
- node.name=es01
- cluster.name=${CLUSTER_NAME}
- cluster.initial_master_nodes=es01,es02,es03
- discovery.seed_hosts=es02,es03
- ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
- bootstrap.memory_lock=true
- xpack.security.enabled=true
- xpack.security.http.ssl.enabled=true
- xpack.security.http.ssl.key=certs/es01/es01.key
- xpack.security.http.ssl.certificate=certs/es01/es01.crt
- xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
- xpack.security.http.ssl.verification_mode=certificate
- xpack.security.transport.ssl.enabled=true
- xpack.security.transport.ssl.key=certs/es01/es01.key
- xpack.security.transport.ssl.certificate=certs/es01/es01.crt
- xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
- xpack.security.transport.ssl.verification_mode=certificate
- xpack.license.self_generated.type=${LICENSE}
mem_limit: ${MEM_LIMIT}
ulimits:
memlock:
soft: -1
hard: -1
healthcheck:
test:
[
"CMD-SHELL",
"curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
]
interval: 10s
timeout: 10s
retries: 120

es02:
depends_on:
- es01
image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
volumes:
- certs:/usr/share/elasticsearch/config/certs
- esdata02:/usr/share/elasticsearch/data
environment:
- node.name=es02
- cluster.name=${CLUSTER_NAME}
- cluster.initial_master_nodes=es01,es02,es03
- discovery.seed_hosts=es01,es03
- bootstrap.memory_lock=true
- xpack.security.enabled=true
- xpack.security.http.ssl.enabled=true
- xpack.security.http.ssl.key=certs/es02/es02.key
- xpack.security.http.ssl.certificate=certs/es02/es02.crt
- xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
- xpack.security.http.ssl.verification_mode=certificate
- xpack.security.transport.ssl.enabled=true
- xpack.security.transport.ssl.key=certs/es02/es02.key
- xpack.security.transport.ssl.certificate=certs/es02/es02.crt
- xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
- xpack.security.transport.ssl.verification_mode=certificate
- xpack.license.self_generated.type=${LICENSE}
mem_limit: ${MEM_LIMIT}
ulimits:
memlock:
soft: -1
hard: -1
healthcheck:
test:
[
"CMD-SHELL",
"curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
]
interval: 10s
timeout: 10s
retries: 120

es03:
depends_on:
- es02
image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
volumes:
- certs:/usr/share/elasticsearch/config/certs
- esdata03:/usr/share/elasticsearch/data
environment:
- node.name=es03
- cluster.name=${CLUSTER_NAME}
- cluster.initial_master_nodes=es01,es02,es03
- discovery.seed_hosts=es01,es02
- bootstrap.memory_lock=true
- xpack.security.enabled=true
- xpack.security.http.ssl.enabled=true
- xpack.security.http.ssl.key=certs/es03/es03.key
- xpack.security.http.ssl.certificate=certs/es03/es03.crt
- xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
- xpack.security.http.ssl.verification_mode=certificate
- xpack.security.transport.ssl.enabled=true
- xpack.security.transport.ssl.key=certs/es03/es03.key
- xpack.security.transport.ssl.certificate=certs/es03/es03.crt
- xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
- xpack.security.transport.ssl.verification_mode=certificate
- xpack.license.self_generated.type=${LICENSE}
mem_limit: ${MEM_LIMIT}
ulimits:
memlock:
soft: -1
hard: -1
healthcheck:
test:
[
"CMD-SHELL",
"curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
]
interval: 10s
timeout: 10s
retries: 120

kibana:
depends_on:
es01:
condition: service_healthy
es02:
condition: service_healthy
es03:
condition: service_healthy
image: docker.elastic.co/kibana/kibana:${STACK_VERSION}
volumes:
- certs:/usr/share/kibana/config/certs
- kibanadata:/usr/share/kibana/data
- ./kibana.yml:/usr/share/kibana/config/kibana.yml
ports:
- ${KIBANA_PORT}:5601
environment:
- SERVERNAME=kibana
- ELASTICSEARCH_HOSTS=https://es01:9200
- ELASTICSEARCH_USERNAME=kibana_system
- ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
- ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt
mem_limit: ${MEM_LIMIT}
healthcheck:
test:
[
"CMD-SHELL",
"curl -s -I http://localhost:5601 | grep -q 'HTTP/1.1 302 Found'",
]
interval: 10s
timeout: 10s
retries: 120

volumes:
certs:
driver: local
esdata01:
driver: local
esdata02:
driver: local
esdata03:
driver: local
kibanadata:
driver: local
Loading