We at @SchweizerischeBundesbahnen run many production apps in our OpenShift environment, so we try hard to avoid any downtime. We therefore test new things (versions, config changes and so on) in our test environment first. Because our test environment runs far fewer pods and much less traffic, we created this tool to check all important OpenShift components under pressure, especially during a change.
The daemon also has a standalone mode in which it runs checks in response to an HTTP call, so you can monitor all these components from an external monitoring system.
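As a rough illustration of how an external monitoring system could use the standalone mode, the sketch below wraps one HTTP call in a probe function. It rests on assumptions that are not confirmed by this README: the function name `probe_daemon` and the `FETCH` override are hypothetical, the check URL (including its path) must be supplied by you, and a failed check is assumed to produce a non-2xx HTTP response. The exit codes follow the common Nagios plugin convention (0 = OK, 2 = CRITICAL).

```shell
# Hypothetical probe for a daemon in standalone mode.
# $1 is the full check URL; the path is an assumption you have to fill in.
# FETCH can be overridden (for example in tests); it defaults to curl.
probe_daemon() {
  url="$1"
  # FETCH is left unquoted on purpose so the command and its flags
  # are split into separate words.
  # curl -f makes HTTP errors (status >= 400) count as a failure.
  if ${FETCH:-curl -fsS --max-time 10} "$url" >/dev/null 2>&1; then
    echo "OK - check at $url succeeded"
    return 0
  fi
  echo "CRITICAL - check at $url failed"
  return 2
}
```

A monitoring system would call `probe_daemon` with the URL of one daemon check and act on the exit code.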
- UI: The UI to control everything
- Hub: The backend of the UI and the daemons
- Daemon: Deployed as a DaemonSet and manually on masters and nodes
- NODE = Runs on a node as a systemd service
- MASTER = Runs on a master as a systemd service
- POD = Runs inside a Docker container
TYPE | CHECK |
---|---|
MASTER | Master-API check |
MASTER | ETCD health check |
MASTER | DNS via kubernetes |
MASTER | DNS via dnsmasq |
MASTER | HTTP check via service |
MASTER | HTTP check via ha-proxy |
NODE | Master-API check |
NODE | DNS via kubernetes |
NODE | DNS via dnsmasq |
NODE | HTTP check via service |
NODE | HTTP check via ha-proxy |
POD | Master-API check |
POD | DNS via kubernetes |
POD | DNS via Node > dnsmasq |
POD | SDN over http via service check |
POD | SDN over http via ha-proxy check |
NAME | DESCRIPTION | EXAMPLE |
---|---|---|
UI_ADDR | The address & port where the UI should be hosted | 10.10.10.1:80 |
RPC_ADDR | The address & port where the hub should be hosted | 10.10.10.1:2600 |
MASTER_API_URLS | Names or IPs of your masters with the API port | https://master1:8443 |
DAEMON_PUBLIC_URL | Public url of your daemon | http://daemon.yourdefault.route.com |
ETCD_IPS | Names or IPs where to call your etcd hosts | https://localhost:2379 |
ETCD_CERT_PATH | Optional config of alternative etcd certificates path. This is used during certificate renew process of OpenShift to do checks with the old certificates. If this fails the default path will be checked as well | /etc/etcd/old/ |
NAME | DESCRIPTION | EXAMPLE |
---|---|---|
HUB_ADDRESS | Address & port of the hub | localhost:2600 |
DAEMON_TYPE | Type of the daemon, one of MASTER, NODE or POD | NODE |
POD_NAMESPACE | The namespace if the daemon runs inside a pod in OpenShift | ose-mon-a |
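Before starting a daemon it can help to sanity-check these three settings. The following is a minimal sketch, not part of the tool: the function name `check_daemon_config` is hypothetical, it assumes the parameters are available as shell or environment variables, and it assumes `POD_NAMESPACE` is only needed when `DAEMON_TYPE` is `POD`.

```shell
# Hypothetical pre-flight check for the daemon parameters.
# Returns 0 if the settings look usable, 1 otherwise.
check_daemon_config() {
  if [ -z "${HUB_ADDRESS:-}" ]; then
    echo "HUB_ADDRESS is required (example: localhost:2600)" >&2
    return 1
  fi
  case "${DAEMON_TYPE:-}" in
    MASTER|NODE)
      ;;
    POD)
      # Assumption: the namespace is only required inside a pod.
      if [ -z "${POD_NAMESPACE:-}" ]; then
        echo "POD_NAMESPACE is required when DAEMON_TYPE=POD" >&2
        return 1
      fi
      ;;
    *)
      echo "DAEMON_TYPE must be one of MASTER, NODE or POD" >&2
      return 1
      ;;
  esac
  return 0
}
```

Calling `check_daemon_config` in a wrapper script before launching the daemon would surface a missing or misspelled value early.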
oc new-project ose-mon-a
oc new-project ose-mon-b
oc new-project ose-mon-c
# Join the pod networks of projects a and c
oc adm pod-network join-projects --to=ose-mon-a ose-mon-c
# Use the template install/ose-mon-template.yaml
# Repeat the following steps for each project: a, b and c
oc project ose-mon-a
# IMAGE_SPEC: to use our image, set it to "oscp/openshift-monitoring:version"
oc process -f ose-mon-template.yaml -p DAEMON_PUBLIC_ROUTE=xxx -p DS_HUB_ADDRESS=xxx -p IMAGE_SPEC=xxx | oc create -f -
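Since the template has to be processed once per project, the repetition can be scripted. The sketch below only prints the `oc` commands for projects a, b and c so they can be reviewed before running. The function name `render_install_commands`, the example hub address and the per-project route host names are hypothetical; in particular, deriving the route as `<project>.<domain>` is an assumption, so replace these values with your own.

```shell
# Print (do not run) the install commands for ose-mon-a, -b and -c.
# $1 = hub address, $2 = image spec, $3 = route domain (all placeholders).
render_install_commands() {
  hub_address="$1"
  image_spec="$2"
  route_domain="$3"
  for suffix in a b c; do
    project="ose-mon-$suffix"
    printf 'oc project %s\n' "$project"
    printf 'oc process -f ose-mon-template.yaml -p DAEMON_PUBLIC_ROUTE=%s -p DS_HUB_ADDRESS=%s -p IMAGE_SPEC=%s | oc create -f -\n' \
      "$project.$route_domain" "$hub_address" "$image_spec"
  done
}

# Example call with placeholder values.
render_install_commands "hub.example.com:2600" "oscp/openshift-monitoring:version" "apps.example.com"
```

Once the printed commands look right, they can be piped into `sh` on a machine that is logged in with `oc`.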
mkdir -p /opt/ose-mon
# Download a release and unpack it into /opt/ose-mon, or build it yourself (https://github.com/oscp/openshift-monitoring/releases)
chmod +x /opt/ose-mon/hub /opt/ose-mon/daemon
# Add your params to the service definition files
cp /opt/ose-mon/ose-mon-hub.service /etc/systemd/system/ose-mon-hub.service
cp /opt/ose-mon/ose-mon-daemon.service /etc/systemd/system/ose-mon-daemon.service
systemctl start ose-mon-hub.service
systemctl enable ose-mon-hub.service
systemctl start ose-mon-daemon.service
systemctl enable ose-mon-daemon.service
cd /opt/ose-mon
mkdir static
# The UI is included in the download above
- Do the same as above, just without the hub