public version
bobrik committed May 1, 2015
0 parents commit 3deb9a0
Showing 6 changed files with 236 additions and 0 deletions.
11 changes: 11 additions & 0 deletions Dockerfile
@@ -0,0 +1,11 @@
FROM alpine:3.1

RUN apk --update add collectd collectd-python py-pip && \
    pip install envtpl

COPY ./collectd.conf.tpl /etc/collectd/collectd.conf.tpl
COPY ./mesos-tasks.py /usr/share/collectd/plugins/mesos/

COPY ./run.sh /run.sh

ENTRYPOINT ["/run.sh"]
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2015 Ian Babrou <[email protected]>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
54 changes: 54 additions & 0 deletions README.md
@@ -0,0 +1,54 @@
# Collect task resource usage in mesos

This is a collectd plugin and Docker image that collect resource usage
from mesos tasks. Resource usage is collected from mesos slaves and sent
to a graphite installation.

You have to add a `collectd_app` label with the application name to your tasks
to make them visible in graphite. Marathon 0.8.0+ and mesos 0.22+ support labels.
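
The label lookup can be sketched in a few lines of Python. This is an illustration only: `app_label` is a hypothetical helper, and the two task dictionaries are made-up examples shaped like the task entries the plugin reads from a slave's `state.json`.

```python
def app_label(task):
    """Return the task's collectd_app label, or None if it has none."""
    # Labels arrive as a list of {"key": ..., "value": ...} pairs.
    labels = {label["key"]: label["value"] for label in task.get("labels", [])}
    return labels.get("collectd_app")


# Hypothetical task entries, for illustration only.
labelled = {"id": "web.1234", "labels": [{"key": "collectd_app", "value": "web"}]}
unlabelled = {"id": "batch.5678"}

print(app_label(labelled))
print(app_label(unlabelled))
```

Tasks for which the lookup finds no label are skipped by the plugin, so they never show up in graphite.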

Also make sure to check out docker image to collect metrics from masters
and slaves: [collectd-mesos](https://github.com/bobrik/docker-collectd-mesos).

## Reported metrics

Metric names look like this:

```
collectd.<host>.mesos-tasks.<app>.<task>.<type>.<metric>
```
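
As a rough sketch of how such a name is assembled, the Python below builds one from its parts. `metric_path` is a hypothetical helper and the host, app and task values are invented; the sketch assumes the default `collectd.` prefix and the `gauge` type, and it replaces dots in the app name and task id with underscores, as the plugin does.

```python
def metric_path(host, app, task_id, metric, prefix="collectd.", type_name="gauge"):
    """Build a graphite metric name following the pattern above."""
    # Dots would split the name into extra graphite path segments.
    app = app.replace(".", "_")
    task = task_id.replace(".", "_")
    return "%s%s.mesos-tasks.%s.%s.%s.%s" % (prefix, host, app, task, type_name, metric)


# Hypothetical values, for illustration only.
path = metric_path("slave1", "web.frontend", "web_frontend.abc-123", "mem_rss_bytes")
print(path)
```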

Gauges:

* `cpus_limit`
* `cpus_system_time_secs`
* `cpus_user_time_secs`
* `mem_limit_bytes`
* `mem_rss_bytes`
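
The two `*_time_secs` metrics are most useful as a rate. The sketch below assumes that they are cumulative CPU seconds (as the names suggest) and derives the average number of CPUs used between two samples. `cpu_usage` is a hypothetical helper and the sample values are invented.

```python
def cpu_usage(prev, curr, elapsed_secs):
    """Average number of CPUs used between two samples of the time metrics."""
    prev_total = prev["cpus_user_time_secs"] + prev["cpus_system_time_secs"]
    curr_total = curr["cpus_user_time_secs"] + curr["cpus_system_time_secs"]
    return (curr_total - prev_total) / float(elapsed_secs)


# Two invented samples taken 10 seconds apart.
prev = {"cpus_user_time_secs": 10.0, "cpus_system_time_secs": 2.0}
curr = {"cpus_user_time_secs": 14.0, "cpus_system_time_secs": 3.0, "cpus_limit": 2.0}

used = cpu_usage(prev, curr, 10)
print(used)                       # CPUs used on average
print(used / curr["cpus_limit"])  # fraction of the task's CPU limit
```

In graphite itself, a similar result can usually be obtained by applying a derivative function to the stored series.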

## Running

Minimal command:

```
docker run -d -e GRAPHITE_HOST=<graphite host> -e MESOS_HOST=<mesos host> \
bobrik/collectd-mesos-tasks
```

### Environment variables

* `COLLECTD_HOST` - host to use in metric name, defaults to the value of `MESOS_HOST`.
* `COLLECTD_INTERVAL` - metric update interval in seconds, defaults to `10`.
* `GRAPHITE_HOST` - host where carbon is listening for data.
* `GRAPHITE_PORT` - port where carbon is listening for data, `2003` by default.
* `GRAPHITE_PREFIX` - prefix for metrics in graphite, `collectd.` by default.
* `MESOS_HOST` - mesos slave host to monitor.
* `MESOS_PORT` - mesos slave port number, defaults to `5051`.
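
The defaults above can be summarized as a small Python sketch. `resolve_settings` is a hypothetical helper, not part of the image, where the configuration template is rendered by `envtpl`. Raising `KeyError` for a missing `GRAPHITE_HOST` or `MESOS_HOST` is a choice made for this illustration.

```python
def resolve_settings(env):
    """Apply the documented defaults to a mapping of environment variables."""
    mesos_host = env["MESOS_HOST"]  # required, no default
    return {
        "COLLECTD_HOST": env.get("COLLECTD_HOST", mesos_host),
        "COLLECTD_INTERVAL": int(env.get("COLLECTD_INTERVAL", 10)),
        "GRAPHITE_HOST": env["GRAPHITE_HOST"],  # required, no default
        "GRAPHITE_PORT": int(env.get("GRAPHITE_PORT", 2003)),
        "GRAPHITE_PREFIX": env.get("GRAPHITE_PREFIX", "collectd."),
        "MESOS_HOST": mesos_host,
        "MESOS_PORT": int(env.get("MESOS_PORT", 5051)),
    }


# Hypothetical hostnames, matching the minimal command above.
settings = resolve_settings({
    "GRAPHITE_HOST": "graphite.example.com",
    "MESOS_HOST": "slave1.example.com",
})
print(settings)
```

To inspect what a real environment would produce, pass `os.environ` instead of the literal dictionary.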

Note that this docker image is very minimal and the libc inside does not
support the `search` directive in `/etc/resolv.conf`. You have to supply
a full hostname in `MESOS_HOST` that can be resolved by the nameserver.

## License

MIT
34 changes: 34 additions & 0 deletions collectd.conf.tpl
@@ -0,0 +1,34 @@
Hostname "{{ COLLECTD_HOST | default(MESOS_HOST) }}"

FQDNLookup false
Interval {{ COLLECTD_INTERVAL | default(10) }}
Timeout 2
ReadThreads 5

LoadPlugin write_graphite
<Plugin "write_graphite">
    <Carbon>
        Host "{{ GRAPHITE_HOST }}"
        Port "{{ GRAPHITE_PORT | default("2003") }}"
        Protocol "tcp"
        Prefix "{{ GRAPHITE_PREFIX | default("collectd.") }}"
        EscapeCharacter "."
        StoreRates true
        AlwaysAppendDS false
        SeparateInstances true
    </Carbon>
</Plugin>

<LoadPlugin "python">
    Globals true
</LoadPlugin>

<Plugin "python">
    ModulePath "/usr/share/collectd/plugins/mesos"

    Import "mesos-tasks"
    <Module "mesos-tasks">
        Host "{{ MESOS_HOST }}"
        Port {{ MESOS_PORT | default(5051) }}
    </Module>
</Plugin>
106 changes: 106 additions & 0 deletions mesos-tasks.py
@@ -0,0 +1,106 @@
#! /usr/bin/env python

import collectd
import json
import urllib2

CONFIGS = []

METRICS = [
    "cpus_limit",
    "cpus_system_time_secs",
    "cpus_user_time_secs",
    "mem_limit_bytes",
    "mem_rss_bytes"
]

def configure_callback(conf):
    """Receive configuration"""

    host = "127.0.0.1"
    port = 5051

    for node in conf.children:
        if node.key == "Host":
            host = node.values[0]
        elif node.key == "Port":
            port = int(node.values[0])
        else:
            collectd.warning("mesos-tasks plugin: Unknown config key: %s." % node.key)

    CONFIGS.append({
        "host": host,
        "port": port,
    })

def fetch_json(url):
    """Fetch json from url"""
    try:
        return json.load(urllib2.urlopen(url, timeout=5))
    except urllib2.URLError as e:
        collectd.error("mesos-tasks plugin: Error connecting to %s - %r" % (url, e))
        return None

def fetch_metrics(conf):
    """Fetch metrics from slave"""
    return fetch_json("http://%s:%d/monitor/statistics.json" % (conf["host"], conf["port"]))

def fetch_state(conf):
    """Fetch state from slave"""
    return fetch_json("http://%s:%d/state.json" % (conf["host"], conf["port"]))

def read_stats(conf):
    """Read stats from specified slave"""

    metrics = fetch_metrics(conf)
    state = fetch_state(conf)

    if metrics is None or state is None:
        return

    tasks = {}

    for framework in state["frameworks"]:
        for executor in framework["executors"]:
            for task in executor["tasks"]:
                info = {}

                labels = {}
                if "labels" in task:
                    for label in task["labels"]:
                        labels[label["key"]] = label["value"]

                info["labels"] = labels

                tasks[task["id"]] = info

    for task in metrics:
        if task["source"] not in tasks:
            collectd.warning("mesos-tasks plugin: Task %s found in metrics, but missing in state" % task["source"])
            continue

        info = tasks[task["source"]]
        if "collectd_app" not in info["labels"]:
            continue

        app = info["labels"]["collectd_app"].replace(".", "_")
        instance = task["source"].replace(".", "_")

        for metric in METRICS:
            if metric not in task["statistics"]:
                continue

            val = collectd.Values(plugin="mesos-tasks")
            val.type = "gauge"
            val.plugin_instance = app + "." + instance
            val.type_instance = metric
            val.values = [task["statistics"][metric]]
            val.dispatch()

def read_callback():
    """Read stats from configured slaves"""
    for conf in CONFIGS:
        read_stats(conf)

collectd.register_config(configure_callback)
collectd.register_read(read_callback)
10 changes: 10 additions & 0 deletions run.sh
@@ -0,0 +1,10 @@
#!/bin/sh

set -e

if [ ! -e "/.initialized" ]; then
    touch "/.initialized"
    envtpl /etc/collectd/collectd.conf.tpl
fi

collectd -f
