
How to set up a monitoring stack using CollectD, Graphite, Grafana and Seyren on Ubuntu 14.04

Introduction

This guide describes an alternative to the Rackspace Cloud Monitoring product, built from open source software: CollectD, Graphite, Grafana and Seyren. One of the main advantages of this monitoring stack is that you can monitor networking performance and other system data on instances with no public network access, i.e. instances attached only to an isolated network and/or ServiceNet. Another advantage is that the metrics and the stack components are highly customizable and scalable, and should be flexible enough to monitor any application or workload.

 Architecture

In this guide we’ll set up monitoring for a couple of nodes on the isolated network 192.168.3.0/24. The collectd agent on each node will push metrics to the Graphite server via the carbon-cache daemon listening on TCP port 2003. Graphite-web will read the Whisper files for data points and make them available to Seyren and Grafana. All UI dashboards will be accessible via different ports on the public interface 134.213.218.173. The Graphite server will run on an SSD disk, as it can be quite I/O intensive due to the high number of Whisper files being written.

 Basic Architecture

Software choices:

CollectD is a daemon which collects system performance statistics and provides mechanisms to store the values in a variety of ways. We will send collectd metrics into carbon/graphite using the collectd write_graphite plugin. There are other collectors, such as Diamond, and metric aggregators, such as StatsD, but they are left out of this setup for simplicity.

Graphite is a highly scalable real-time graphing system. It consists of 3 software components:

1. carbon-cache - a network service that listens for incoming metrics. It holds the metrics briefly in an in-memory buffer cache before flushing them to disk in the Whisper database format.

2. whisper - a simple database library for storing time-series data (similar in design to RRD)

3. graphite webapp - a Django webapp that renders graphs on demand using Cairo

Grafana - a replacement for the Graphite dashboard with a rich UI for editing graphs and creating dashboards. It pulls all the data it needs from Graphite.

Seyren - an alerting dashboard for Graphite. It supports notifications via Email, Flowdock, HipChat, HTTP, Hubot, IRCcat, PagerDuty, Pushover, SLF4J, Slack, SNMP, Twilio.

 

Install Graphite

On Graphite Server install graphite packages:

sudo apt-get update
sudo apt-get install graphite-web graphite-carbon 

 

Install and configure PostgreSQL

By default Graphite uses SQLite to store user info, permissions, and graph and dashboard configurations, but SQLite is not recommended for production. For such environments we recommend an RDBMS like MySQL or PostgreSQL. In this guide we will use the latter because it has complete support for full-text indexing and searching.

First we install PostgreSQL and the Python libraries that Graphite will use to connect to and communicate with the database:

sudo apt-get install postgresql libpq-dev python-psycopg2

Switch to the postgres user and create the DB user graphite_user:

sudo su - postgres
createuser graphite_user --pwprompt

 Create graphite_db and grafana_db databases owned by graphite_user:

createdb -O graphite_user graphite_db
createdb -O graphite_user grafana_db

Once this is done, you can switch back to previous user:

logout

 

Configure Graphite

Once the required packages and the database are set up, we can configure Graphite.

Edit the file /etc/graphite/local_settings.py and set the DATABASES values to match what we configured in PostgreSQL:

DATABASES = {
    'default': {
        'NAME': 'graphite_db',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'graphite_user',
        'PASSWORD': 'graphite_user_password',
        'HOST': '127.0.0.1',
        'PORT': ''
    }
}

 Uncomment SECRET_KEY line and set key for hashing:

SECRET_KEY = 'secret_key_for_salting_hashes'
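The placeholder above should be replaced with a long random value. One possible way to generate such a value is sketched below; the function name, the key length of 50 and the character set are assumptions made for this example, and the secrets module needs Python 3.6 or newer, so you may prefer to run it on a workstation rather than on the Ubuntu 14.04 server.

```python
import secrets
import string


def generate_secret_key(length=50):
    """Return a random string suitable for pasting into SECRET_KEY.

    The alphabet deliberately leaves out quotes and backslashes so the
    result can be placed inside a single-quoted Python string as-is.
    """
    alphabet = string.ascii_letters + string.digits + "!@#$%^&*(-_=+)"
    return "".join(secrets.choice(alphabet) for _ in range(length))


# Print one candidate key; copy it into /etc/graphite/local_settings.py.
print(generate_secret_key())
```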

 Uncomment and set time zone which will be displayed on graphs:

TIME_ZONE = 'Europe/London'

Enable authentication to save graph data:

USE_REMOTE_USER_AUTHENTICATION = True

 

After saving and closing the file, we need to sync the database:

sudo graphite-manage syncdb

Select “Yes” when it prompts to create a superuser account, as it will be used to log into Graphite’s interface.

 

Configure Carbon

Carbon is the storage backend of a Graphite installation. In simple installations, there is typically only one daemon, carbon-cache.py. It listens for time-series data and can accept it over the plaintext protocol and other protocols.

The main configuration file is /etc/carbon/carbon.conf. We need to edit it so that carbon listens on the isolated network interface 192.168.3.12:

sudo sed -i.bak 's/0\.0\.0\.0/192.168.3.12/g' /etc/carbon/carbon.conf

 Copy default storage aggregation file to carbon directory:

sudo cp /usr/share/doc/graphite-carbon/examples/storage-aggregation.conf.example /etc/carbon/storage-aggregation.conf

Enable Carbon to start on boot by editing file /etc/default/graphite-carbon and changing CARBON_CACHE_ENABLED to true:

CARBON_CACHE_ENABLED=true

We save the file and start carbon-cache daemon:

sudo service carbon-cache start
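To confirm that carbon-cache is actually listening after the start, a small reachability check can help. The sketch below is only an illustration: the function name and the timeout are assumptions, and it checks nothing beyond whether a TCP connection can be opened.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

On the Graphite server from this guide you would call port_is_open("192.168.3.12", 2003) and expect True once the daemon is up.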

Note! Carbon storage schemas can be added and configured in the file /etc/carbon/storage-schemas.conf. Aggregation and data point storage are explained in more detail in the storage schema configuration section, after collectd is installed and configured.

Install and configure Apache + wsgi

Since Django recommends WSGI for deployment, Graphite-web can run on Apache with mod_wsgi, nginx with Gunicorn, or nginx with uWSGI. We will install Apache because it has good logging support and authentication modules.

Install the Apache packages:

sudo apt-get install apache2 libapache2-mod-wsgi

 Disable default Apache site:

sudo a2dissite 000-default

Copy Graphite’s virtual host template to Apache’s available sites directory:

sudo cp /usr/share/graphite-web/apache2-graphite.conf /etc/apache2/sites-available

 Enable Graphite virtual host and reload Apache to implement changes:

sudo a2ensite apache2-graphite
sudo service apache2 reload

We can now access Graphite interface by browsing to http://your_ip_address.


 

We will see that carbon has already started uploading some of its own stats to Graphite. To get more metrics, we will install collectd.
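The same data that the web interface graphs can also be requested from Graphite's render endpoint. The sketch below only builds such a URL; the helper name is an assumption, your_ip_address is the placeholder used above, and the target carbon.agents.*.metricsReceived is given as an example of carbon's self-reported stats.

```python
from urllib.parse import urlencode


def render_url(base_url, target, from_="-1h", fmt="json"):
    """Build a Graphite /render URL for one target and a relative time range."""
    query = urlencode({"target": target, "from": from_, "format": fmt})
    return "{}/render?{}".format(base_url.rstrip("/"), query)


# Builds the URL only; no request is sent.
print(render_url("http://your_ip_address", "carbon.agents.*.metricsReceived"))
```

Opening the resulting URL in a browser, or fetching it with curl, should return the data points as JSON if the metric exists.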

 

Install and configure CollectD

Collectd is a daemon which collects system performance statistics and sends them to Graphite. It’s easy to configure and has a large number of plugins. It uses the plaintext protocol to send data series to Graphite/Carbon.
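To make the plaintext protocol concrete: each data point is one line of text made up of the metric path, the value and a Unix timestamp, terminated by a newline. The sketch below shows how a single test metric could be formatted and sent by hand. The function names and the metric path test.guide.example are assumptions for illustration; the host and port are the carbon-cache address used in this guide.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext protocol line: '<path> <value> <timestamp>' plus newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return "{} {} {}\n".format(path, value, int(timestamp))


def send_metric(line, host="192.168.3.12", port=2003, timeout=5.0):
    """Send one plaintext line to carbon-cache over TCP."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Formatting only; call send_metric(...) yourself to push the line to carbon.
print(format_metric("test.guide.example", 42, 1451174400), end="")
```

This is handy for testing carbon-cache before collectd is configured, since the metric should appear in the Graphite tree shortly after it is sent.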

Install collectd packages:

sudo apt-get install collectd collectd-utils

 Edit /etc/collectd/collectd.conf file and enable these plugins to collect various system data and push it to graphite:

LoadPlugin cpu
LoadPlugin df
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin ping
LoadPlugin write_graphite

Some plugins work with their default configuration, while others need to be enabled by uncommenting the “LoadPlugin” line and configured by uncommenting the required lines, like the ping plugin below:

<Plugin ping>
       Host "192.168.3.11"
       Interval 1.0
       Timeout 0.9
       TTL 255
       MaxMissed -1
</Plugin>

This plugin will monitor ping latency, packet loss and deviation to IP 192.168.3.11, sending an ICMP echo every second. More “Host” lines can be added to monitor multiple IPs or domain names.

The next plugin we need to configure is “write_graphite”. It pushes data to the Graphite server, which listens on TCP port 2003 on IP 192.168.3.12. There can be multiple “Node” stanzas to push to multiple Graphite servers.

<Plugin write_graphite>
        <Node "graphite">
                Host "192.168.3.12"
                Port "2003"
                Protocol "tcp"
                LogSendErrors true
                Prefix "collectd."
                StoreRates true
                AlwaysAppendDS false
                EscapeCharacter "_"
        </Node>
</Plugin>

Since we add the “collectd.” prefix, all data pushed from this node will have the following naming convention: collectd.{hostname}.{collectd_plugin}.{check}

For example, the Graphite server's ping packet loss check has the metric name collectd.graphite.ping.ping_droprate-192_168_3_11 and is stored on the filesystem as /var/lib/graphite/whisper/collectd/graphite/ping/ping_droprate-192_168_3_11.wsp.
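The mapping from metric name to Whisper file can be sketched in a few lines: every dot-separated component becomes a directory, and the last one becomes the .wsp file. The helper names below are assumptions, and the escaping function is a simplification that only handles dots, which is the case relevant to the IP address in this example.

```python
import posixpath

# Whisper storage directory used in this guide.
WHISPER_ROOT = "/var/lib/graphite/whisper"


def escape_for_graphite(name, escape_char="_"):
    """Replace dots in an identifier, as EscapeCharacter "_" does for the IP."""
    return name.replace(".", escape_char)


def whisper_path(metric, root=WHISPER_ROOT):
    """Map a dotted metric name to the path of its Whisper file."""
    return posixpath.join(root, *metric.split(".")) + ".wsp"


metric = "collectd.graphite.ping.ping_droprate-" + escape_for_graphite("192.168.3.11")
print(whisper_path(metric))
```

Running it prints the same path as in the example above, which also shows why the dots in the IP address have to be escaped: unescaped, each octet would become its own directory level.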

Note! The collectd installation and configuration above has to be repeated on every node you intend to monitor. I also installed collectd on “Node1” and “Node10” (see the architecture diagram at the beginning of this guide), but since the installation is identical, I will not repeat it here.

 Storage schema configuration

On the Graphite server, edit the file /etc/carbon/storage-schemas.conf to configure storage parameters. Add the [collectd] stanza below the [carbon] stanza, but before the [default_1min_for_1day] stanza. Carbon applies the first stanza whose pattern matches, so the order matters:

[collectd]
pattern = ^collectd.*
retentions = 10s:1h,1m:1d,10m:1y

The pattern “^collectd.*” matches metric names that start with the “Prefix” value set in the write_graphite plugin config in /etc/collectd/collectd.conf.

Retention values of 10s:1h,1m:1d,10m:1y mean that data points are stored in 3 separate archives. The 10s:1h archive means that when we view graphs up to the “last 60 minutes”, data points are 10 seconds apart. From 61 minutes up to the “last 24 hours”, the graph shows data points at 1 minute intervals, and beyond that, up to a 1 year view, it shows a data point every 10 minutes.
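The arithmetic behind the retention string can be checked with a short sketch. It parses the string into (step, number of points) pairs and picks the archive that would serve a given time window, on the assumption that the finest archive covering the whole window is the one used. The function names are made up for this example, and a year is counted as 365 days.

```python
# Seconds per unit suffix used in retention definitions.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(spec):
    """Convert a value such as '10s' or '1d' to seconds."""
    return int(spec[:-1]) * UNIT_SECONDS[spec[-1]]


def parse_retentions(retentions):
    """Return a list of (step_seconds, number_of_points), one per archive."""
    archives = []
    for part in retentions.split(","):
        precision, duration = part.strip().split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


def step_for_window(archives, window_seconds):
    """Return the step of the first (finest) archive that covers the window."""
    for step, points in archives:
        if step * points >= window_seconds:
            return step
    return archives[-1][0]


archives = parse_retentions("10s:1h,1m:1d,10m:1y")
for step, points in archives:
    print("{}s step, {} points".format(step, points))
print(step_for_window(archives, 12 * 3600))
```

For the schema above this gives 360, 1440 and 52560 points per archive, and a 12 hour window is served at a 60 second step.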

For example storage and aggregation diagram would look like this:

  

For the changes to take effect, restart the carbon-cache and collectd services:

sudo service carbon-cache stop
sudo service carbon-cache start
sudo service collectd restart

 

 

Install and configure Grafana

Add the following line to your /etc/apt/sources.list file.

deb https://packagecloud.io/grafana/stable/debian/ wheezy main

Add the Package Cloud key to be able to install signed packages:

curl https://packagecloud.io/gpg.key | sudo apt-key add -

 Update your Apt repositories and install Grafana:

sudo apt-get update
sudo apt-get install grafana

You can configure Grafana to listen on any port, but we will show how to use SSL and configure Grafana to listen on port 443. First generate a self-signed certificate:

cd /etc/grafana
sudo openssl req -x509 -newkey rsa:2048 -keyout cert.key -out cert.pem -days 3650 -nodes

Edit the file /etc/grafana/grafana.ini and set the following in the [server] section:

# https certs & key file
cert_file = /etc/grafana/cert.pem
cert_key = /etc/grafana/cert.key

protocol = https
http_port = 443

 Add database settings:

[database]
# Either "mysql", "postgres" or "sqlite3", it's your choice
type = postgres
host = 127.0.0.1:5432
name = grafana_db
user = graphite_user
password = graphite_user_password

Save and close the file.

Allow Grafana to bind to privileged ports (below 1024), such as 443, without running as root:

 sudo setcap 'cap_net_bind_service=+ep' /usr/sbin/grafana-server

 Configure Grafana server to start on boot:

sudo update-rc.d grafana-server defaults 95 10

Start the Grafana server:

sudo service grafana-server start

Log in to Grafana at https://your_ip_address (port 443) using the user “admin” and password “admin”.


 

Change admin user password:
Click on Grafana icon on the top left ->admin->Change password

 

Set Graphite as a datasource for Grafana and "Test Connection":



 

 

Before creating any graphs, I recommend watching this 10 minute Grafana dashboard tutorial. It will make your life much easier.

For example, below is a ping latency graph. We created it using the ping check metrics that the Graphite node received from Node1 and Node10. The view is set to "Last 12 hours", so data points are 1 minute apart.

 


 

 

Install and configure Seyren

 Seyren is a simple alerting dashboard for Graphite. It reacts to configured monitoring thresholds and sends notifications via various systems: Email, Flowdock, HipChat, HTTP, Hubot, IRCcat, PagerDuty, Pushover, SLF4J, Slack, SNMP, Twilio.

 

The full Seyren installation guide can be found here.

Install MongoDB version 3.0. At the time of writing (27/Dec/2015) Seyren is not compatible with the current MongoDB version 3.2. The issue on the Seyren repo can be found here.

Mongo installation:
https://docs.mongodb.org/v3.0/tutorial/install-mongodb-on-ubuntu/

 

Download Seyren:

wget https://github.com/scobal/seyren/releases/download/1.3.0/seyren-1.3.0.jar

Create a Seyren environment variables file and load it (for example with source) in the shell that will start Seyren:

export SEYREN_URL="http://134.213.218.173:8080/seyren"
export SEYREN_LOG_PATH="/var/log/seyren"
export SEYREN_LOG_FILE_LEVEL="trace"
export GRAPHITE_URL="http://134.213.218.173:80" 

Start Seyren in the background:

java -jar seyren-1.3.0.jar &
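Seyren only picks up these settings if they are present in the environment of the Java process. As an alternative to exporting them in a shell, the sketch below passes them explicitly when launching the jar. The function names are assumptions for this example; the values are the ones from the environment file above, and java is assumed to be on the PATH.

```python
import os
import subprocess

# Settings taken from the environment variables file above.
SEYREN_SETTINGS = {
    "SEYREN_URL": "http://134.213.218.173:8080/seyren",
    "SEYREN_LOG_PATH": "/var/log/seyren",
    "SEYREN_LOG_FILE_LEVEL": "trace",
    "GRAPHITE_URL": "http://134.213.218.173:80",
}


def seyren_environment(base_env=None):
    """Return a copy of the base environment with the Seyren settings added."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(SEYREN_SETTINGS)
    return env


def start_seyren(jar_path="seyren-1.3.0.jar"):
    """Start Seyren as a child process with the settings in its environment."""
    return subprocess.Popen(["java", "-jar", jar_path], env=seyren_environment())
```

Calling start_seyren() from the directory holding the jar returns immediately, with Seyren running as a child process.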

Create a Seyren check. In the “Target” field, use Graphite’s naming convention, for example collectd.graphite.ping.ping-192_168_3_11.
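A check compares the latest value of the target against a warn level and an error level. The sketch below is a simplified illustration of that idea and is not Seyren's own code. It assumes that higher values are worse when the error level is at or above the warn level, and that lower values are worse otherwise. The function name and the thresholds in the usage note are hypothetical.

```python
def check_state(value, warn, error):
    """Classify a value as "OK", "WARN" or "ERROR" against two thresholds.

    Assumption: if error >= warn, larger values are worse (e.g. latency);
    otherwise smaller values are worse (e.g. free disk space).
    """
    if error >= warn:
        if value >= error:
            return "ERROR"
        if value >= warn:
            return "WARN"
        return "OK"
    if value <= error:
        return "ERROR"
    if value <= warn:
        return "WARN"
    return "OK"
```

With hypothetical ping latency thresholds of warn 20 and error 50 milliseconds, a value of 25 would be classified as WARN and a value of 60 as ERROR.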

  

 

Once all this is set up, you can configure subscriptions to be notified about your alerts, for example via email:

 

Conclusion

The setup above is not too complicated to create, and it can be a powerful and scalable monitoring solution with a lot of flexibility. All the software is free and open source, so it can be easily adapted to any environment.