In the first part of this two-part blog series we explained the basic points of how an architecture based on microservices differs from an architecture based on a monolith.

srv-01, srv-02 and srv-03 are dead – long live „srv-n“!

When the time comes to start transferring your organization’s applications and workloads to dynamic environments or agile cloud infrastructures, regardless of whether they’re hosted externally or „on-premises“, it will also be time to adjust your monitoring accordingly.

In dynamic environments, where virtual systems are created and destroyed automatically, the question for the monitoring system is no longer how to keep an eye on a single system but whether the services users depend on are performing at the promised level. To answer it, the monitoring system has to focus on correlations and aggregations, which simple status checks cannot capture. The relationships between properties, events and status data have to be evaluated. The values returned by individual checks can be aggregated and related to each other using specific rules. On this basis the monitoring system can calculate whether the traffic light icons on the administrator’s workstation should display green or red.
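To make the idea of rule-based aggregation concrete, here is a minimal sketch in Python. It is not how any particular monitoring tool computes its status: the `CheckResult` type, the `service_status` function and the 25% / 50% thresholds are all assumptions chosen for illustration. The point is only that the traffic light reflects the service as a whole rather than any single server.

```python
from dataclasses import dataclass

OK, WARNING, CRITICAL = 0, 1, 2


@dataclass
class CheckResult:
    host: str    # hypothetical host name
    check: str   # hypothetical check name
    status: int  # 0 = OK, 1 = WARNING, 2 = CRITICAL


def service_status(results, warn_ratio=0.25, crit_ratio=0.5):
    """Derive one service-level status from many per-host results.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if not results:
        return CRITICAL  # no data at all is treated as a failure here
    failing = sum(1 for r in results if r.status != OK)
    ratio = failing / len(results)
    if ratio >= crit_ratio:
        return CRITICAL
    if ratio >= warn_ratio:
        return WARNING
    return OK


# Five interchangeable web servers, one of which is currently failing.
results = [
    CheckResult("srv-1", "http", OK),
    CheckResult("srv-2", "http", CRITICAL),
    CheckResult("srv-3", "http", OK),
    CheckResult("srv-4", "http", OK),
    CheckResult("srv-5", "http", OK),
]
print(service_status(results))
```

With one of five servers down, the failure ratio stays below the warning threshold, so the service as a whole is still reported as healthy even though an individual status check is red.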

There are many monitoring tools on the market today. The most popular is probably still Nagios or one of its many forks: clones which have since evolved in their own direction and are supported by their own developer teams and communities. The most popular open source solutions are examined in our „Open Source Monitoring Guide“ (German, soon available in English), where we compare the tools’ strengths and weaknesses. Depending on your particular scenario, your requirements and the daily processes of your organization, you will need to conduct an evaluation to determine which tool is most suitable for your needs. For most situations we recommend Sensu.


Why migrate from Nagios to Sensu?

1. Sensu is compatible with Nagios plugins, ensuring that checks and tests conducted under Nagios can continue under Sensu. The extent and scope of your monitoring system remain intact. And by employing additional status and metric plugins specific to Sensu, you can even extend your current monitoring and add value to the system.
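The reason plugins can be carried over is that a Nagios-style plugin is just a program that prints one line of output and signals its status through the exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The following sketch shows that convention in Python. The disk check itself, its function names and its thresholds are illustrative assumptions, not an existing plugin.

```python
import shutil
import sys

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a measured value to a status (thresholds are assumptions)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (status, message) for the file system containing `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - no capacity reported for {path}"
    used = usage.used / usage.total * 100
    status = evaluate(used, warn, crit)
    # Text after the pipe is performance data in the usual plugin format.
    message = (f"DISK {LABELS[status]} - {used:.1f}% used on {path}"
               f" | used={used:.1f}%;{warn};{crit}")
    return status, message


def main():
    status, message = check_disk()
    print(message)
    return status

# When installed as a plugin, finish the script with: sys.exit(main())
```

Because the contract is only "one line of output plus an exit code", the same script can be called by either monitoring system without modification.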

2. The microservices architecture which Sensu incorporates enables you to scale your monitoring environment seamlessly, especially when compared with a conventional monolithic architecture, which typically forces you to add resources to the existing system when more performance is needed.

3. Sensu’s intelligent architecture places much more modest demands on your resources. Most monitored systems have approximately 30 to 50 different checks, and the majority of the checks for a given system or group of systems (Windows servers, Linux servers, routers, switches) are identical. Where Nagios schedules and executes each check separately, Sensu can do this in either of two different ways, both of which are much more resource-efficient:

The first possibility is the so-called Publish/Subscribe (Pub/Sub) method. Checks are assigned suitable keywords and thereby organized into groups. Checks which are to be executed by all Windows servers, for example, might be assigned the keyword „windows“, webserver checks the keyword „webserver“, and so on. When Sensu schedules these checks for execution, every server runs the checks that match its own keywords and reports the results. Each defined check only has to be triggered once per interval, and through the shared keyword it initiates a check on all the hundreds or thousands of servers that carry that keyword. Since the check interval is generally identical for each kind of check, an administrator does not have to schedule the CPU checks for each individual system, and it often makes most sense to rely on the default configuration.
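The Pub/Sub mechanism can be illustrated with a small in-memory simulation. This is a sketch of the principle only: the `Transport` and `Client` classes are hypothetical stand-ins for the message bus and the monitored systems, and no real check is executed. (Sensu's own documentation refers to the keywords as subscriptions.)

```python
from collections import defaultdict


class Transport:
    """In-memory stand-in for the message bus between server and clients."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, keyword, client):
        self._subscribers[keyword].append(client)

    def publish(self, keyword, check_name):
        # One published request reaches every client holding the keyword.
        return [client.run(check_name) for client in self._subscribers[keyword]]


class Client:
    """A monitored system that subscribes to the keywords describing its role."""

    def __init__(self, name, keywords, transport):
        self.name = name
        for keyword in keywords:
            transport.subscribe(keyword, self)

    def run(self, check_name):
        # A real client would execute the check command here; this sketch
        # simply reports status 0 (OK).
        return (self.name, check_name, 0)


transport = Transport()
Client("srv-1", ["windows"], transport)
Client("srv-2", ["windows", "webserver"], transport)
Client("srv-3", ["linux", "webserver"], transport)

# The server triggers the check once; every matching client runs it.
for result in transport.publish("windows", "check_cpu"):
    print(result)
```

The server-side cost of scheduling is one publish per check and interval, regardless of whether three or three thousand clients carry the keyword.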

The second possibility lets the Sensu client, i.e. the monitored system itself, take responsibility for scheduling and executing the checks. These are referred to as „standalone checks“. In this scenario, the Sensu server only has to process the check results returned by the clients and is spared the work of scheduling and executing checks.
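What "the client schedules its own checks" means can be sketched with a simulated clock. The function below is an illustration under stated assumptions, not Sensu's scheduler: `run_standalone`, the check names and the intervals are all hypothetical, and instead of executing anything it only records when each check would fire.

```python
import heapq


def run_standalone(checks, until):
    """Simulate client-side scheduling.

    `checks` maps a check name to its interval in seconds. Returns the list
    of (time, name) executions that fall within the first `until` seconds.
    """
    queue = [(interval, name) for name, interval in checks.items()]
    heapq.heapify(queue)
    executed = []
    while queue and queue[0][0] <= until:
        due, name = heapq.heappop(queue)
        executed.append((due, name))  # a real client would run the check now
        heapq.heappush(queue, (due + checks[name], name))
    return executed


timeline = run_standalone({"check_cpu": 60, "check_disk": 120}, until=240)
for due, name in timeline:
    print(f"t={due:>3}s  {name}")
```

All of the timing logic lives on the monitored system; the server never appears in this loop and only receives the results.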

Regardless of which method is implemented, the workload on the Sensu server is significantly reduced as compared with the Nagios architecture. For complex requirements, it often makes sense to combine the methods.

4. The „directory-based configuration“ makes it astoundingly easy to manage your monitoring using automation tools such as Ansible, Chef, Puppet or SaltStack. If any of these are used to roll out new services to servers under your monitoring system’s supervision, the required check definition, in JSON format, can be rolled out at the same time and stored on the Sensu server. The next time the Sensu client is restarted, your new services will be monitored as well.
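As an illustration of what an automation tool would do here, the following Python sketch writes one JSON check definition per file into a configuration directory. The key names (`checks`, `command`, `interval`, `subscribers`, `standalone`) follow the check definition format of Sensu 1.x as we understand it and should be verified against the version you run. The `write_check` helper, the check name and the command line are hypothetical, and a temporary directory stands in for the real configuration directory.

```python
import json
import tempfile
from pathlib import Path


def write_check(conf_dir, name, command, interval=60, subscribers=None):
    """Write one check definition as its own JSON file and return the path."""
    check = {"command": command, "interval": interval}
    if subscribers:
        # Pub/Sub check: executed by every client holding one of the keywords.
        check["subscribers"] = list(subscribers)
    else:
        # No keywords given: let the client schedule the check itself.
        check["standalone"] = True
    path = Path(conf_dir) / f"{name}.json"
    path.write_text(json.dumps({"checks": {name: check}}, indent=2) + "\n")
    return path


# Assumption: a temporary directory replaces the real configuration directory.
conf_dir = tempfile.mkdtemp()
path = write_check(conf_dir, "check_nginx", "check_http -H localhost",
                   subscribers=["webserver"])
print(path.read_text())
```

Because each check lives in its own file, the automation tool can add or remove a check together with the service it belongs to, without editing a shared configuration file.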

5. Sensu already puts a wide range of integration opportunities at your disposal. Both the Sensu server and the Sensu clients provide APIs. The server’s API is used, for example, to connect the dashboard which displays the devices and services included in Sensu. The clients’ API can be used, for example, to pass on check results. Further connections include storing check data in Graphite so it can be used to display measurements in Grafana, or sending messages to Slack, IRC or HipChat. Additional integrations include InfluxDB, Graylog, JIRA, Datadog, Chef, Librato, Puppet, PagerDuty, OpenTSDB, SNMP, ServiceNow, Wavefront, and many more.
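Passing a result to the client could look roughly like the sketch below, for instance from a backup job that reports its own outcome. It assumes the Sensu 1.x behaviour of a local client input socket on port 3030 that accepts a JSON document with `name`, `output` and `status`; treat the port, the field names and the helper functions `build_result` and `send_result` as assumptions to check against your installation.

```python
import json
import socket


def build_result(name, output, status):
    """Serialize an externally produced check result as JSON bytes."""
    if status not in (0, 1, 2, 3):
        raise ValueError("status must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)")
    return json.dumps({"name": name, "output": output, "status": status}).encode("utf-8")


def send_result(payload, host="127.0.0.1", port=3030, timeout=2.0):
    """Hand the result to the local client (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)


# Hypothetical job name and message; nothing is sent until send_result is called.
payload = build_result("nightly_backup", "backup finished", 0)
print(payload.decode("utf-8"))
```

A job would call `send_result(payload)` as its last step, so that its outcome is processed like any other check result.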

6. Strategically, both the timing and the solution are right. Sensu enables you to migrate from your „old world“ to your „new world“ without breaking stride. You can take your own plugins with you and get more value out of them through improved data. A migration from Nagios to Sensu lets you take your existing checks on a smooth transition from a clumsy, inefficient architecture to a modern, flexible system. You don’t even lose Nagios’ greatest asset: its tremendous variety of community plugins. You will also benefit from a lighter workload and the flexibility that Sensu provides – the best of both worlds!

Finally, you will have to consider whether your current monitoring should actually be retained at all. Migrations are usually carried out using scripts to minimize the time required and to guarantee the most faithful transfer of content. In most areas scripts represent the best approach, but in monitoring, a migration is an opportunity to clean out the garage with all those years of accumulated monitoring, to review existing checks and scripts, and to replace them with new, higher-performance and more productive plugins.