Stateful Applications on Mantl Using the Mesos ELK Framework

Introduction

Throughout this week the discussion has been focused on the necessity of a programmable infrastructure and the Mantl project. Mantl was formulated to deliver application level flexibility by providing a set of pluggable components. One of the main use cases for enterprises is data management and analytics. This post discusses a subset of these components that are, arguably, the most important. This post will be talking about data platform and state.

What is the ELK stack?
Microservices are often idealized to be stateless. Stateless applications can happily fail and be restarted. Stateless applications can be scaled. But stateless applications don’t do anything interesting. Imagine a world full of stateless applications; no email, no google. Only static websites full of cat memes. Instead, the vast majority of applications require some state, usually in the form of a database.

The ELK stack is an offering from Elastic which includes the database Elasticsearch (ES), the Extract-Transform-Load tool Logstash (LS) a
nd the dashboarding application Kibana. ES is a general purpose NoSQL document datastore, which is easy to use and is highly scalable. The ELK stack is provided as a part of Mantl. This means the user is able to plug in the ELK stack and start using a distributed, scalable, resilient database, out of the box

What is Apache Mesos?
Mantl is operationally based upon Apache Mesos. Mesos has been discussed a number of times elsewhere, but here is a summary. Mesos is a resource abstraction layer. Applications request resources and these are placed upon a Mesos cluster. The key benefit is that it provides both scalability and resiliency. If more resource is required, add another node to the cluster. If one of the machines fail, the applications will be reallocated to different hosts.

So how is the Mantl ELK stack different from the normal ELK stack?
Fundamentally, the goal of the ELK stack on Mesos is to provide the same functionality as the official stack. The difference is that the Mantl version has been developed as a Mesos framework.

Mesos frameworks implement the core functionality of Mesos. They are responsible for scheduling work on the Mesos nodes, managing state and ensuring resiliency. Take the ES framework for example. The Mesos Elasticsearch framework provides users with a complete, scalable, resilient ES cluster. The framework can be started with a multitude of configuration options; from ES resources to the number of nodes in the cluster. The framework then takes care of the orchestration and creates an operational ES cluster. Once up and running, the cluster can be used in the same way as if it were installed from the official binaries.

Within the ES cluster, the data is replicated. The replication is configurable, but by default all data is replicated to all other machines. This provides the maximum amount of resilience to failure. For example if you have nine nodes in your cluster, nine separate machines would have to fail, all at the same time. If one node fails, it is instantly detected by the framework and reinstantiated. If the machine is no longer available, it will reinstantiate on a different host.

The diamond in the ES framework’s crown, however, is scalability. Because the data is replicated to all other machines, it is simple to scale the number of ES nodes down to one, back up to 49 and back down to 3, with zero data loss and downtime. All of this is achieved by a single API call.

This enables the advanced scheduling of databases. Imagine that a database specification has been over-prescribed. With Mantl, it is a simple API call to reduce the number of nodes. Imagine that there is a sudden surge in demand for a service, and ES starts to struggle to keep up with the number of reads. Simply make an API call to increase the number of nodes to spread the load.

I’ve been use ES as an example, but this is also true for LS and Kibana.

The LS framework is delivered setup to forward any messages written by any Mesos task to elasticsearch. But it can easily be configured to deliver application specific logs through the help of a configuration file. The LS framework is able to monitor a huge collection of log sources (e.g. syslog, Mesos, log file, etc.) so users don’t have to write any custom code to connect to logstash. Simply pass in a configuration file and log away.

Demo
Time for a demonstration of these capabilities. Below is a video showing a live Mantl cluster setup for the Cisco Live 2016 event.

Demo of Mantl ELK
Implementation details
Mesos exposes a significant amount of functionality in the framework interface. In fact, the framework interface is actually a combination of a scheduler and executor interfaces. The scheduler is responsible for orchestrating the cluster of executors and maintaining some sort of state. It may also provide a top-level API to interact with the rest of the cluster. The executors are responsible for completing a task. In this situation the respective executors are responsible for running the ES, LS and Kibana tasks. The interfaces are exposed in C++, Java and Python. However, for new projects, I would recommend looking at the HTTP interface, which decouples the code from the Mesos binaries

The image above shows an overview of the architecture of all the frameworks. The LS framework ensures that every agent has an instance of logstash (pink), so it can forward all the local file logs to ES. The ES framework forms a cluster (green) between the nodes. The framework is responsible for maintaining the validity of the cluster. If any of the agents or ES tasks fail, another task will be spawned on another node and re-clustering will commence. Finally, Kibana is running on any number of nodes (turquoise) and automatically connecting to the ES cluster. Thanks to Mantl, all elements have a consul DNS address, so Kibana and LS communicate with ES via a single HTTP endpoint.

For all frameworks there are two run-mode options. The frameworks default to using Docker containers (i.e. a docker container for the scheduler and multiple docker containers for the executors). Using containers benefits users by making the project easier to distribute, easier to test and simpler to lock down. The only dependency is Docker and Mesos. The frameworks also have a “JAR mode”, where the Java binaries are distributed and can be run directly on the infrastructure. For this mode all hosts required the Java 8 JRE and the correct Mesos libraries (which should be installed anyway). The benefit of JAR mode is that Docker is no longer required.

Out of the three components, the ES framework is the most mature but also the most complicated. ES has to form a cluster. In order to do this the cluster must be able to discover itself. Discovery in a multi-tenant, distributed system is a difficult challenge. And because ES discovery is core to ES’s operation, the discovery code must be a part of Elasticsearch. In order to maintain forward compatibility, the framework uses the in-built Zen discovery mechanism, in Unicast mode. The default Multicast mode is often blocked in cloud infrastructure systems. The scheduler provides ES with the addresses of the other nodes. Once a new node has connected to the cluster, it uses the Gossip protocol to discover the other nodes.

To be resilient and to scale down to one node, the nodes use a special ES setting that replicates all data to all nodes:
“`
index.auto_expand_replicas: 0-all
“`
This setting can be overridden by the user if they know that they don’t want to scale below a certain number of nodes.

The executors start an in-line version of ES using the Java client. The advantage of this is that the executor has direct control over the ES process, including the lifecycle. The downside is that this couples the project to a specific ES version. In future versions of the framework we will be looking to run the official ES docker image/binary in order to decouple the framework from ES version updates.

All of the framework state is written to Zookeeper. This means that if the scheduler fails, the restarted scheduler is able to query Zookeeper and quickly obtain the last known state of the cluster. The new scheduler will then send a health-check request to ensure that all the executors are still alive and none have been lost whilst the scheduler was away.

The authors of the ES framework code have made a significant effort to test all the major features of the framework using a project called Minimesos (www.minimesos.org). This enables the continuous delivery of new features with the confidence that there are no regressions.

For the LS framework, the architecture is slightly different. The goal of a LS executor is to monitor any number of logs for changes. In order to do that, the executor needs to be running on the machine it is trying to monitor. This means that the LS scheduler is responsible for running a logstash instance on every single Mesos agent (the name for a worker node). There are times when there aren’t enough resources to run an instance of the executor. For example, when one node’s resources have been allocated to one huge application. But this can be mitigated by using Mesos reservations, which is an API to tell Mesos to always save some resources for LS.

The Kibana framework is much simpler. Kibana is a true stateless application, so it doesn’t matter when or where it is created or destroyed. In fact, the Kibana framework is simply a wrapper for the official Kibana image, adding features such as orchestration and state.

Conclusions
The ELK stack on Mantl is a major step forward towards the ultimate goal of performance optimisation, resiliency and scalability.

Performance optimisation is possible due to the flexibility of the underlying Mantl architecture. When a resource is no longer required (CPU, RAM, disk), for example overnight when demand is low, machines can be turned off or terminated to save costs without affecting usability.

The system is resilient against failure. If any of the software or hardware fail, they will be reinstantiated immediately. This increases service availability to near 100%. If your application has become successful and the services are under load, it is very easy to scale up. Even with ES. The data is safe, due to ES’s shard replication strategies and snapshot/restore functionality. The data can be persisted to disk, or even a software defined storage solution to provide triple redundancy.

It would be fair to ask “why ELK?”. The ELK stack is unique in that it provides end-to-end, ETL-to-visualisation in a scalable form factor. The Elasticsearch database has broad enough functionality to suit a majority of use cases. And most important of all, each component can handle failure gracefully. Mantl, on the other hand, has been developed to make the process of creating a dynamic cluster as simple as possible. Mantl is a set of programmable infrastructure tools that users can easily script to create a cluster of their choosing. This makes it a perfect platform on which to automatically deploy the ELK stack.

More information
For more information on any of the projects discussed, please get in touch. You can find all the information introduced in this blog post and more from:
Mantl
elasticsearch
logstash
kibana