Apache Mesos

From Bauman National Library
This page was last modified on 21 December 2017, at 20:53.
Developer(s) Apache Software Foundation
Stable release 1.4.0 / 18.09.2017
Development status Active
Written in C++
Operating system Cross-platform
Type Computer cluster management
Website mesos.apache.org

Apache Mesos is a centralized fault-tolerant cluster management system designed for distributed computing environments in order to provide resource isolation and easy management of clusters of subordinate nodes (mesos slaves). [1]

It is an open-source cluster manager that simplifies launching applications on a scalable cluster of servers, and it is at the heart of the Mesosphere system. Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to be built easily and run effectively. [2]

Pic. 1 Node abstraction in Apache Mesos


Mesos began as a research project at the University of California, Berkeley, started by graduate students Benjamin Hindman, Andy Konwinski and Matei Zaharia together with Professor Ion Stoica. The students began working together on the project as part of a course on Advanced Topics in Computer Systems taught by David Culler. It was originally named Nexus, but due to a conflict with another university's project it was renamed Mesos.

Mesos was first presented in 2009 by Andy Konwinski at HotCloud '09. In 2011 it was presented in a more mature state in a talk by Matei Zaharia at the USENIX Symposium on Networked Systems Design and Implementation (NSDI), based on the paper "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center" by Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker and Ion Stoica.

How it works

Pic. 2 Efficiency of Mesos

In a sense, Apache Mesos works in the opposite way to traditional virtualization: instead of dividing one physical machine into a set of virtual machines, it combines many machines into a single whole, a single virtual pool of resources.

Mesos distributes the CPU and memory resources of the cluster among tasks in much the same way as the Linux kernel allocates hardware resources among local processes.

If you need to run different types of tasks, you can dedicate separate virtual machines (a separate cluster) to each type. These virtual machines will probably not be fully loaded and will sit idle part of the time, so they will not work at maximum efficiency. If the virtual machines for all tasks are combined into a single cluster, resource utilization improves and, at the same time, tasks can complete faster (this applies when tasks are short-lived or do not keep their virtual machines fully loaded all the time). A Mesos cluster (together with a framework) can re-create individual resources if they fail, and can scale resources manually or automatically under given conditions.
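
The utilization argument above can be illustrated with a small calculation. This is only a sketch: the two workloads and their hourly CPU demand figures are invented for the example and do not come from any measurement. It compares sizing one cluster per workload for that workload's own peak against sizing a single shared cluster for the peak of the combined demand.

```python
# Hypothetical hourly CPU demand (in cores) for two workloads; the numbers
# are invented for illustration only.
web_demand = [8, 8, 6, 2, 2, 4]
batch_demand = [1, 1, 2, 8, 8, 3]

# Separate clusters: each one must be sized for its own workload's peak.
static_capacity = max(web_demand) + max(batch_demand)

# Shared cluster: one pool sized for the peak of the combined demand.
combined_demand = [w + b for w, b in zip(web_demand, batch_demand)]
shared_capacity = max(combined_demand)


def utilization(demand, capacity):
    """Average fraction of the capacity that is actually in use."""
    return sum(demand) / (len(demand) * capacity)


static_util = utilization(combined_demand, static_capacity)
shared_util = utilization(combined_demand, shared_capacity)

print(f"separate clusters: {static_capacity} cores, {static_util:.0%} utilized")
print(f"shared cluster:    {shared_capacity} cores, {shared_util:.0%} utilized")
```

Because the two workloads peak at different hours, the shared pool needs fewer cores for the same work, which is the effect the paragraph describes.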


Mesos offers many of the features expected from a cluster manager, such as:[3]

  • Scalability to over 10,000 nodes
  • Resource isolation for tasks through Linux Containers
  • Efficient CPU and memory-aware resource scheduling
  • Coordination of masters through Apache ZooKeeper
  • Web UI for monitoring cluster state
  • Highly-available master server


The architecture of Apache Mesos consists of master and slave daemons and of frameworks.

A brief overview of these components and some important terms:

  • The master daemon runs on a master node and manages the slave daemons.
  • The slave daemon runs on a slave node and performs the tasks of the frameworks.
  • A framework (or Mesos application) consists of a scheduler, which registers with the master node to receive resource offers, and executors, which run tasks on the slave nodes. Examples of Mesos frameworks are Marathon, Chronos and Hadoop.
  • An offer is a list of the CPU and memory resources available on a slave node. All slave nodes send offers to the master node, which passes them on to the registered frameworks.
  • A task is the unit of work scheduled by a framework and executed on a slave node. A task can be anything from a command or bash script to SQL queries or Hadoop jobs.
  • Apache ZooKeeper (or ZK) is the software used to coordinate the master nodes.

  With these components, Apache Mesos distributes cluster resources among applications according to their requirements. The amount of resources offered to a particular framework is determined by the policy set on the master node. The framework's scheduler decides which of the offers to use and then tells Mesos which tasks should be run, and Mesos launches these tasks on the appropriate slave nodes. When tasks complete and the resources they consumed are released, the cycle repeats to schedule further tasks.
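
The offer cycle described above can be sketched as a toy simulation. This is not the Mesos API: the classes `Offer`, `Task`, `Scheduler` and `Master` and their methods are hypothetical names invented for this example, and the master's allocation policy is simplified to "offer everything that is free to a single framework".

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources currently free on one slave node (hypothetical model)."""
    slave: str
    cpus: float
    mem: int  # megabytes


@dataclass
class Task:
    name: str
    cpus: float
    mem: int


class Scheduler:
    """Framework side: decides which offers to use for its pending tasks."""

    def __init__(self, pending):
        self.pending = list(pending)

    def resource_offers(self, offers):
        launches = []  # (slave, task) pairs handed back to the master
        for offer in offers:
            cpus, mem = offer.cpus, offer.mem
            for task in list(self.pending):
                if task.cpus <= cpus and task.mem <= mem:
                    cpus -= task.cpus
                    mem -= task.mem
                    self.pending.remove(task)
                    launches.append((offer.slave, task))
        return launches


class Master:
    """Master side: tracks free resources and forwards them as offers."""

    def __init__(self, slaves):
        self.free = {s.slave: s for s in slaves}

    def offer_cycle(self, scheduler):
        offers = [Offer(s.slave, s.cpus, s.mem) for s in self.free.values()]
        launches = scheduler.resource_offers(offers)
        for slave, task in launches:
            # Resources stay reserved until the task finishes.
            self.free[slave].cpus -= task.cpus
            self.free[slave].mem -= task.mem
        return launches


master = Master([Offer("slave1", 2.0, 2048), Offer("slave2", 1.0, 1024)])
scheduler = Scheduler(
    [Task("web", 1.0, 512), Task("db", 2.0, 1024), Task("cron", 1.0, 256)]
)
placements = [(slave, task.name) for slave, task in master.offer_cycle(scheduler)]
print(placements)
print([task.name for task in scheduler.pending])
```

In this run the task that fits no offer stays pending; in a real cluster it would be considered again in a later cycle, once running tasks finish and their resources are offered anew.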

Pic. 3 Apache Mesos architecture from the client point of view
Pic. 4 Server Architecture Apache Mesos

Mesos Masters

The main controlling servers of the cluster. They are responsible for allocating resources and distributing tasks among the active Mesos slaves. To ensure high availability there should be several masters, preferably an odd number and in any case more than one, because electing a leader requires a quorum. Only one server can be the active master (leader) at any given time.
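
The preference for an odd number of masters follows from majority arithmetic. The sketch below assumes that the quorum is a strict majority of the masters; the helper names `quorum_size` and `tolerated_failures` are invented for this example.

```python
def quorum_size(masters: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    if masters < 1:
        raise ValueError("a cluster needs at least one master")
    return masters // 2 + 1


def tolerated_failures(masters: int) -> int:
    """How many masters can fail while a majority is still reachable."""
    return masters - quorum_size(masters)


for n in (1, 2, 3, 4, 5):
    print(f"masters={n} quorum={quorum_size(n)} tolerated={tolerated_failures(n)}")
```

Going from three masters to four raises the quorum but not the number of failures that can be tolerated, which is why an even count brings no benefit.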

Mesos Slaves

Nodes that provide the capacity to run tasks. Tasks can run in Mesos's own containers as well as in Docker containers.


All the logic of launching tasks, monitoring their execution, scaling and so on is carried out by frameworks. By analogy with Linux, a framework plays the role of an init/upstart system for starting processes. This article considers the Marathon framework, which is designed mainly for long-running tasks (such as servers) but can also run short-lived ones. To run tasks on a schedule, a different framework is used: Chronos (by analogy with cron). There are many frameworks; the best known among them are:

  • Aurora (can run scheduled tasks as well as long-running tasks). Developed at Twitter.
  • Hadoop
  • Jenkins
  • Spark
  • Torque
Pic. 5 Apache Mesos Frameworks


ZooKeeper is the daemon responsible for coordinating the Mesos master nodes. It holds the election of the leading master when a quorum is present. Other nodes in the cluster obtain the address of the current master by querying the ZooKeeper ensemble at an address of the form zk://master-node1:2138,master-node2:2138,master-node3:2138/mesos. Mesos slaves, in turn, also connect only to the current master, using a similar query.
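
The structure of such a zk:// address (a comma-separated list of host:port pairs followed by a path) can be made explicit with a short parser. This is an illustrative sketch, not part of Mesos or ZooKeeper; `parse_zk_url` is a hypothetical helper, and the host names and port are taken from the example address in the text.

```python
def parse_zk_url(url: str):
    """Split a zk:// address into a list of (host, port) pairs and a path."""
    prefix = "zk://"
    if not url.startswith(prefix):
        raise ValueError("expected an address starting with zk://")
    hosts_part, sep, path = url[len(prefix):].partition("/")
    if not sep:
        raise ValueError("missing path after the server list")
    servers = []
    for entry in hosts_part.split(","):
        host, _, port = entry.partition(":")
        servers.append((host, int(port)))  # raises ValueError if the port is missing
    return servers, "/" + path


servers, znode = parse_zk_url(
    "zk://master-node1:2138,master-node2:2138,master-node3:2138/mesos"
)
print(servers)
print(znode)
```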

Long Running Services

  • Aurora is a service scheduler that runs on top of Mesos, enabling you to run long-running services that take advantage of Mesos' scalability, fault-tolerance, and resource isolation.
  • Marathon is a private PaaS built on Mesos. It automatically handles hardware or software failures and ensures that an app is "always on."
  • Singularity is a scheduler (HTTP API and web interface) for running Mesos tasks: long running processes, one-off tasks, and scheduled jobs.
  • SSSP is a simple web application that provides a white-label "Megaupload" for storing and sharing files in S3.

Big Data Processing

  • Cray Chapel is a productive parallel programming language. The Chapel Mesos scheduler lets you run Chapel programs on Mesos.
  • Dpark is a Python clone of Spark, a MapReduce-like framework written in Python, running on Mesos.
  • Exelixi is a distributed framework for running genetic algorithms at scale.
  • Hadoop: running Hadoop on Mesos distributes MapReduce jobs efficiently across an entire cluster.
  • Hama is a distributed computing framework based on Bulk Synchronous Parallel computing techniques for massive scientific computations e.g., matrix, graph and network algorithms.
  • MPI is a message-passing system designed to function on a wide variety of parallel computers.
  • Spark is a fast and general-purpose cluster computing system which makes parallel jobs easy to write.
  • Storm is a distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.

Batch Scheduling

  • Chronos is a distributed job scheduler that supports complex job topologies. It can be used as a more fault-tolerant replacement for cron.
  • Jenkins is a continuous integration server. The mesos-jenkins plugin allows it to dynamically launch workers on a Mesos cluster depending on the workload.
  • JobServer is a distributed job scheduler and processor which allows developers to build custom batch processing Tasklets using a point-and-click web UI.
  • Torque is a distributed resource manager providing control over batch jobs and distributed compute nodes.

Data Storage

  • Cassandra is a highly available distributed database. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.
  • ElasticSearch is a distributed search engine. Mesos makes it easy to run and scale.
  • Hypertable is a high performance, scalable, distributed storage and processing system for structured and unstructured data.


Trends such as cloud computing and big data are moving organizations away from consolidation and into situations where they might have multiple distributed systems dedicated to specific tasks.[4]
With the help of the Docker executor for Mesos, Mesos can run and manage Docker containers in conjunction with the Chronos and Marathon frameworks. Docker containers provide a consistent, compact and flexible means of packaging application builds. Delivering applications with Docker on Mesos promises a truly elastic, efficient and consistent platform for delivering a range of applications on premises or in the cloud.[5]

Well-known users

  • The social network Twitter began using Mesos and Apache Aurora in 2010, after Benjamin Hindman gave a presentation to a group of Twitter engineers.
  • Airbnb announced in July 2013 that it uses Mesos to run a number of data processing systems, such as Apache Hadoop and Apache Spark.
  • The online auction site eBay announced in April 2014 that it uses Mesos to run continuous integration on a per-developer basis. It does this with a custom Mesos plug-in that allows developers to launch their own Jenkins instance.
  • In April 2015, it was announced that Apple's Siri service uses its own Mesos framework called Jarvis.
  • In August 2015, it was announced that Verizon had selected Mesosphere's DCOS, which is based on the open-source Apache Mesos, as its nationwide platform for data center service orchestration.
  • In November 2015, Yelp announced that it uses Mesos to run production services.


The Marathon framework will also be installed on the masters; if desired, it can be placed on a separate node instead.

Marathon is a Mesos framework designed to run long-running applications; in Mesosphere it serves as a replacement for a traditional init system. Marathon has many features that simplify running applications in a clustered environment, among them high availability, node constraints, application health checks, an API for scriptability and service discovery, and an easy-to-use web user interface. In addition, Marathon gives Mesosphere its scaling and self-healing capabilities.
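
As an illustration of the scriptable API mentioned above, the following sketch builds a minimal application definition and shows how it could be posted to Marathon's /v2/apps REST endpoint. The application id, the command, the resource figures, the host name master-node1 and port 8080 are all assumptions made for this example; the helpers `build_app` and `submit_app` are hypothetical and not part of Marathon. The submission is left commented out because it needs a running Marathon instance.

```python
import json
import urllib.request


def build_app(app_id, cmd, cpus, mem, instances):
    """Return a minimal Marathon application definition as a dict."""
    return {"id": app_id, "cmd": cmd, "cpus": cpus, "mem": mem, "instances": instances}


def submit_app(marathon_url, app):
    """POST the definition to Marathon and return the HTTP status code."""
    request = urllib.request.Request(
        marathon_url.rstrip("/") + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Two instances of a trivial long-running command (illustrative values only).
app = build_app("/hello", "while true; do echo hello; sleep 5; done", 0.1, 32, 2)
payload = json.dumps(app, indent=2)
print(payload)

# Assumes Marathon is reachable on master-node1, port 8080:
# submit_app("http://master-node1:8080", app)
```

Marathon then keeps the requested number of instances running and restarts them on other nodes if they fail, which corresponds to the scaling and self-healing behavior described in the paragraph.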

  • Install the prerequisite package:
apt-get install software-properties-common
  • Add repositories:
apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)
echo "deb http://repos.mesosphere.com/${DISTRO} ${CODENAME} main" | \
  sudo tee /etc/apt/sources.list.d/mesosphere.list
  • Install Java
add-apt-repository ppa:webupd8team/java
apt-get update
apt-get install oracle-java8-installer
  • Install Mesos & Marathon:
apt-get -y install mesos marathon
  • Quorum level
echo "2" > /etc/mesos-master/quorum
  • Node IP (replace *.*.*.* with the IP address of this node):
echo "*.*.*.*" | tee /etc/mesos-master/ip
cp /etc/mesos-master/ip /etc/mesos-master/hostname
  • Marathon properties:
mkdir -p /etc/marathon/conf
cp /etc/mesos-master/hostname /etc/marathon/conf
  • Restart services
restart mesos-master
restart marathon


  1. Apache Mesos, en.wikipedia. Reference date: 15.12.2017.
  2. Apache Mesos, mesos.apache.org. Reference date: 15.12.2017.
  3. An Introduction to Mesosphere, digitalocean. Reference date: 15.12.2017.
  4. An Introduction to Mesosphere, opensource.com. Reference date: 15.12.2017.
  5. Open source datacenter computing with Apache Mesos, sachinpbuzz.com. Reference date: 15.12.2017.