Summary: MESOS

Vyas Sundaresh Kovakkat


Electrical and Computer Engineering
University of Florida
Gainesville, Florida-32608
vyas.kovakkat@ufl.edu
I Introduction
This is a summary of the paper introducing Mesos, a
platform developed at the University of California,
Berkeley, to improve cluster utilization and reduce
data replication across the various frameworks
sharing a commodity cluster. The paper also discusses
Spark, a new framework that performs better on jobs
with iterative tasks that reuse the same input data.
Commodity clusters are clusters built from commodity
machines and used as a resource for parallel computing,
with the aim of achieving maximum computing output at
minimum cost[1][2]. The nodes are generally low-cost,
low-performance computing resources, and the Mesos
cluster sharing platform increases their overall
usability. Existing cluster sharing methods have a
problem: the granularity at which they share
resources does not match the granularity of the
cluster frameworks. Mesos removes this mismatch
through its resource offer feature, which
encapsulates a bundle of resources that a framework
can allocate on a cluster node to run a task. It
accomplishes this by exposing a common interface
and scheduling mechanism to all frameworks.
II Mesos Architecture and Working
The Mesos architecture consists of a master node that
distributes work to the slave nodes through the
resource offer module. Each resource offer carries
the list of free resources available on the slave
nodes. The master allocates these resources to the
executor modules running on the slave nodes, where
the framework jobs are executed. The framework
scheduler decides which offered resources to use and
which to reject, and sends the master a description
of each task to be run on a slave node. A scheduler
can also state in advance which resources it will
always reject by registering a Boolean filter.
Rejecting resource offers is one of the ways Mesos
enables a framework to select resources that satisfy
particular criteria. Because of this simple
architecture, Mesos achieves high scalability and
fault tolerance.
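The offer cycle described above can be illustrated with a minimal sketch. This is not the Mesos API: the Offer class, the offer_round function, the slave names and the resource figures are all hypothetical, chosen only to show how a Boolean filter and per-task requirements split a set of offers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Offer:
    """A bundle of free resources on one slave node (hypothetical shape)."""
    slave: str
    cpus: int
    mem_gb: int


# A filter is a Boolean predicate: True means "always reject this offer".
Filter = Callable[[Offer], bool]


def offer_round(offers: List[Offer], reject_filter: Filter,
                task_cpus: int, task_mem_gb: int) -> Dict[str, List[Offer]]:
    """Split one round of offers into accepted, rejected and filtered."""
    result: Dict[str, List[Offer]] = {"accepted": [], "rejected": [], "filtered": []}
    for offer in offers:
        if reject_filter(offer):
            # Ruled out in advance, so the scheduler never has to look at it.
            result["filtered"].append(offer)
        elif offer.cpus >= task_cpus and offer.mem_gb >= task_mem_gb:
            result["accepted"].append(offer)
        else:
            # Too small for the task: the scheduler declines the offer.
            result["rejected"].append(offer)
    return result


offers = [Offer("s1", 4, 8), Offer("s2", 1, 2), Offer("s3", 8, 16)]
# Assumed policy for illustration: this framework never runs on slave "s3".
outcome = offer_round(offers, lambda o: o.slave == "s3", task_cpus=2, task_mem_gb=4)
print([o.slave for o in outcome["accepted"]])  # ['s1']
```

Here the offer from s2 is declined because it is too small, while the offer from s3 is removed by the filter before the scheduler considers it.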
Since the master node is a critical part of Mesos,
multiple master nodes are kept running in a
hot-standby configuration using ZooKeeper, thereby
achieving fault tolerance. The master also keeps soft
state about the framework schedulers, the tasks being
executed and the slaves, for recovery in case of
failure. Frameworks can also register multiple
schedulers with the master; when one fails, the
master notifies another and the tasks are moved to
that scheduler.
Resource allocation in Mesos is of two types: i.)
fair sharing based on a generalized max-min scheme
for multiple frameworks, and ii.) strict policies
governing the distribution of workload across the
frameworks. Task allocation depends on two factors:
i.) the guaranteed allocation, which is the default
amount of resources that a framework can use for its
task execution. If the resources allocated to a
framework are below this value, none of its tasks are
revoked, but if they are above the guaranteed
allocation, its tasks may be killed; ii.) the amount
of resources that a framework would use if they were
offered.
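As a rough illustration of the fair sharing idea, the sketch below computes a plain max-min fair split of a single resource. It is a simplification under stated assumptions: the Mesos allocation module generalizes max-min fairness beyond this single-resource case, and the function name, framework names and demand figures are hypothetical.

```python
from typing import Dict


def max_min_share(capacity: float, demands: Dict[str, float]) -> Dict[str, float]:
    """Max-min fair split of one resource among frameworks.

    Frameworks are served in order of increasing demand. Each one gets
    either its full demand or an equal share of what is left, whichever
    is smaller, so unused share flows to the larger frameworks.
    """
    alloc: Dict[str, float] = {}
    remaining = capacity
    ordered = sorted(demands.items(), key=lambda item: item[1])
    for i, (name, demand) in enumerate(ordered):
        equal_share = remaining / (len(ordered) - i)
        alloc[name] = min(demand, equal_share)
        remaining -= alloc[name]
    return alloc


# Hypothetical demands, in CPU cores, on a 10-core cluster.
shares = max_min_share(10.0, {"hadoop": 2.0, "mpi": 8.0, "spark": 10.0})
print(shares)  # {'hadoop': 2.0, 'mpi': 4.0, 'spark': 4.0}
```

The small framework receives its full demand, and the capacity it does not need is divided equally between the two larger ones.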
Mesos has mechanisms to enable robustness and
scalability: i.) filters that predict whether a
resource bundle would be accepted or not. A resource
that does not pass the filter is treated as rejected;
ii.) because a framework may take time to analyze and
respond to an offer, Mesos counts the resources
offered to a framework toward its allocation, which
encourages it to respond quickly; iii.) if a
framework does not respond to a resource offer, the
offer is rescinded and forwarded to the remaining
frameworks.
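The handling of unanswered offers can be sketched as follows. The timeout value, the round-robin choice of the next framework and all names (PendingOffer, rescind_stale) are assumptions made for illustration; the paper does not prescribe them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PendingOffer:
    offer_id: str
    framework: str   # framework currently holding the offer
    sent_at: float   # time the offer was sent, in seconds


def rescind_stale(pending: List[PendingOffer], frameworks: List[str],
                  now: float, timeout: float) -> List[str]:
    """Forward offers left unanswered for longer than `timeout` seconds.

    Assumption: the next framework is picked round-robin from `frameworks`.
    Returns the ids of the offers that were moved.
    """
    moved: List[str] = []
    for offer in pending:
        if now - offer.sent_at > timeout:
            position = frameworks.index(offer.framework)
            offer.framework = frameworks[(position + 1) % len(frameworks)]
            offer.sent_at = now  # the new holder gets a fresh deadline
            moved.append(offer.offer_id)
    return moved


frameworks = ["hadoop", "mpi", "spark"]
pending = [PendingOffer("o1", "hadoop", 0.0), PendingOffer("o2", "mpi", 8.0)]
moved = rescind_stale(pending, frameworks, now=10.0, timeout=5.0)
print(moved, pending[0].framework)  # ['o1'] mpi
```

Offer o1 has waited 10 seconds and is handed to the next framework, while o2 has waited only 2 seconds and stays where it is.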
III Advantages and Disadvantages of Mesos
Mesos has several advantages. Firstly, even in a
single-framework environment, multiple versions of
the same framework can run on one cluster. Secondly,
resources are deployed more quickly. Thirdly, data
can be shared across domains, which is useful when a
framework targets a particular problem.
Fourthly, Mesos is highly fault tolerant due to its
simple architecture and design. Leaving scheduling
and execution to the frameworks is a further
advantage, as it allows diverse approaches to solving
a problem, e.g. using the Hadoop and MPI frameworks
together to arrive at a solution. The changes
required to integrate an existing framework with
Mesos are simple and compact.

One disadvantage of Mesos is that the rejection
mechanism can lengthen the wait for a suitable
resource from the master, thereby slowing the overall
performance of the jobs. Also, if a resource is
rejected by a framework, it may have to be offered to
multiple frameworks before it is accepted, which
reduces the overall utilization of the cluster.
Another disadvantage of the Mesos distributed model
is that in certain cases small tasks regularly occupy
the slots on a particular node, so that larger tasks
never get a slot for execution. This requires the
framework to have a mechanism to predict task times
and handle task failures. It also requires a limit on
execution time to be set for each framework when the
tasks are heterogeneous.
IV Research results
The experimental results show that cluster sharing
with Mesos performs better than static partitioning.
This is because, under static partitioning,
frameworks running iterative jobs are forced to read
input data from nodes beyond their partition, whereas
with Mesos the data is available locally. Elastic
frameworks such as Hadoop and Spark were also found
to perform well on Mesos in comparison to rigid
frameworks such as Torque or MPI.
The paper also compares the performance of Hadoop
and Spark on iterative jobs and finds that Spark
performs 10 times better than Hadoop on a logistic
regression job.
The OS isolation mechanism used by Mesos for running
tasks from different frameworks on the same slave has
been fruitful, as it has shown a decrease in request
latency. The experiments also show that Mesos is
highly resilient to faults and that the secondary
master node starts running in less than 10
seconds[3].
V Shortcomings
In my view the paper has some shortcomings. Firstly,
resource allocation to a framework should be dynamic
and decided at runtime depending on the job at hand,
instead of allocating resources to each and every
available framework. Secondly, sharing of jobs
between frameworks should be enabled, so that a
time-consuming task can be retried on a different
framework rather than failing the job. Finally,
further research is needed on a Mesos scheduling
architecture that is flexible enough to accommodate
new frameworks that may arise in the future.
REFERENCES
[1] CLUSTER COMPUTING: THE COMMODITY SUPERCOMPUTER:
MARK BAKER AND RAJKUMAR BUYYA

[2] SUPER-SERVERS: COMMODITY COMPUTER CLUSTERS POSE A
SOFTWARE CHALLENGE: JIM GRAY

[3] MESOS: A PLATFORM FOR FINE-GRAINED RESOURCE
SHARING IN THE DATA CENTER: BENJAMIN HINDMAN, ANDY
KONWINSKI, MATEI ZAHARIA, ALI GHODSI, ANTHONY D.
JOSEPH, RANDY KATZ, SCOTT SHENKER, ION STOICA
