author msabhi <abhi.is2006@gmail.com> 2016-12-16 16:03:20 -0500
committer GitHub <noreply@github.com> 2016-12-16 16:03:20 -0500
commit 9b6258158687189aedd3e56b13b21191e0d1a35b
tree 7c1588259a7fc8d136c5258540678d472b5c78bd
parent c890f3e6894ba3c4e61fbee2306784e1fbc0b07c
Update big-data.md
-rw-r--r-- chapter/8/big-data.md 5
1 files changed, 2 insertions, 3 deletions
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index db59fc0..12c3d54 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -706,13 +706,12 @@ Spark achieves fault tolerant, high throughput data streaming workloads in real-
*Apache Mesos*
-Apache Mesos{%cite hindman2011mesos --file big-data%} is an open source cluster/resource manager developed at the University of California, Berkley and used by companies such as Twitter, Airbnb, Netflix etc. for handling workloads in a distributed environment through dynamic resource sharing and isolation. It aids in the deployment and management of applications in large-scale clustered environments. Mesos abstracts node allocation by combining the existing resourcesĀ of the machines/nodes in a cluster into a single pool and enabling fault-tolerant elastic distributed systems. Variety of workloads can utilize the nodes from this single pool voiding the need of allocating specific machines for different workloads. Mesos is highly scalable, achieves fault tolerance through Apache Zookeeper and is a efficient CPU and memory-aware resource scheduler.
+Apache Mesos{%cite hindman2011mesos --file big-data%} is an open source heterogenous cluster/resource manager developed at the University of California, Berkley and used by companies such as Twitter, Airbnb, Netflix etc. for handling workloads in a distributed environment through dynamic resource sharing and isolation. It aids in the deployment and management of applications in large-scale clustered environments. Mesos abstracts node allocation by combining the existing resources of the machines/nodes in a cluster into a single pool and enabling fault-tolerant elastic distributed systems. Variety of workloads can utilize the nodes from this single pool voiding the need of allocating specific machines for different workloads. Mesos is highly scalable, achieves fault tolerance through Apache Zookeeper {%cite hunt2010zookeeper --file big-data%} and is a efficient CPU and memory-aware resource scheduler.
*Alluxio/Tachyon*
-Alluxio/Tachyon{% cite Tachyon --file big-data%} is an open source memory-centric distributed storage system that provides high throughput writes and reads enabling reliable data sharing at memory-speed across cluster jobs. Tachyon can integrate with different computation frameworks, such as Apache Spark and Apache MapReduce. In the big data ecosystem, Tachyon fits between computation frameworks or jobs like spark or mapreducce and various kinds of storage systems, such as Amazon S3, OpenStack Swift, GlusterFS, HDFS, or Ceph. It caches the frequently read datasets in memory, thereby avoiding going to disk to load every dataset. In Spark RDDs can automatically be stored inside Tachyon to make Spark more resilient and avoid GC overheads.
-
+Alluxio/Tachyon{% cite li2014tachyon --file big-data%} is an open source memory-centric distributed storage system that provides high throughput writes and reads enabling reliable data sharing at memory-speed across cluster jobs. Tachyon can integrate with different computation frameworks, such as Apache Spark and Apache MapReduce. In the big data ecosystem, Tachyon fits between computation frameworks or jobs like spark or mapreducce and various kinds of storage systems, such as Amazon S3, OpenStack Swift, GlusterFS, HDFS, or Ceph. It caches the frequently read datasets in memory, thereby avoiding going to disk to load every dataset. In Spark RDDs can automatically be stored inside Tachyon to make Spark more resilient and avoid GC overheads.