editJingjing-Abhilash-bigdata

author: Jingjing Ren <renjj@ccs.neu.edu> 2016-12-13 15:44:24 -0500
committer: Jingjing Ren <renjj@ccs.neu.edu> 2016-12-13 15:44:24 -0500
commit: d481dd67059324d25a2af04214905d2bbac55995 (patch)
tree: cdb077cb21292b5f9867ed4c6c8e758e987a0337
parent: b214b7afb85a61ea6932bdf235062e8f784cc0df (diff)
1 files changed, 2 insertions, 2 deletions
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index 111b3a8..f51198f 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -119,7 +119,7 @@ The SiteData example{%cite chambers2010flumejava --file big-data %} shows that a
 
 
 ### 1.1.3 Dryad
-Dryad is a more general and flexible execution engine that execute subroutines at a specified graph vertices. Developers can specify an arbitrary directed acyclic graph to combine computational "vertices" with communication channels (file, TCP pipe, shared-memory FIFO) and  build a dataflow graph. Compared with MapReduce, Dryad can specify an arbitrary DAG that have multiple number of inputs/outputs and support multiple stages. Also it can have more channels and boost the performance when using TCP pipes and shared-memory. But like writing a pipeline of MapReduce jobs, Dryad is a low-level programming model and hard for users to program, thus a more declarative model - DryadLINQ  {%cite yu2008dryadlinq --file big-data %} was created to fill in the gap. It exploits LINQ, a query language in .NET and automatically translates the data-parallel part into execution plan and passed to the Dryad execution engine. Like MR, writing raw Dryad is hard, programmers need to understand system resources and other lower-level details. This motivates a more declarative programming model: DryadLINQ - a querying language.
+Dryad is a more general and flexible execution engine than MapReduce: it executes subroutines at the vertices of a user-specified graph. Developers specify an arbitrary directed acyclic graph (DAG) that combines computational "vertices" with communication channels (files, TCP pipes, shared-memory FIFOs) to build a dataflow graph. Compared with MapReduce, a Dryad job can be an arbitrary DAG whose vertices have multiple inputs and outputs and which spans multiple stages. It also supports more channel types, and it can improve performance when TCP pipes or shared memory are used. But like writing a pipeline of MapReduce jobs, programming raw Dryad is low-level and hard: programmers need to understand system resources and other lower-level details. This motivated a more declarative model, DryadLINQ {%cite yu2008dryadlinq --file big-data %}, which exploits LINQ, a query language in .NET, and automatically translates the data-parallel parts of a program into an execution plan that is passed to the Dryad execution engine.
 
 ### 1.1.4 Spark
 
@@ -461,7 +461,7 @@ Hence, in Spark SQL, transformation of user queries happens in four phases :
 ## 3. Big Data Ecosystem
 *Hadoop Ecosystem*  
 
-Apache Hadoop is an open-sourced framework that supports distributed processing of large dataset. It involves a long list of projects that you can find in this table https://hadoopecosystemtable.github.io/. In this section, it is also important to understand the key players in the system, namely two parts: the Hadoop Distributed File System (HDFS) and the open-sourced implementation of MapReduce model - Hadoop.
+Apache Hadoop is an open-source framework that supports distributed processing of large datasets. It involves dozens of projects, all of which are listed [here](https://hadoopecosystemtable.github.io/). In this section, it is also important to understand the two key players in the system: the Hadoop Distributed File System (HDFS) and Hadoop MapReduce, the open-source implementation of the MapReduce model.
 
 <figure class="main-container">
   <img src="./hadoop-ecosystem.jpg" alt="Hadoop Ecosystem" />