| author | Jingjing Ren <renjj@ccs.neu.edu> | 2016-12-05 10:56:29 -0500 |
|---|---|---|
| committer | Jingjing Ren <renjj@ccs.neu.edu> | 2016-12-05 10:56:29 -0500 |
| commit | 09ae3171dcc60933ed9a1bc3ebf27e6611423626 | |
| tree | 1a50f6a4f03f476f18287760ae4ed49e5bc2a6c6 | |
| parent | d64b5eea953b10e02e0c9bc232a7b2a803addbdd | |
update outline
| mode | file | changes |
|---|---|---|
| -rw-r--r-- | chapter/8/big-data.md | 4 |

1 file changed, 2 insertions, 2 deletions
````diff
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index 54dde79..608341e 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -30,7 +30,7 @@ by: "Jingjing and Abhilash"
 - Graphs :
 - Pregel :Overview of Pregel. Its implementation and working. its limitations. Do not stress more since we have a better model GraphX to explain a lot.
 - GraphX : Working on this.
- - SparkSQL Catalyst & Spark execution model : Discuss Parser, LogicalPlan, Optimizer, PhysicalPlan, Execution Plan. Why catalyst? how catalyst helps in SparkSQL , data flow from sql-core-> catalyst->spark-core
+ - SparkSQL Catalyst & Spark execution model : Discuss Parser, LogicalPlan, Optimizer, PhysicalPlan, Execution Plan. Why catalyst? how catalyst helps in SparkSQL , data flow from sql-core-> catalyst->spark-core
 - Evaluation: Given same algorithm, what is the performance differences between Hadoop, Spark, Dryad? There are no direct comparison for all those models, so we may want to compare separately:
 - Hadoop vs. Spark
@@ -77,7 +77,7 @@ reduce(String key, Iterator values):
     Emit(AsString(result));
 ```
-*Execution*
+*Execution* `TODO: move this to execution and talk about fault-tolerance instead`
 At high level, when the user program calls *MapReduce* function, the input files are split into *M* pieces and it runs *map* function on corresponding splits; then intermediate key space are partitioned into *R* pieces using a partitioning function; After the reduce functions all successfully complete, the output is available in *R* files. The sequences of actions are shown in the figure below. We can see from label (4) and (5) that the intermediate key/value pairs are written/read into disks, this is a key to fault-tolerance in MapReduce model and also a bottleneck for more complex computation algorithms.
 <figure class="main-container">
````
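The execution flow described in the changed paragraph (split the input into *M* pieces, map each split, partition the intermediate key space into *R* pieces, then reduce) can be sketched as a toy single-process word count. This is purely illustrative: the function and variable names (`map_fn`, `reduce_fn`, `mapreduce`) are made up for this sketch, and a real MapReduce system runs these steps on distributed workers with the intermediate buckets written to local disk.

```python
# Toy single-process sketch of the MapReduce data flow: split input into M
# pieces, map each split, partition intermediate pairs into R buckets by key
# hash, then reduce each bucket. Word count is the example job.
from collections import defaultdict

def map_fn(text):
    # Emit an intermediate (word, 1) pair for every word in one input split.
    for word in text.split():
        yield word, 1

def reduce_fn(key, values):
    # Sum all counts observed for one word.
    return key, sum(values)

def mapreduce(inputs, M, R):
    # Split the input into M pieces, one per map task.
    splits = [" ".join(inputs[i::M]) for i in range(M)]
    # Partition intermediate key/value pairs into R buckets. In a real
    # implementation these buckets are written to local disk (labels (4)/(5)
    # in the figure), which is what allows failed tasks to be re-executed.
    buckets = [defaultdict(list) for _ in range(R)]
    for split in splits:
        for key, value in map_fn(split):
            buckets[hash(key) % R][key].append(value)
    # Each reduce task consumes one bucket and produces one output "file".
    return [dict(reduce_fn(k, vs) for k, vs in b.items()) for b in buckets]

outputs = mapreduce(["the cat", "the dog", "the cat sat"], M=2, R=2)
total = {k: v for part in outputs for k, v in part.items()}
```

Here `outputs` plays the role of the *R* output files; merging them gives the full word count, and the disk round-trip between map and reduce is exactly the fault-tolerance mechanism (and bottleneck) the paragraph points at.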
