author     Jingjing Ren <renjj@ccs.neu.edu>    2016-12-03 14:22:02 -0500
committer  Jingjing Ren <renjj@ccs.neu.edu>    2016-12-03 14:22:02 -0500
commit     f3a6c2d3a2ba08070f79c03a518cab874a0fc27f (patch)
tree       ec3e8fdab11339f2e8d078a9ce01d3fd711549af /chapter/8
parent     175a0fae9c43c111bb02842d5b01bbb15daa8cee (diff)
add outline
Diffstat (limited to 'chapter/8')
-rw-r--r--  chapter/8/big-data.md  7
1 file changed, 3 insertions, 4 deletions
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index 23f47b5..c63a300 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -13,9 +13,9 @@ This chapter is organized by
- Programming Models
  - Data parallelism (most popular; standard map/reduce/functional pipelining, sketched below this outline)
- - Limitations, iteration difficult due to the execution model of MapReduce/Hadoop
- - Graphs
- - Querying
+ - Limitations, iteration difficult due to the execution model of MapReduce/Hadoop
+ - Graphs
+ - Querying
- Execution Models
- MapReduce (intermediate writes to disk)
- Limitations, iteration, performance
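As referenced in the outline above, here is a minimal single-machine sketch of the functional map/reduce pipelining style; the tiny word-length corpus is invented purely for illustration and is not from the chapter.

```python
from functools import reduce

# Toy corpus; invented for illustration.
lines = ["the quick brown fox", "jumps over the lazy dog"]

# map: split each line into words and flatten into one stream
words = [w for line in lines for w in line.split()]

# reduce: fold the word stream into a single aggregate
total_chars = reduce(lambda acc, w: acc + len(w), words, 0)

print(total_chars)  # 35
```

Real frameworks distribute the map and reduce steps across machines; the functional shape of the pipeline stays the same.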
@@ -78,7 +78,6 @@ The output from distributed computation should be the same as one from non-faulting
There are some practices in this paper that make the model work very well at Google; one of them is **backup tasks**: when a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks ("stragglers"). The task is marked as completed whenever either the primary or the backup execution completes.
-`JJ: what about other refinement: `
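The backup-task mechanism can be sketched as a race between two attempts. This is a minimal illustration, not the paper's implementation: `run_task` is a hypothetical stand-in for a real map/reduce task, and the "close to completion" trigger is simplified to scheduling the backup immediately.

```python
import concurrent.futures as cf
import random
import time

def run_task(task_id, attempt):
    # Hypothetical stand-in for a real map/reduce task: primary
    # attempts may straggle, sleeping far longer than a backup.
    delay = random.uniform(0.1, 2.0) if attempt == "primary" else 0.2
    time.sleep(delay)
    return task_id, attempt

def run_with_backup(task_id, pool):
    # Master schedules a backup execution of the in-progress task
    # (simplified here: immediately, rather than near completion)...
    primary = pool.submit(run_task, task_id, "primary")
    backup = pool.submit(run_task, task_id, "backup")
    done, _ = cf.wait([primary, backup], return_when=cf.FIRST_COMPLETED)
    # ...and marks the task complete when either attempt finishes.
    return next(iter(done)).result()

with cf.ThreadPoolExecutor(max_workers=4) as pool:
    print(run_with_backup(42, pool))
```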
**Performance**
In the paper, the authors measure the performance of MapReduce with two computations running on a large cluster of machines. One computation *greps* through approximately 1 TB of data; the other *sorts* approximately 1 TB of data. Both computations take on the order of a hundred seconds. In addition, backup tasks substantially reduce execution time. In the experiment where 200 out of 1746 tasks were intentionally killed, the scheduler recovered quickly and finished the whole computation with only a 5% increase in time.
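To make the *grep* computation concrete, here is a minimal sketch of grep expressed in map/reduce form; the records and pattern are invented, and grep's reduce phase is essentially a pass-through.

```python
import re

# Invented records and pattern; the paper's runs scan ~1 TB.
records = ["error: disk full", "ok", "error: timeout", "ok"]
pattern = re.compile(r"error")

# map: emit every record that matches the pattern
matches = [r for r in records if pattern.search(r)]

# reduce: grep's reduce is trivial; matches pass straight through
print(matches)  # ['error: disk full', 'error: timeout']
```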