From 768f7e51fd7d6bafdc5658b86503463f7e4a2486 Mon Sep 17 00:00:00 2001
From: msabhi <abhi.is2006@gmail.com>
Date: Sun, 4 Dec 2016 07:41:04 -0500
Subject: Updated outline

---
 chapter/8/big-data.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index c6baac0..cc11e28 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -14,13 +14,13 @@ This chapter is organized in
     - PM of Dryad: can support DAG computation, limitations: low-level, `Q: Should this go to execution model?`
     - PM of Spark, RDD/lineage: can support iterative algorithm, interactive analytics
   - Large-scale Parallelism on Graphs
-    - PM of Pregel/GraphX
+    - PM of Pregel
   - Querying: more declarative `Q: put here or in the execution model?`
     - DryadLINQ, SQL-like, use Dryad as execution engine;
     - Pig, on top of Hadoop, independent of execution platform, in theory can compiled into DryadLINQ too; what is the performance gain/lost? Easier to debug?
     - Hive, SQL-like, on top of Hadoop, what is the performance gain/lost.
     - Dremel, query natively w/o translating into MP jobs
-    - Spark SQL, on top of Spark
+    - Spark SQL: how does it differ from the models above? How does it leverage the Spark execution engine and enhanced RDDs such as DataFrames? What are its goals? What is the DataFrame API, and how does it differ from an RDD?
 
 - Execution Models
   - MapReduce (intermediate writes to disk)
-- 
cgit v1.2.3