updating query section

author: msabhi <abhi.is2006@gmail.com> 2016-12-15 01:35:48 -0500
committer: GitHub <noreply@github.com> 2016-12-15 01:35:48 -0500
commit: 7d565b86c499491bc18e5fa1c439744eed056007 (patch)
tree: 99cb69f11eb66ca462175b4abd093654243cbb06 /chapter
parent: adb64f799c47d47804f0faddec29277ce05b5461 (diff)
1 files changed, 0 insertions, 7 deletions
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index 2fd3e59..511c7dd 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -230,13 +230,6 @@ Apart from Sawzal, Pig  {%cite olston2008pig --file big-data %} and Hive  {%cite
 
 Hive is built by Facebook to organize dataset in structured formats and still utilize the benefit of MapReduce framework. It has its own SQL-like language: HiveQL  {%cite thusoo2010hive --file big-data %} which is easy for anyone who understands SQL. Hive reduces code complexity and eliminates lots of boiler plate that would otherwise be an overhead with Java based MapReduce approach.
 
-Relational interface to big data is good, however, it doesn’t cater to users who want to perform
-
-- ETL to and from various semi or unstructured data sources.
-- advanced analytics like machine learning or graph processing.
-
-These user actions require best of both the worlds - relational queries and procedural algorithms. Pig Latin {% cite olston2008pig --file big-data%}  and Spark SQL {% cite armbrust2015spark --file big-data%}  bridges this gap by letting users to seamlessly intermix both relational and procedural API. Both the frameworks free the programmer from worrying about internal execution model by providing implicit optimization on the user input DAG of transformations.
-
 Pig Latin aims at a sweet spot between declarative and procedural programming. For advanced programmers, SQL is unnatural to implement program logic and Pig Latin wants to dissemble the set of data transformation into a sequence of steps. This makes Pig more verbose than Hive.
 
 SparkSQL though has the same goals as that of Pig, is better given the Spark exeuction engine, efficient fault tolerance mechanism of Spark and specialized data structure called Dataset.