update pig

author: Jingjing Ren <renjj@ccs.neu.edu> 2016-12-15 11:02:58 -0500
committer: Jingjing Ren <renjj@ccs.neu.edu> 2016-12-15 11:02:58 -0500
commit: d1ba81f4afc3eece7ade1aeae6e262c6b8a7165e (patch)
tree: f574f5a61c12c151cc17e8ab4aff75fc76c56ad6 /chapter
parent: 1e20be80a76ea452d9f9109b6924860e4e1d6f94 (diff)
1 files changed, 11 insertions, 11 deletions
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index f800de7..1d08292 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -360,16 +360,6 @@ output = FOREACH big_groups GENERATE
             category, AVG(good_urls.pagerank);
 ```
 
-*Word count implementation in PIG*
-
-```
-Ignore the below
- lines = LOAD 'input_fule.txt' AS (line:chararray);
-words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) as word;
-grouped = GROUP words BY word;
-wordcount = FOREACH grouped GENERATE group, COUNT(words);
-DUMP wordcount;
-```
 
 *Interoperability* Pig Latin is designed to support ad-hoc data analysis, which means the input only requires a function to parse the content of files into tuples. This saves the time-consuming import step. While as for the output, Pig provides freedom to convert tuples into byte sequence where the format can be defined by users. This allows Pig to interoperate with other existing applications in Yahoo's ecosystem.   
 
@@ -379,8 +369,18 @@ DUMP wordcount;
 
 *Debugging Environment* Pig Latin has a novel interactive debugging environment that can generate a concise example data table to illustrate output of each step.
 
-*Limitations* The procedural design gives users more control over execution, but at same time the data schema is not enforced explicitly, so it much harder to utilize database-style optimization. 
+*Limitations* The procedural design gives users more control over execution, but at same time the data schema is not enforced explicitly, so it much harder to utilize database-style optimization.
 
+*Word count implementation in PIG*
+
+```
+Ignore the below
+ lines = LOAD 'input_fule.txt' AS (line:chararray);
+words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) as word;
+grouped = GROUP words BY word;
+wordcount = FOREACH grouped GENERATE group, COUNT(words);
+DUMP wordcount;
+```
 
 
 ### 1.2.3 SparkSQL  :
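
For readers of this patch who do not know Pig Latin, the word count block that the commit moves can be read as a four-step dataflow: load lines, split them into words, group by word, count each group. The following is a minimal Python sketch of that same dataflow, not part of the commit. The function name `word_count` and the sample input are hypothetical, and splitting on whitespace only is a simplification of what Pig's `TOKENIZE` does.

```python
from collections import Counter


def word_count(lines):
    """Count word occurrences across an iterable of text lines.

    Hypothetical helper mirroring the Pig script's steps; it is not Pig code.
    """
    # LOAD + TOKENIZE + FLATTEN: turn every line into one flat stream of words.
    # Assumption: whitespace-only splitting, which is simpler than Pig's TOKENIZE.
    words = (word for line in lines for word in line.split())
    # GROUP BY word + COUNT: Counter does the grouping and counting in one pass.
    return Counter(words)


# Illustrative in-memory input standing in for the file loaded by the Pig script.
sample = ["pig latin is procedural", "pig runs on hadoop"]
counts = word_count(sample)

# DUMP: print each (word, count) pair in a stable order.
for word, n in sorted(counts.items()):
    print(word, n)
```

Unlike the Pig version, this sketch runs on a single machine and in memory; it only illustrates the order of the steps, not how Pig compiles them to MapReduce jobs.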