author msabhi <abhi.is2006@gmail.com> 2016-12-02 05:49:56 -0500
committer GitHub <noreply@github.com> 2016-12-02 05:49:56 -0500
commit 68b6294cef1fd0f5c4a245ca3206038c824130d8 (patch)
tree c4d56420e34d8b03bb1483f5f8075afb17755ec8
parent f8cf15d4ea7a9ec40bc00aa1a8f4ed0b7eb1c223 (diff)
Update big-data.md
-rw-r--r-- chapter/8/big-data.md 18
1 files changed, 10 insertions, 8 deletions
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index 1b0fff1..42e68d5 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -177,14 +177,16 @@ In Spark SQL, transformation happens in four phases :
STILL WORKING ON THIS..
-## Large Scale Graph processing :
-Map Reduce doesn’t scale easily and is highly inefficient for iterative / graph algorithms like page rank and machine learning algorithms. Iterative algorithms requires programmer to explicitly handle the intermediate results (writing to disks). Hence, every iteration requires reading the input file and writing the results to the disk resulting in high disk I/O which is a performance bottleneck for any batch processing system. <br />
- Also graph algorithms require exchange of messages between vertices. In case of PageRank, every vertex requires the contributions from all its adjacent nodes to calculate its score. Map reduce currently lacks this model of message passing which makes it complex to reason about graph algorithms. <br />
- -`Bulk synchronous parallel` model was introduced in 1980 to represent the hardware design features of parallel computers. It gained popularity as an alternative for map reduce since it addressed the above mentioned issues with map reduce to an extent.
-
- **Bulk synchronous parallel model**
- This model was introduced in 1980 to represent the hardware design features of parallel computers. It gained popularity as an alternative for map reduce since it addressed the above mentioned issues with map reduce to an extent.<br />
- In BSP model
+## Large Scale Graph Processing
+
+MapReduce is highly inefficient for iterative and graph algorithms such as PageRank and many machine learning algorithms. Iterative algorithms require the programmer to manage intermediate results explicitly by writing them to disk, so every iteration must read its input from disk and write its output back to disk. The resulting disk I/O is a performance bottleneck for any batch processing system.
+
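+As a rough illustration of this round-trip, consider a hypothetical iterative driver in this batch style (the file names and the job body are stand-ins, not real Hadoop code): every pass has to read the previous results from storage and persist the new ones before the next pass can start.
+
+```scala
+import java.nio.file.{Files, Paths}
+import scala.jdk.CollectionConverters._
+
+object IterativeDriver {
+  // One "job": read the current ranks from storage, recompute, write back.
+  def runIteration(input: String, output: String): Unit = {
+    val lines   = Files.readAllLines(Paths.get(input)).asScala // read from "disk"
+    val updated = lines.map(identity)                          // stand-in for the map + reduce work
+    Files.write(Paths.get(output), updated.asJava)             // write back to "disk"
+  }
+
+  def main(args: Array[String]): Unit = {
+    Files.write(Paths.get("ranks-0.txt"), List("v1 1.0", "v2 1.0").asJava)
+    for (i <- 0 until 3)                                       // every iteration pays the full I/O cost
+      runIteration(s"ranks-$i.txt", s"ranks-${i + 1}.txt")
+  }
+}
+```
+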
+Graph algorithms also require the exchange of messages between vertices. In PageRank, for example, every vertex needs the contributions of all the vertices that link to it in order to compute its own score. MapReduce lacks such a message-passing model, which makes graph algorithms awkward to express and reason about.
+
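+For reference, the per-vertex update behind this message exchange is the standard PageRank formula, where d is the usual damping factor and N the number of vertices (neither is defined in the text above):
+
+$$ PR(v) = \frac{1-d}{N} + d \sum_{u \in \mathrm{in}(v)} \frac{PR(u)}{\mathrm{outdeg}(u)} $$
+
+Each term PR(u)/outdeg(u) is a contribution that v must receive from an in-neighbour u in every iteration, which is exactly the message exchange described above.
+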
+**Bulk synchronous parallel model**
+
+The Bulk Synchronous Parallel (BSP) model was introduced by Leslie Valiant in the 1980s as a bridging model that abstracts the hardware design of parallel computers. It has gained popularity as an alternative to MapReduce because it addresses, to a large extent, the issues described above.
+In the BSP model (see the sketch after this list):
- Computation consists of a sequence of steps called supersteps.
- Each processor has its own local memory, and processors communicate with one another via point-to-point messages.
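+
+The following is a minimal, single-machine sketch of this superstep structure, assuming nothing beyond the description above (it is not the API of any real framework): each superstep first performs local computation on the messages received in the previous superstep and then produces the messages for the next one; the barrier between supersteps is implicit here because they run one after another.
+
+```scala
+object BspSketch {
+  type VertexId = Int
+  // Each vertex keeps only local state plus the ids of its neighbours.
+  final case class Vertex(id: VertexId, var value: Double, neighbours: Seq[VertexId])
+
+  // One superstep: local computation on the inbox, then point-to-point
+  // messages that become the inbox of the next superstep.
+  def superstep(vertices: Map[VertexId, Vertex],
+                inbox: Map[VertexId, Seq[Double]]): Map[VertexId, Seq[Double]] = {
+    vertices.values.foreach { v =>
+      val incoming = inbox.getOrElse(v.id, Seq.empty)
+      if (incoming.nonEmpty) v.value = incoming.sum            // local computation
+    }
+    vertices.values
+      .flatMap(v => v.neighbours.map(n => n -> v.value))       // send value to each neighbour
+      .groupBy(_._1)
+      .map { case (dst, msgs) => dst -> msgs.map(_._2).toSeq }
+  }
+
+  def main(args: Array[String]): Unit = {
+    val graph = Map(
+      1 -> Vertex(1, 1.0, Seq(2)),
+      2 -> Vertex(2, 1.0, Seq(1, 3)),
+      3 -> Vertex(3, 1.0, Seq(1)))
+    var messages = Map.empty[VertexId, Seq[Double]]
+    for (_ <- 1 to 3) messages = superstep(graph, messages)    // three supersteps, barrier between each
+    graph.values.foreach(v => println(s"vertex ${v.id}: ${v.value}"))
+  }
+}
+```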