author    msabhi <abhi.is2006@gmail.com>  2016-12-02 05:49:56 -0500
committer GitHub <noreply@github.com>     2016-12-02 05:49:56 -0500
commit    68b6294cef1fd0f5c4a245ca3206038c824130d8 (patch)
tree      c4d56420e34d8b03bb1483f5f8075afb17755ec8 /chapter/8
parent    f8cf15d4ea7a9ec40bc00aa1a8f4ed0b7eb1c223 (diff)
Update big-data.md
Diffstat (limited to 'chapter/8')
-rw-r--r--  chapter/8/big-data.md  18
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/chapter/8/big-data.md b/chapter/8/big-data.md
index 1b0fff1..42e68d5 100644
--- a/chapter/8/big-data.md
+++ b/chapter/8/big-data.md
@@ -177,14 +177,16 @@ In Spark SQL, transformation happens in four phases :
STILL WORKING ON THIS..
-## Large Scale Graph processing :
-Map Reduce doesn’t scale easily and is highly inefficient for iterative / graph algorithms like page rank and machine learning algorithms. Iterative algorithms requires programmer to explicitly handle the intermediate results (writing to disks). Hence, every iteration requires reading the input file and writing the results to the disk resulting in high disk I/O which is a performance bottleneck for any batch processing system. <br />
- Also graph algorithms require exchange of messages between vertices. In case of PageRank, every vertex requires the contributions from all its adjacent nodes to calculate its score. Map reduce currently lacks this model of message passing which makes it complex to reason about graph algorithms. <br />
- -`Bulk synchronous parallel` model was introduced in 1980 to represent the hardware design features of parallel computers. It gained popularity as an alternative for map reduce since it addressed the above mentioned issues with map reduce to an extent.
-
- **Bulk synchronous parallel model**
- This model was introduced in 1980 to represent the hardware design features of parallel computers. It gained popularity as an alternative for map reduce since it addressed the above mentioned issues with map reduce to an extent.<br />
- In BSP model
+## Large Scale Graph processing
+
+Map Reduce doesn’t scale easily and is highly inefficient for iterative and graph algorithms such as PageRank and many machine learning algorithms. Iterative algorithms require the programmer to explicitly handle intermediate results by writing them to disk. Hence, every iteration must read the input from disk and write its results back, and the resulting disk I/O is a performance bottleneck for any batch processing system.
+
+Graph algorithms also require the exchange of messages between vertices. In PageRank, every vertex needs the contributions of all its adjacent vertices to compute its score. Map Reduce lacks this message-passing model, which makes graph algorithms complex to express and reason about.
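
To make the per-vertex contribution idea concrete, one PageRank iteration can be sketched in plain Python. This is an illustrative toy, not any framework's API: the function name `pagerank_step`, the three-vertex graph, and the damping factor 0.85 are all assumptions.

```python
# One PageRank iteration: every vertex gathers the contribution
# rank(u) / outdegree(u) from each vertex u that links to it.
def pagerank_step(ranks, out_links, d=0.85):
    n = len(ranks)
    new_ranks = {}
    for v in ranks:
        # Sum contributions from all in-neighbors of v.
        incoming = sum(ranks[u] / len(out_links[u])
                       for u in out_links if v in out_links[u])
        new_ranks[v] = (1 - d) / n + d * incoming
    return new_ranks

# Hypothetical toy graph: a -> {b, c}, b -> {c}, c -> {a}.
out_links = {"a": {"b", "c"}, "b": {"c"}, "c": {"a"}}
ranks = {v: 1 / 3 for v in out_links}
for _ in range(20):
    ranks = pagerank_step(ranks, out_links)
```

Note that each iteration needs the full rank vector from the previous iteration, which is exactly the state a Map Reduce job would have to round-trip through disk.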
+
+**Bulk synchronous parallel model**
+
+This model was introduced in the 1980s to represent the hardware design features of parallel computers. It gained popularity as an alternative to Map Reduce since it addresses the issues mentioned above to an extent.
+In the BSP model:
- Computation consists of a series of steps called supersteps.
- Each processor has its own local memory and is connected to every other processor via point-to-point communication.
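
The superstep structure above can be sketched as a minimal single-process simulation. This is illustrative only, not a real distributed runtime: `bsp_run`, `max_value`, and the toy graph are hypothetical names chosen for the example.

```python
# Minimal sketch of the BSP model: in each superstep every vertex
# computes with the messages received in the previous superstep and
# sends new messages; a barrier separates consecutive supersteps.
def bsp_run(vertices, compute, supersteps):
    inbox = {v: [] for v in vertices}
    for step in range(supersteps):
        outbox = {v: [] for v in vertices}
        # Local computation phase: each "processor" uses only its
        # local state and the messages delivered at the last barrier.
        for v, state in vertices.items():
            vertices[v], sends = compute(v, state, inbox[v])
            for target, msg in sends:
                outbox[target].append(msg)
        # Barrier synchronization: messages sent in this superstep
        # become visible only once every processor has finished.
        inbox = outbox
    return vertices

# Toy compute function: propagate the maximum value along a chain
# a -> b -> c, one hop per superstep.
def max_value(v, state, messages, edges={"a": ["b"], "b": ["c"], "c": []}):
    new_state = max([state] + messages)
    return new_state, [(t, new_state) for t in edges[v]]

result = bsp_run({"a": 3, "b": 1, "c": 2}, max_value, supersteps=3)
```

After three supersteps every vertex has seen the maximum, which mirrors how systems such as Pregel run vertex programs between global barriers.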