graphx

author: Connor <cnnrznn@udel.edu> 2016-12-12 22:16:06 -0500
committer: Connor <cnnrznn@udel.edu> 2016-12-12 22:16:06 -0500
commit: bb7b13c11fcfc1f106ad4535a068de7d771627f9 (patch)
tree: d1e57baa020e0121c881585b0b276ac95c6bf85b /chapter/4
parent: c019fd9d7f49168f0bc855de717d710946c032e1 (diff)
1 files changed, 7 insertions, 0 deletions
diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index 32d1175..4c5b946 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -323,6 +323,13 @@ Because they are built on top of RDD's, RDG's inherit immutability.
 When a transformation is performed, a new graph is created.
 In this way, fault tolerance in GraphX works the same way as it does in vanilla Spark; when a fault happens, the recorded series of computations is re-executed.
 
+A key feature of GraphX is that it is a domain-specific library built on top of a general-purpose one.
+Because it uses Spark's general-purpose computing framework, arbitrary MapReduce jobs may be performed in the same program as more specific graph operations.
+In other graph-processing frameworks, results from a graph query would have to be written to disk before they could be used as input to a general-purpose MapReduce job.
+
+With GraphX, an application whose logic can be structured as a series of graph operations can be implemented on top of RDDs.
+Because many real-world applications, like social media "connections," are naturally expressed as graphs, GraphX can be used to create highly scalable, fault-tolerant implementations of them.
+
 ### Which is best? Why?
 
 MR vs Actors: depends on problem, solution
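
The paragraphs added in this patch claim that a graph operation and a general-purpose MapReduce step can share one program without a trip through disk. A minimal sketch of that idea follows, in plain Python rather than the GraphX API: the edge list, `in_degrees`, and `degree_histogram` are hypothetical names invented for illustration, and ordinary in-memory collections stand in for RDDs.

```python
from collections import Counter
from functools import reduce

# Hypothetical toy data: (source, destination) "follows" edges.
edges = [(1, 2), (2, 3), (3, 1), (4, 1)]


def in_degrees(edge_list):
    """Graph-specific step: count incoming edges per vertex."""
    return Counter(dst for _, dst in edge_list)


def degree_histogram(degrees):
    """General-purpose step: map each vertex to (degree, 1), then reduce by key."""
    mapped = [(deg, 1) for deg in degrees.values()]

    def combine(acc, pair):
        key, count = pair
        acc[key] = acc.get(key, 0) + count
        return acc

    return reduce(combine, mapped, {})


# The graph result feeds the map/reduce step directly, in memory.
degrees = in_degrees(edges)
histogram = degree_histogram(degrees)
print(histogram)
```

In GraphX the same shape would presumably be a graph query whose result is an RDD, followed by ordinary Spark transformations on that RDD; the sketch only mirrors the data flow, not the distributed execution or fault tolerance.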