path: root/chapter/4/dist-langs.md
author Connor <cnnrznn@udel.edu> 2016-12-12 22:16:06 -0500
committer Connor <cnnrznn@udel.edu> 2016-12-12 22:16:06 -0500
commit bb7b13c11fcfc1f106ad4535a068de7d771627f9 (patch)
tree d1e57baa020e0121c881585b0b276ac95c6bf85b /chapter/4/dist-langs.md
parent c019fd9d7f49168f0bc855de717d710946c032e1 (diff)
graphx
Diffstat (limited to 'chapter/4/dist-langs.md')
-rw-r--r-- chapter/4/dist-langs.md 7
1 file changed, 7 insertions, 0 deletions
diff --git a/chapter/4/dist-langs.md b/chapter/4/dist-langs.md
index 32d1175..4c5b946 100644
--- a/chapter/4/dist-langs.md
+++ b/chapter/4/dist-langs.md
@@ -323,6 +323,13 @@ Because they are built on top of RDD's, RDG's inherit immutability.
When a transformation is performed, a new graph is created.
In this way, GraphX handles fault tolerance the same way vanilla Spark does: the series of computations is recorded, and when a fault happens it is re-executed.
+A key feature of GraphX is that it is a domain-specific language (DSL) library built on top of a general-purpose language (GPL) library.
+Because it runs on Spark's general-purpose computing framework, arbitrary MapReduce jobs can be performed in the same program as more specialized graph operations.
+In other graph-processing frameworks, the results of a graph query would have to be written to disk before they could be used as input to a general-purpose MapReduce job.
+
+With GraphX, if you can structure your application logic as a series of graph operations, it can be implemented on top of RDDs.
+Because many real-world applications, such as social media "connections," are naturally expressed as graphs, GraphX can be used to build highly scalable, fault-tolerant implementations of them.
+
### Which is best? Why?
MR vs Actors: depends on problem, solution